library(tidyverse)
library(readr)
library(ggplot2)
library(summarytools)
library(lubridate)
library(GGally)
library(dplyr)
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
DACCS 601 Final Project
Introduction
YouTube is one of the most visited websites in the world, and people go to it for nearly everything. Whether it’s entertainment, tutorials, or music, there is a wide variety of reasons people use the platform. That leads me to the ultimate question: what types of videos are the most popular? What is the biggest indicator of a high-view video?
I obtained my data set from Kaggle.com. It was created by Rishav Sharma and covers trending videos in the United States from August 11th, 2020 until April 10th, 2023. It records the view count of every video, and a video that trended on multiple dates appears once per date, so each row is essentially a given video trending on a certain date. It also has the category of the video in numerical form, along with the number of likes, dislikes, and comments it received. When I say this data set is huge, I’m talking about 280.2 megabytes of data. So for the portion where we show off the top trending videos, we keep it to only the top performing videos so loading the dataset doesn’t crash R or make my computer explode.
YouTube uses an algorithm to decide which videos show up on the trending page, so I was curious to see which variables correlate with the most successful YouTube videos. We will assume the most successful YouTube videos trend on the most days and have the highest number of views.
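To make that definition of success concrete, here is a minimal sketch of how it could be measured. It runs on a small made-up data frame rather than the real file, and the column names `video_id`, `trending_date`, and `view_count` are assumptions about how the data is laid out, so adjust them to match the actual columns.

```r
library(dplyr)

# Toy stand-in for the trending data. The column names are assumed,
# and the values are invented purely for illustration.
toy <- data.frame(
  video_id      = c("a", "a", "a", "b", "b", "c"),
  trending_date = as.Date(c("2020-08-11", "2020-08-12", "2020-08-13",
                            "2020-08-11", "2020-08-12", "2020-08-11")),
  view_count    = c(100, 250, 400, 900, 950, 50)
)

# One row per video: how many distinct days it trended and its highest view count
video_summary <- toy %>%
  group_by(video_id) %>%
  summarise(
    days_trending = n_distinct(trending_date),
    peak_views    = max(view_count),
    .groups = "drop"
  ) %>%
  arrange(desc(days_trending), desc(peak_views))

video_summary
```

Sorting by days trending first and peak views second is just one way to combine the two measures; the two could also be examined separately.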
Data
Let’s load up the data set.
#loading my data set and naming it YouTube
YouTube <- read.csv("B:/Needels/Documents/DACCS 601/DACSS_601_New/posts/CamNeedels_FinalProjectData/US_youtube_trending_data.csv")
YouTube
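Since the file is large, one option worth considering is to read only the columns an analysis needs. The sketch below is not what this project does; it shows the idea with `readr::read_csv()` and its `col_select` argument on a small temporary file. The column names and the temporary file are hypothetical stand-ins for the real data.

```r
library(readr)

# Write a small made-up CSV to a temporary file so the example is self-contained.
# Column names are assumed; the real file may differ.
toy <- data.frame(
  video_id      = c("a", "b"),
  trending_date = c("2020-08-11", "2020-08-12"),
  view_count    = c(100, 900),
  description   = c("long text we do not need", "more long text")
)
tmp <- tempfile(fileext = ".csv")
write.csv(toy, tmp, row.names = FALSE)

# Read only the columns of interest; the description column is skipped
slim <- read_csv(
  tmp,
  col_select = c(video_id, trending_date, view_count),
  show_col_types = FALSE
)

slim
```

Skipping long text columns can reduce memory use noticeably, though how much it helps depends on the file.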