Code
library(tidyverse)
library(readr)
::opts_chunk$set(echo = T) knitr
Alexis Gamez
March 18, 2023
The current state of video games supports massive online and local communities, generating millions in revenue a year. No longer can the perception of video games continue to be that of a counter-culture media, but something more widely accepted, definitively present, and enjoyed among diverse communities.
Stated eloquently by Nielsen, Smith & Tosca in, Understanding Video Games: The Essential Introduction (2012), No cultural form exists in isolation; rather, it is integrated within a complex system of meanings shaped by society and its institutions. Compared to other cultural forms, such as literature, the medium of the video game is a new member of this fascinating ecology. It is certainly true that the history of cultural media shows an almost instinctive skepticism leveled at new media. It has been true of radio, it has been true of movies, and it has certainly been true of television, which has long fought against the perception that its role was to entertain, rather than to enlighten.
Now, over 10 years later, we can see that the skepticism once surrounding video games has waned, so much so that engaging with them has now been accepted as a common hobby and even profession. I believe that video games are here to stay and rather than reject the new norm, we should grow with it. Video games will only continue to evolve in their application, especially when taking into consideration recent developments in virtual reality technology. Understanding and utilizing video games as the unique source of cultural media that they are can provide us insight into the nature of popular media and the very real weight of their market. What makes a game good? Does a game have to be good to be “successful”? Does fighting among platform communities contribute to the success, or lack there of, of a video game? Do critic scores play a role? How about user scores? Or maybe, it comes down to the publisher and whether they have a large enough budget to advertise their games to the masses? These are all questions I’d like to address during the scope of my project. However, the main objective I’ve tasked myself with, is to address the following research question:
Of the six selected variables (Platform, Genre, Publisher, Rating, Critic Scores & User Scores), what impact does each have on the commercial success of a video game?
While there exists a diverse range of articles and blog posts related to console wars, game of the year announcements, loot box market structures, etc. there has been a noticeable oversight regarding the correlation between public critique and generated revenue. An often overlooked predicament of video game development occurs post-release. A game may be applauded for all the characteristics the gaming community has grown to love, but if that game’s sales aren’t comparable to the costs it generated, then what reward is there for the developers? Would that game still be considered successful? Would it be lucrative for developers to switch to a more widely accepted genre or platform to guarantee economic success?
These questions have led to the development of the following hypothesis: As of 2017, independent variables “Platform” and “Genre” will have the most significant impact on Global sales.
From personal knowledge, it’s known that popularity by platform may fluctuate over the course of a series’ lifespan. The Nintendo developed Gamecube and Wii were widely popular upon release and are still commonly used today. The same can be said for Sony’s Playstation 1 & 2 or Microsoft’s Xbox 360. Similarly, there are also platforms like the Wii U, Playstation 3 and Xbox One that had tumultuous receptions upon release and led to committed users switching to other platform series. A common example would be the transition from Playstation to Xbox and vice versa. With this in mind, I hypothesize that Nintendo and Sony based platforms will have the highest impact on sales for video games based on prior knowledge concerning the success of select platforms like the Playstation 2, Gamecube and Wii.
When it comes to genre, the unrelenting success of the Call of Duty series within the time frame of this data serves as the core component of my belief that the genre Shooters will have the largest impact on sales among the video game genres. More and more shooters continue to be made in attempts to emulate the euphoria achieved when playing games like Call of Duty: Modern Warfare 2 and Black Ops. I also want to acknowledge other major successes such as Grand Theft Auto for example, that are major staples within their own given genres too (in the case of GTA, it’s the Role Playing genre). However, it’s known from personal experience that first person shooters revolutionized the gaming industry after the release of Call of Duty 4: Modern Warfare and I hypothesize that while while the modern gaming market may be over-saturated with shooters, they continue to play a large role in the commercial success of video games upon release.
This data set was pulled from the Kaggle online database and it’s description reads as follows, This data set contains a list of video games with sales greater than 100,000 copies along with critic and user ratings.
# A tibble: 6 × 15
Name Platform Year_of_Release Genre Publisher NA_Sales EU_Sales JP_Sales
<chr> <chr> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
1 Wii Sports Wii 2006 Spor… Nintendo 41.4 29.0 3.77
2 Super Mar… NES 1985 Plat… Nintendo 29.1 3.58 6.81
3 Mario Kar… Wii 2008 Raci… Nintendo 15.7 12.8 3.79
4 Wii Sport… Wii 2009 Spor… Nintendo 15.6 11.0 3.28
5 Pokemon R… G 1996 Role… Nintendo 11.3 8.89 10.2
6 Tetris G 1989 Puzz… Nintendo 23.2 2.26 4.22
# … with 7 more variables: Other_Sales <dbl>, Global_Sales <dbl>,
# Critic_Score <dbl>, Critic_Count <dbl>, User_Score <dbl>, User_Count <dbl>,
# Rating <chr>
With this updated data set provided by the collector, we are given 15 variables and approximately 17,500 entries. The variables are as follows:
Referencing the data set’s description once again, it states that, It is a combined web scrape from VGChartz and Metacritic along with manually entered year of release values for most games with a missing year of release.
The original code the collector utilized was created by Rush Kirubi, but it’s made apparent that the original set limited the data to only include a subset of video game platforms. Additionally, not all the listed video games have information on Metacritic, so there are a significant amount of missing values under the critic & user scores/counts variables.
This provides valuable context concerning Metacritic, the forum utilized by critics and users to rate their favorite games, and the numerous missing values within the data set. Metacritic was established in 1999. As a result, all entries pre-dating early 2000 lack critic and user scores, as it had not been as well established at the time.
Name Platform Year_of_Release Genre
Length:17416 Length:17416 Min. :1976 Length:17416
Class :character Class :character 1st Qu.:2003 Class :character
Mode :character Mode :character Median :2008 Mode :character
Mean :2007
3rd Qu.:2011
Max. :2017
NA's :8
Publisher NA_Sales EU_Sales JP_Sales
Length:17416 Min. : 0.0000 Min. : 0.0000 Min. : 0.00000
Class :character 1st Qu.: 0.0000 1st Qu.: 0.0000 1st Qu.: 0.00000
Mode :character Median : 0.0700 Median : 0.0200 Median : 0.00000
Mean : 0.2545 Mean : 0.1407 Mean : 0.07502
3rd Qu.: 0.2300 3rd Qu.: 0.1000 3rd Qu.: 0.03000
Max. :41.3600 Max. :28.9600 Max. :10.22000
Other_Sales Global_Sales Critic_Score Critic_Count
Min. : 0.00000 Min. : 0.0100 Min. :13.00 Min. : 3.00
1st Qu.: 0.00000 1st Qu.: 0.0500 1st Qu.:60.00 1st Qu.: 11.00
Median : 0.01000 Median : 0.1600 Median :71.00 Median : 21.00
Mean : 0.04591 Mean : 0.5165 Mean :68.91 Mean : 26.19
3rd Qu.: 0.03000 3rd Qu.: 0.4500 3rd Qu.:79.00 3rd Qu.: 36.00
Max. :10.57000 Max. :82.5400 Max. :98.00 Max. :113.00
NA's :9080 NA's :9080
User_Score User_Count Rating
Min. :0.000 Min. : 4.0 Length:17416
1st Qu.:6.400 1st Qu.: 10.0 Class :character
Median :7.500 Median : 25.0 Mode :character
Mean :7.117 Mean : 162.7
3rd Qu.:8.200 3rd Qu.: 81.0
Max. :9.700 Max. :10766.0
NA's :9618 NA's :9618
Summarizing our data shows that 9,080 entries lack critic scores and 9,618 of them lack user scores. Even with 9,618 entries omitted, there are still over 7,700 complete entries to analyze and I do not fear that the omission will negatively impact the analysis.
Of the 15 variables provided, 11 will be heavily utilized throughout the scope of this project. 6 are to be considered independent variables and the remaining 5 will be dependent.
The 6 independent variables are as follows:
The 5 dependent variables are:
As stated previously, of the 6 independent variables, I believe that Genre and Platform will have the most significant impact on commercial success than any other of 4 remaining independent variables. However, I’d also like to add that I believe that the Critic Score variable will have little to no correlation with the commercial success of a video game.
Unfortunately, I did not have enough time prior to submitting this proposal to visualize and alter the data in more creative ways.
Moving forward, I would like to create histograms/boxplots that visualize the distribution of games according to genre and platform to show which categories seem the most significant. Ultimately, I’d like to do the same for all listed independent variables and observe whether or not they have an impact on sales. Finally, I’d also like to challenge my understanding of global success by attempting to analyze the effects the independent variables have on regional success as well.
Lastly, I would also like to make some alterations to the data set in order to more efficiently complete my analysis. I would like to re-code the platform variable so that devices are linked by platform series and not only by individual device (i.e. all variations of Playstation belong to Sony, all Xbox’s belong to Microsoft, etc.). I believe this will make it easier to visualize platform success over long periods of time. Similarly, I would like to re-code the Publisher variable into 3 separate groups, Major, Intermediate and independent. Each representing the scale and size of the referenced publishing studio which I believe may help determine the significance publishing studios have on sales.
Thank you for reading and have a wonderful day.
Egenfeldt-Nielsen, Simon, et al. Understanding Video Games : The Essential Introduction, Taylor & Francis Group, 2012. ProQuest Ebook Central, https://ebookcentral.proquest.com/lib/uma/detail.action?docID=1181119.
Etchells, Pete. Lost in a Good Game: Why We Play Video Games and What They Can Do for Us. Icon Books, 2019.
McCullough, Hayley. (2019). From Zelda to Stanley: Comparing the Integrative Complexity of Six Video Game Genres. Press Start. 5. 137-149.
Gillies, Kendall. “Video Game Sales and Ratings.” Kaggle, 25 Jan. 2017, https://www.kaggle.com/datasets/kendallgillies/video-game-sales-and-ratings?resource=download.
---
title: "Project Proposal"
author: "Alexis Gamez"
description: "DACSS 603 Project Proposal"
date: "03/18/2023"
format:
html:
toc: true
code-fold: true
code-copy: true
code-tools: true
categories:
- finalpart1
---
```{r}
#| label: setup
#| warning: false
library(tidyverse)
library(readr)
knitr::opts_chunk$set(echo = T)
```
# Introduction
The current state of video games supports massive online and local communities, generating millions in revenue a year. No longer can the perception of video games continue to be that of a counter-culture media, but something more widely accepted, definitively present, and enjoyed among diverse communities.
Stated eloquently by Nielsen, Smith & Tosca in, **Understanding Video Games: The Essential Introduction (2012)**,
*No cultural form exists in isolation; rather, it is integrated within a complex system of meanings shaped by society and its institutions. Compared to other cultural forms, such as literature, the medium of the video game is a new member of this fascinating ecology. It is certainly true that the history of cultural media shows an almost instinctive skepticism leveled at new media. It has been true of radio, it has been true of movies, and it has certainly been true of television, which has long fought against the perception that its role was to entertain, rather than to enlighten*.
Now, over 10 years later, we can see that the skepticism once surrounding video games has waned, so much so that engaging with them has now been accepted as a common hobby and even profession. I believe that video games are here to stay and rather than reject the new norm, we should grow with it. Video games will only continue to evolve in their application, especially when taking into consideration recent developments in virtual reality technology. Understanding and utilizing video games as the unique source of cultural media that they are can provide us insight into the nature of popular media and the very real weight of their market. What makes a game good? Does a game have to be good to be *"successful"*? Does fighting among platform communities contribute to the success, or lack there of, of a video game? Do critic scores play a role? How about user scores? Or maybe, it comes down to the publisher and whether they have a large enough budget to advertise their games to the masses? These are all questions I'd like to address during the scope of my project. However, the main objective I've tasked myself with, is to address the following research question:
**Of the six selected variables (Platform, Genre, Publisher, Rating, Critic Scores & User Scores), what impact does each have on the commercial success of a video game?**
# Hypothesis
While there exists a diverse range of articles and blog posts related to console wars, game of the year announcements, loot box market structures, etc. there has been a noticeable oversight regarding the correlation between public critique and generated revenue. An often overlooked predicament of video game development occurs post-release. A game may be applauded for all the characteristics the gaming community has grown to love, but if that game's sales aren't comparable to the costs it generated, then what reward is there for the developers? Would that game still be considered successful? Would it be lucrative for developers to switch to a more widely accepted genre or platform to guarantee economic success?
These questions have led to the development of the following hypothesis:
**As of 2017, independent variables "Platform" and "Genre" will have the most significant impact on Global sales**.
From personal knowledge, it's known that popularity by platform may fluctuate over the course of a series' lifespan. The Nintendo developed Gamecube and Wii were widely popular upon release and are still commonly used today. The same can be said for Sony's Playstation 1 & 2 or Microsoft's Xbox 360. Similarly, there are also platforms like the Wii U, Playstation 3 and Xbox One that had tumultuous receptions upon release and led to committed users switching to other platform series. A common example would be the transition from Playstation to Xbox and vice versa. With this in mind, I hypothesize that Nintendo and Sony based platforms will have the highest impact on sales for video games based on prior knowledge concerning the success of select platforms like the Playstation 2, Gamecube and Wii.
When it comes to genre, the unrelenting success of the Call of Duty series within the time frame of this data serves as the core component of my belief that the genre *Shooters* will have the largest impact on sales among the video game genres. More and more shooters continue to be made in attempts to emulate the euphoria achieved when playing games like Call of Duty: Modern Warfare 2 and Black Ops. I also want to acknowledge other major successes such as Grand Theft Auto for example, that are major staples within their own given genres too (in the case of GTA, it's the Role Playing genre). However, it's known from personal experience that first person shooters revolutionized the gaming industry after the release of Call of Duty 4: Modern Warfare and I hypothesize that while while the modern gaming market may be over-saturated with shooters, they continue to play a large role in the commercial success of video games upon release.
# Descriptive Statistics
## Description and Summary of the Data
This data set was pulled from the Kaggle online database and it's description reads as follows,
*This data set contains a list of video games with sales greater than 100,000 copies along with critic and user ratings*.
```{r, warning = F, message = F}
# reading in our data set
Video_Game_Sales <- read_csv("_data/final_project/Video_Game_Sales_as_of_Jan_2017.csv")
head(Video_Game_Sales)
```
With this updated data set provided by the collector, we are given 15 variables and approximately 17,500 entries. The variables are as follows:
- Name [game's name]
- Platform [platform of game release]
- Year of Release [game's release date]
- Genre [genre of game]
- Publisher [publisher of game]
- NA Sales [sales in North America in millions]
- EU Sales [sales in Europe in millions]
- JPN Sales [sales in Japan in millions]
- Other Sales [sales in rest of the world in millions]
- Global Sales [total worldwide sales in millions]
- Critic Score [aggregate score compiled by Metacritic staff]
- Critic Count [the number of critis used in creating the critic score]
- User Score [score according to Metacritic subscribers]
- User Count [number of users who gave the user score]
- Rating [ESRB rating for the game]
## How was the data collected?
Referencing the data set's description once again, it states that, *It is a combined web scrape from VGChartz and Metacritic along with manually entered year of release values for most games with a missing year of release*.
The original code the collector utilized was created by Rush Kirubi, but it's made apparent that the original set limited the data to only include a subset of video game platforms. Additionally, not all the listed video games have information on Metacritic, so there are a significant amount of missing values under the critic & user scores/counts variables.
This provides valuable context concerning Metacritic, the forum utilized by critics and users to rate their favorite games, and the numerous missing values within the data set. Metacritic was established in 1999. As a result, all entries pre-dating early 2000 lack critic and user scores, as it had not been as well established at the time.
```{r, warning = F, message = F}
# summarizing our data
summary(Video_Game_Sales)
```
Summarizing our data shows that 9,080 entries lack critic scores and 9,618 of them lack user scores. Even with 9,618 entries omitted, there are still over 7,700 complete entries to analyze and I do not fear that the omission will negatively impact the analysis.
## What are the important variables of interest?
Of the 15 variables provided, 11 will be heavily utilized throughout the scope of this project. 6 are to be considered independent variables and the remaining 5 will be dependent.
The 6 independent variables are as follows:
- Platform
- Genre
- Publisher
- Rating
- Critic Scores
- User Scores
The 5 dependent variables are:
- NA Sales
- EU Sales
- JPN Sales
- Other Sales
- Global Sales
As stated previously, of the 6 independent variables, I believe that Genre and Platform will have the most significant impact on commercial success than any other of 4 remaining independent variables. However, I'd also like to add that I believe that the Critic Score variable will have little to no correlation with the commercial success of a video game.
# Next Steps
Unfortunately, I did not have enough time prior to submitting this proposal to visualize and alter the data in more creative ways.
Moving forward, I would like to create histograms/boxplots that visualize the distribution of games according to genre and platform to show which categories seem the most significant. Ultimately, I'd like to do the same for all listed independent variables and observe whether or not they have an impact on sales. Finally, I'd also like to challenge my understanding of global success by attempting to analyze the effects the independent variables have on regional success as well.
Lastly, I would also like to make some alterations to the data set in order to more efficiently complete my analysis. I would like to re-code the platform variable so that devices are linked by platform series and not only by individual device (i.e. all variations of Playstation belong to Sony, all Xbox's belong to Microsoft, etc.). I believe this will make it easier to visualize platform success over long periods of time. Similarly, I would like to re-code the Publisher variable into 3 separate groups, Major, Intermediate and independent. Each representing the scale and size of the referenced publishing studio which I believe may help determine the significance publishing studios have on sales.
Thank you for reading and have a wonderful day.
### References
Egenfeldt-Nielsen, Simon, et al. Understanding Video Games : The Essential Introduction, Taylor & Francis Group, 2012. ProQuest Ebook Central, https://ebookcentral.proquest.com/lib/uma/detail.action?docID=1181119.
Etchells, Pete. Lost in a Good Game: Why We Play Video Games and What They Can Do for Us. Icon Books, 2019.
McCullough, Hayley. (2019). From Zelda to Stanley: Comparing the Integrative Complexity of Six Video Game Genres. Press Start. 5. 137-149.
Gillies, Kendall. “Video Game Sales and Ratings.” Kaggle, 25 Jan. 2017, https://www.kaggle.com/datasets/kendallgillies/video-game-sales-and-ratings?resource=download.