library(tidyverse)
library(rtweet)
library(quanteda)
knitr::opts_chunk$set(echo = TRUE)
Nayan Jani
October 12, 2022
The first article I looked at examined racial bias among officials in the Italian Serie A. The goal of the study was to see whether trained officials are biased against Black and dark-skinned players and penalize them more than other players. The data contain information on each Serie A player from the 2009/10 season to the 2020/21 season. The study used three versions of the Football Manager video game (Football Manager, 2011, 2018, 2021) to collect data on player skin tones. This skin tone variable is continuous and ranges from 1 (lightest skin tone) to 20 (darkest skin tone). For red and yellow cards, the study used data from Footystats (2021); data on fouls came from WhoScored (2021) and FBREF (2021). The main hypothesis of the study is that bias against darker-skinned players has likely resulted in unfair patterns of refereeing, including a greater number of foul calls, yellow cards, and ejections (red cards). The methods used in this study were OLS and Poisson regression. The study found that skin tone does affect referee decisions, especially with respect to fouls committed and yellow cards, and more weakly with respect to red cards. Overall, I found this study interesting because it looks into racial bias that actually affects the game. It shows that racial stigmas are still a problem in sports and are affecting the integrity of the game.
The second article I read discussed racial bias in National Football League (NFL) officiating. The goal of the study was to examine potential racial bias in the calling of holding penalties. The data contain information from the 2013-2014 through 2015-2016 NFL seasons, including the races of the officials and players involved in holding penalties. Two types of analysis are used to detect racial bias: a player-level analysis and a game-level analysis. The outcome of the player-level analysis is a categorical variable indicating which combination of White or Black official and White or Black player was involved in each penalty. The dependent variable in the game-level analysis is the percentage of holding penalties called on Black players per game. The player-level analysis uses multinomial logistic regression and the game-level analysis uses linear regression. The results showed no evidence of racial bias in the holding penalties called by White officials, although Black players were more likely to have holding penalties called on them earlier in the game by all officials. Overall, I found this article interesting because holding calls involve a lot of gray areas, and it is cool to see whether racial bias has any effect on this type of call. If the study had been able to establish a stronger relationship between racial bias and holding calls, it could have led to a fairer game and removed a lot of bad calls.
The topic I want to look into is sports fans. I want to find out which groups of sports fans are more socially correct than others. By socially correct, I mean that a group of fans does not hold prejudices or enforce stigmas toward other groups of people. The groups I would like to analyze are soccer, NFL, NBA, and UFC fans. To analyze these groups, I will look at their textual responses to certain topics. For soccer fans, I will look at their discussion of the inclusion of LGBTQ people at this year's World Cup in Qatar. For UFC fans, I will look at their responses to the inclusion of certain fighters in the Hispanic heritage montage. For NFL fans, I will look at their responses to the Deshaun Watson vs. Calvin Ridley punishments. For NBA fans, I will look at their responses to the Ime Udoka vs. Robert Sarver punishments. The data will come from the YouTube API. Most of these fan discussions take place in YouTube comments, and I believe analyzing the language fans use will help show whether some groups are more socially correct than others.
Warning: One or more parsing issues, see `problems()` for details
Rows: 99 Columns: 1
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): I’ll try to get the next video essay out in less than a month lol
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Corpus consisting of 99 documents, showing 99 documents:
Text Types Tokens Sentences
text1 11 15 1
text2 16 17 3
text3 46 51 1
text4 13 13 1
text5 86 125 7
text6 3 3 1
text7 48 55 3
text8 18 20 1
text9 28 32 2
text10 20 27 2
text11 19 19 1
text12 18 18 1
text13 8 8 1
text14 18 25 1
text15 2 2 1
text16 48 59 4
text17 23 24 2
text18 73 97 2
text19 85 150 6
text20 51 69 3
text21 22 25 2
text22 26 28 4
text23 12 12 1
text24 23 26 1
text25 17 24 1
text26 34 40 2
text27 80 124 6
text28 14 14 2
text29 70 85 3
text30 14 14 2
text31 42 59 1
text32 51 70 2
text33 12 16 1
text34 6 6 1
text35 9 11 2
text36 12 12 1
text37 23 23 1
text38 26 32 1
text39 3 3 1
text40 5 5 1
text41 114 222 7
text42 20 21 2
text43 22 27 1
text44 29 33 2
text45 6 6 1
text46 22 25 4
text47 24 26 3
text48 81 109 1
text49 16 21 2
text50 16 31 3
text51 15 15 1
text52 26 32 1
text53 34 39 2
text54 9 11 1
text55 12 12 1
text56 6 6 1
text57 12 12 1
text58 2 2 1
text59 20 22 1
text60 54 77 2
text61 26 29 3
text62 7 7 1
text63 19 19 2
text64 4 6 1
text65 19 22 2
text66 58 77 3
text67 4 4 1
text68 17 24 1
text69 42 53 3
text70 15 19 1
text71 66 84 5
text72 1 1 1
text73 25 30 1
text74 17 17 1
text75 45 63 1
text76 11 11 1
text77 22 35 1
text78 46 64 4
text79 9 9 1
text80 23 28 3
text81 11 14 1
text82 51 59 2
text83 12 14 1
text84 7 7 1
text85 22 25 2
text86 85 125 2
text87 27 54 4
text88 9 9 1
text89 22 27 2
text90 33 41 1
text91 15 15 1
text92 8 8 1
text93 128 180 4
text94 6 6 1
text95 5 5 1
text96 71 94 3
text97 21 28 1
text98 98 179 10
text99 11 12 1
Warning: One or more parsing issues, see `problems()` for details
Rows: 98 Columns: 1
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Thoughts on Malika and Stephen A having a disagreement?
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Corpus consisting of 98 documents, showing 98 documents:
Text Types Tokens Sentences
text1 97 178 6
text2 64 99 2
text3 33 37 1
text4 22 26 2
text5 7 7 1
text6 60 67 4
text7 14 14 2
text8 7 7 1
text9 7 10 2
text10 10 13 3
text11 27 27 1
text12 39 47 4
text13 23 27 3
text14 15 16 1
text15 5 6 1
text16 11 11 1
text17 45 55 2
text18 26 31 4
text19 19 24 1
text20 7 7 1
text21 54 87 2
text22 19 27 1
text23 39 47 1
text24 25 27 1
text25 31 37 1
text26 66 93 1
text27 5 5 1
text28 10 10 1
text29 9 16 2
text30 29 29 1
text31 16 18 1
text32 7 7 1
text33 25 32 1
text34 3 3 1
text35 29 39 2
text36 10 15 2
text37 5 5 1
text38 19 19 3
text39 104 158 7
text40 1 1 1
text41 27 32 3
text42 18 21 2
text43 26 34 1
text44 8 8 1
text45 3 3 1
text46 13 18 3
text47 11 11 2
text48 4 4 1
text49 39 51 2
text50 23 26 2
text51 26 33 5
text52 6 6 1
text53 16 16 2
text54 60 80 4
text55 19 22 4
text56 11 13 1
text57 11 16 2
text58 42 64 4
text59 14 19 2
text60 52 67 8
text61 20 21 2
text62 5 5 1
text63 82 125 8
text64 16 16 2
text65 21 25 3
text66 30 36 5
text67 23 25 1
text68 20 23 1
text69 22 27 1
text70 31 40 4
text71 64 94 2
text72 22 31 3
text73 35 42 2
text74 7 7 1
text75 5 5 1
text76 8 8 1
text77 10 10 1
text78 42 52 4
text79 14 14 1
text80 32 33 2
text81 5 5 1
text82 3 3 1
text83 18 19 1
text84 8 8 1
text85 37 45 4
text86 35 41 1
text87 13 14 1
text88 38 48 4
text89 39 48 8
text90 12 13 1
text91 7 9 1
text92 22 27 4
text93 8 12 2
text94 19 19 2
text95 22 23 1
text96 16 17 1
text97 24 27 1
text98 13 15 2
Warning: One or more parsing issues, see `problems()` for details
Rows: 99 Columns: 1
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): What crime did he commit?
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Corpus consisting of 99 documents, showing 99 documents:
Text Types Tokens Sentences
text1 4 4 1
text2 66 83 7
text3 44 52 4
text4 8 8 1
text5 24 26 3
text6 7 7 1
text7 21 23 1
text8 39 43 2
text9 8 8 1
text10 36 41 3
text11 26 28 2
text12 19 20 1
text13 8 8 1
text14 23 33 2
text15 29 35 5
text16 32 46 3
text17 26 30 3
text18 17 22 1
text19 5 5 1
text20 3 3 1
text21 21 29 1
text22 24 26 2
text23 21 23 3
text24 41 56 4
text25 14 14 1
text26 7 7 1
text27 20 20 1
text28 9 9 2
text29 8 9 1
text30 6 7 2
text31 31 43 2
text32 29 35 2
text33 20 24 1
text34 12 12 1
text35 8 8 2
text36 15 15 1
text37 20 21 2
text38 33 37 2
text39 10 10 1
text40 20 22 1
text41 10 10 1
text42 39 47 1
text43 15 15 1
text44 15 20 1
text45 65 82 2
text46 19 21 3
text47 12 12 2
text48 13 15 1
text49 10 10 1
text50 7 7 1
text51 24 26 1
text52 4 4 1
text53 23 27 2
text54 20 21 2
text55 19 21 3
text56 12 12 1
text57 73 92 5
text58 117 219 17
text59 12 15 2
text60 30 36 3
text61 57 73 6
text62 8 8 1
text63 26 30 3
text64 1 1 1
text65 19 20 2
text66 32 37 4
text67 7 7 1
text68 15 15 1
text69 60 82 1
text70 40 49 7
text71 4 4 1
text72 11 12 1
text73 91 125 7
text74 9 9 1
text75 13 13 1
text76 31 39 2
text77 18 19 1
text78 7 7 1
text79 6 9 1
text80 13 14 2
text81 44 56 5
text82 19 19 1
text83 9 9 1
text84 25 42 2
text85 22 26 3
text86 18 21 2
text87 37 43 1
text88 7 7 1
text89 22 22 1
text90 53 67 3
text91 9 9 1
text92 57 73 4
text93 80 138 9
text94 43 63 3
text95 21 25 1
text96 4 4 1
text97 23 25 2
text98 9 9 1
text99 9 9 1
Tokens consisting of 98 documents.
text1 :
[1] "I" "can" "#39" "t" "believe" "this"
[7] "is" "actually" "a" "debate" "in" "America"
[ ... and 126 more ]
text2 :
[1] "She" "acts" "like" "she" "is" "owed" "stuff" "br"
[9] "info" "that's" "none" "of"
[ ... and 63 more ]
text3 :
[1] "My" "question" "is" "why" "do" "they"
[7] "allow" "people" "like" "her" "to" "be"
[ ... and 23 more ]
text4 :
[1] "That" "is" "NOT" "why" "we" "are" "here"
[8] "I'm" "rolling" "The" "way" "she"
[ ... and 3 more ]
text5 :
[1] "Who" "did" "this" "man" "sleep" "with"
text6 :
[1] "I" "#39" "m" "with" "Candace" "Owens"
[7] "Basicslly" "shouldnt" "even" "hire" "women" "because"
[ ... and 48 more ]
[ reached max_ndoc ... 92 more documents ]
---
title: "Blog Post 2"
author: "Nayan Jani"
description: "Getting my data"
date: "10/12/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
- Blog
- Stigma
- Sports
- Nayan
---
```{r}
#| label: setup
#| warning: false
library(tidyverse)
library(rtweet)
library(quanteda)
knitr::opts_chunk$set(echo = TRUE)
```
## Literature Review
The first article I looked at examined racial bias among officials in the Italian Serie A. The goal of the study was to see whether trained officials are biased against Black and dark-skinned players and penalize them more than other players. The data contain information on each Serie A player from the 2009/10 season to the 2020/21 season. The study used three versions of the Football Manager video game (Football Manager, 2011, 2018, 2021) to collect data on player skin tones. This skin tone variable is continuous and ranges from 1 (lightest skin tone) to 20 (darkest skin tone). For red and yellow cards, the study used data from Footystats (2021); data on fouls came from WhoScored (2021) and FBREF (2021). The main hypothesis of the study is that bias against darker-skinned players has likely resulted in unfair patterns of refereeing, including a greater number of foul calls, yellow cards, and ejections (red cards). The methods used in this study were OLS and Poisson regression. The study found that skin tone does affect referee decisions, especially with respect to fouls committed and yellow cards, and more weakly with respect to red cards. Overall, I found this study interesting because it looks into racial bias that actually affects the game. It shows that racial stigmas are still a problem in sports and are affecting the integrity of the game.
The second article I read discussed racial bias in National Football League (NFL) officiating. The goal of the study was to examine potential racial bias in the calling of holding penalties. The data contain information from the 2013-2014 through 2015-2016 NFL seasons, including the races of the officials and players involved in holding penalties. Two types of analysis are used to detect racial bias: a player-level analysis and a game-level analysis. The outcome of the player-level analysis is a categorical variable indicating which combination of White or Black official and White or Black player was involved in each penalty. The dependent variable in the game-level analysis is the percentage of holding penalties called on Black players per game. The player-level analysis uses multinomial logistic regression and the game-level analysis uses linear regression. The results showed no evidence of racial bias in the holding penalties called by White officials, although Black players were more likely to have holding penalties called on them earlier in the game by all officials. Overall, I found this article interesting because holding calls involve a lot of gray areas, and it is cool to see whether racial bias has any effect on this type of call. If the study had been able to establish a stronger relationship between racial bias and holding calls, it could have led to a fairer game and removed a lot of bad calls.
## My Project Idea
The topic I want to look into is sports fans. I want to find out which groups of sports fans are more socially correct than others. By socially correct, I mean that a group of fans does not hold prejudices or enforce stigmas toward other groups of people. The groups I would like to analyze are soccer, NFL, NBA, and UFC fans. To analyze these groups, I will look at their textual responses to certain topics. For soccer fans, I will look at their discussion of the inclusion of LGBTQ people at this year's World Cup in Qatar. For UFC fans, I will look at their responses to the inclusion of certain fighters in the Hispanic heritage montage. For NFL fans, I will look at their responses to the Deshaun Watson vs. Calvin Ridley punishments. For NBA fans, I will look at their responses to the Ime Udoka vs. Robert Sarver punishments. The data will come from the YouTube API. Most of these fan discussions take place in YouTube comments, and I believe analyzing the language fans use will help show whether some groups are more socially correct than others.
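This post does not show how the comments were collected, so the following is only a rough sketch of one way to pull top-level comments from the YouTube Data API v3 `commentThreads` endpoint. It assumes the `httr` package is installed; the helper names `extract_comment_text()` and `fetch_comments()` and the `YOUTUBE_API_KEY` environment variable are hypothetical names chosen for illustration.

```r
# Hypothetical helper: pull the comment text out of a parsed
# commentThreads response (a nested list, one element per comment thread).
extract_comment_text <- function(parsed) {
  vapply(
    parsed$items,
    function(item) item$snippet$topLevelComment$snippet$textDisplay,
    character(1)
  )
}

# Hypothetical helper: request up to `max_results` top-level comments
# (the endpoint allows at most 100 per request) for a single video.
fetch_comments <- function(video_id,
                           api_key = Sys.getenv("YOUTUBE_API_KEY"),
                           max_results = 100) {
  resp <- httr::GET(
    "https://www.googleapis.com/youtube/v3/commentThreads",
    query = list(
      part = "snippet",
      videoId = video_id,
      maxResults = max_results,
      textFormat = "plainText", # ask for plain text rather than HTML
      key = api_key
    )
  )
  httr::stop_for_status(resp)
  parsed <- httr::content(resp, as = "parsed", type = "application/json")
  extract_comment_text(parsed)
}
```

Calling `fetch_comments()` with a video ID would return a character vector of comments that could be saved with `writeLines()` or placed in a data frame column named `text`. Fetching more than one page of comments would need the `pageToken` parameter, which this sketch leaves out.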
```{r}
# The comment files appear to have no header row, so read_csv() uses the first
# comment as the column name. Rename that column to `text`, the field that
# corpus() reads by default.
df_q <- read_csv("_data/comments_q.csv")
df_q <- df_q %>%
  rename(text = "I’ll try to get the next video essay out in less than a month lol")
corpus_q <- corpus(df_q)
corpusQ_sum <- summary(corpus_q)
corpusQ_sum

# NBA comments
df_nba <- read_csv("_data/comments_nba.csv")
df_nba <- df_nba %>%
  rename(text = "Thoughts on Malika and Stephen A having a disagreement?")
corpus_nba <- corpus(df_nba)
corpusNBA_sum <- summary(corpus_nba)
corpusNBA_sum

# NFL comments
df_nfl <- read_csv("_data/comments_nfl.csv")
df_nfl <- df_nfl %>%
  rename(text = "What crime did he commit?")
corpus_nfl <- corpus(df_nfl)
corpusNFL_sum <- summary(corpus_nfl)
corpusNFL_sum
```
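In the chunk above, the first comment in each file ends up as the column name, which suggests the files have no header row and that one comment per file is lost from the data. A possible alternative, sketched here on a small made-up temporary file rather than the real data, is to pass `col_names` to `read_csv()` so every row is kept. The sample comments below are invented for illustration only.

```r
library(readr)

# Write a tiny header-less file to a temporary path (invented sample comments)
tmp <- tempfile(fileext = ".csv")
writeLines(
  c('"first comment, with a comma"', '"second comment"', '"third comment"'),
  tmp
)

# Default behavior: the first row is consumed as the column name
with_header <- read_csv(tmp, show_col_types = FALSE)

# Supplying col_names keeps the first row as data and names the column `text`
no_header <- read_csv(tmp, col_names = "text", show_col_types = FALSE)

nrow(with_header)
nrow(no_header)
```

With this approach the `rename()` step is no longer needed, since the column is already called `text`. It does not by itself address the parsing warnings, which would need a look at `problems()`.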
```{r}
# Tokenize the NBA comments, dropping punctuation, numbers, and symbols
corpus_nba_tokens <- tokens(corpus_nba,
                            remove_punct = TRUE,
                            remove_numbers = TRUE,
                            remove_symbols = TRUE)
print(corpus_nba_tokens)
```
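The token output includes pieces such as `"#39"` and `"br"`, which look like leftovers from the HTML entity `&#39;` and `<br>` tags in the raw comments. One possible fix, sketched below, is to clean the text before building the corpus. The helper `clean_comment_html()` is a hypothetical name, it only handles a few common cases, and the sample comment is invented.

```r
# Hypothetical helper: replace <br> tags and a few common HTML entities
clean_comment_html <- function(x) {
  x <- gsub("<br\\s*/?>", " ", x, ignore.case = TRUE, perl = TRUE) # line breaks
  x <- gsub("&#39;", "'", x, fixed = TRUE)                         # apostrophes
  x <- gsub("&quot;", "\"", x, fixed = TRUE)                       # double quotes
  x <- gsub("&amp;", "&", x, fixed = TRUE)                         # ampersands, decoded last
  x
}

# Invented sample comment containing the same kind of residue
raw <- "I can&#39;t believe this<br>is a debate"
cleaned <- clean_comment_html(raw)
cleaned

# Tokenize the cleaned text the same way as above (requires quanteda)
if (requireNamespace("quanteda", quietly = TRUE)) {
  toks <- quanteda::tokens(cleaned,
                           remove_punct = TRUE,
                           remove_numbers = TRUE,
                           remove_symbols = TRUE)
  print(toks)
}
```

On the real data, this would be applied to the text column before calling `corpus()`, for example `df_nba$text <- clean_comment_html(df_nba$text)`.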