GOT Analysis and Visualization
Import GOT data set
library(readxl)     # read_excel()
library(tidyverse)  # dplyr, tidyr, ggplot2
GOT_data <- read_excel("/Users/shrutishelke1999/Downloads/GOTdata.xlsx")
head(GOT_data)
# A tibble: 6 × 11
order season episode character_killed killer method method_cat
<dbl> <dbl> <dbl> <chr> <chr> <chr> <chr>
1 1 1 1 Waymar Royce White Walker Ice s… Blade
2 2 1 1 Gared White Walker Ice s… Blade
3 3 1 1 Will Ned Stark Sword… Blade
4 4 1 1 Stag Direwolf Direw… Animal
5 5 1 1 Direwolf Stag Antler Animal
6 6 1 1 Jon Arryn Lysa Arryn Poison Poison
# … with 4 more variables: reason <chr>, location <chr>,
# allegiance <chr>, importance <dbl>
Number of Deaths in every Episode:
# Count deaths per season and episode
deaths_by_season <- GOT_data %>%
  group_by(season, episode) %>%
  summarise(count = n())
# Prefix the labels for plotting, e.g. "Season 1" and "e1"
deaths_by_season$season <- paste0("Season ", deaths_by_season$season)
deaths_by_season$episode <- paste0("e", deaths_by_season$episode)
deaths_by_season
# A tibble: 69 × 3
# Groups: season [8]
season episode count
<chr> <chr> <int>
1 Season 1 e1 7
2 Season 1 e2 3
3 Season 1 e4 1
4 Season 1 e5 17
5 Season 1 e6 5
6 Season 1 e7 5
7 Season 1 e8 11
8 Season 1 e9 7
9 Season 1 e10 3
10 Season 2 e1 7
# … with 59 more rows
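# Fix the episode order; as plain text, "e10" would sort before "e2"
ggplot(data = deaths_by_season,
       aes(x = factor(episode, levels = paste0("e", 1:10)), y = count)) +
  geom_col() +
  facet_wrap(vars(season), scales = "free") +
  theme_bw() +
  labs(title = "Deaths per Episode in 8 Seasons", x = "episode")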
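ggplot2 places a character x variable in alphabetical order, so an episode label such as e10 lands before e2 in a panel. The following is a minimal base R sketch of that behaviour and of one possible fix with explicit factor levels. The vector episodes is a made-up illustration, and the level range 1 to 10 is an assumption about the maximum number of episodes per season.

```r
# Hypothetical episode labels in the same "e<number>" format as deaths_by_season$episode
episodes <- c("e1", "e2", "e10")

# Text sorting compares characters left to right, so "e10" comes before "e2"
sorted_as_text <- sort(episodes)

# Explicit levels keep the numeric order (assumes at most 10 episodes per season)
episode_factor <- factor(episodes, levels = paste0("e", 1:10))
sorted_as_factor <- as.character(sort(episode_factor))

print(sorted_as_text)
print(sorted_as_factor)
```

Wrapping the x aesthetic in a factor with these levels should give the bars the same order.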
Number of Deaths by Location:
death_location <- GOT_data %>%
  group_by(location) %>%
  summarise(count_deaths = n()) %>%
  arrange(desc(count_deaths))
death_location
# A tibble: 42 × 2
location count_deaths
<chr> <int>
1 Winterfell 3709
2 King’s Landing 1357
3 Beyond the Wall 993
4 Meereen 154
5 Goldroad 116
6 Hardhome 99
7 The Twins 84
8 Castle Black 66
9 Narrow Sea 36
10 Riverlands 31
# … with 32 more rows
Winterfell has the highest death count, driven largely by the Battle of Winterfell in Season 8:
winterfell_battle <- GOT_data %>% filter(reason=="Killed during the Battle of Winterfell")
winterfell_battle
# A tibble: 2,278 × 11
order season episode character_killed killer method method_cat
<dbl> <dbl> <dbl> <chr> <chr> <chr> <chr>
1 2350 8 3 Wight Unknown Flaming t… Other
2 2351 8 3 Wight Unknown Flaming t… Other
3 2352 8 3 Wight Unknown Flaming t… Other
4 2353 8 3 Wight Unknown Flaming t… Other
5 2354 8 3 Wight Unknown Flaming t… Other
6 2355 8 3 Wight Unknown Flaming t… Other
7 2356 8 3 Wight Unknown Flaming t… Other
8 2357 8 3 Wight Unknown Flaming t… Other
9 2358 8 3 Wight Unknown Flaming t… Other
10 2359 8 3 Wight Unknown Flaming t… Other
# … with 2,268 more rows, and 4 more variables: reason <chr>,
# location <chr>, allegiance <chr>, importance <dbl>
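As a rough check on how much of the Winterfell total comes from this battle, the two counts printed above can be compared directly. This is only a sketch: it assumes that every row with this reason was also recorded with location Winterfell, which the filter on reason does not guarantee.

```r
# Counts copied from the printed outputs above
winterfell_deaths <- 3709  # count_deaths for Winterfell in death_location
battle_deaths <- 2278      # number of rows in winterfell_battle

# Assumption: all battle deaths carry location == "Winterfell"
battle_share <- round(100 * battle_deaths / winterfell_deaths, 1)
print(battle_share)  # 61.4
```

Under that assumption, about 61% of the deaths at Winterfell belong to this single battle.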
Deaths by Importance / Status:
Labels:
1 - Soldiers and knights with the least screen time
2 - Nobles or knights with little screen time, such as the Lannister cousins and the Karstarks
3 - Advisors and those close to the Lords, such as Ser Rodrik and the Spice Kings beyond the Sea
4 - Main characters, Lords and Ladies of the Kingdoms, including Ned Stark, Robert Baratheon, and Khal Drogo
death_importance <- GOT_data %>%
  group_by(importance) %>%
  summarise(count_importance = n()) %>%
  arrange(desc(count_importance))
death_importance
# A tibble: 5 × 2
importance count_importance
<dbl> <int>
1 1 6682
2 2 85
3 3 75
4 4 44
5 NA 1
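The raw counts are easier to compare as shares of all deaths. The sketch below rebuilds the table from the counts printed above, so it does not need the Excel file, and adds a percentage column. The column name share_pct is introduced here for illustration only.

```r
library(dplyr)

# Counts copied from the death_importance output above
death_importance <- data.frame(
  importance = c(1, 2, 3, 4, NA),
  count_importance = c(6682L, 85L, 75L, 44L, 1L)
)

# Share of each importance level among all recorded deaths, in percent
importance_share <- death_importance %>%
  mutate(share_pct = round(100 * count_importance / sum(count_importance), 1))

print(importance_share)
```

Level 1 accounts for roughly 97% of the 6,887 recorded deaths, while the main characters (level 4) make up well under 1%.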
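Let's focus on the main cast:
GOT_maincast <- GOT_data %>% filter(importance == 4)
# Split each name on non-alphanumeric characters into Name and House (the first two pieces);
# na.omit() then drops every row that has an NA in any column, e.g. single-word names
GOT_maincast <- GOT_maincast %>% separate(character_killed, c('Name', 'House')) %>% na.omit()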
head(GOT_maincast)
# A tibble: 6 × 12
order season episode Name House killer method method_cat reason
<dbl> <dbl> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 33 1 6 Viserys Targar… Khal … Molte… Fire/Burn… Threa…
2 34 1 7 Robert Barath… Boar Tusk Animal Hunte…
3 56 1 9 Ned Stark Ilyn … Sword… Blade Execu…
4 58 1 10 Khal Drogo Daene… Pillow Household… Kille…
5 79 2 5 Renly Barath… Melis… Shado… Magic Kille…
6 199 3 4 Jeor Mormont Rast Knife Blade Attac…
# … with 3 more variables: location <chr>, allegiance <chr>,
# importance <dbl>
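The House column is simply the second piece of each name, so it is worth seeing what separate() and na.omit() do with names that are not exactly two words long. The sketch below uses toy rows for illustration only. They are not copied from GOT_data, and the data frame toy is hypothetical.

```r
library(tidyr)

# Toy rows for illustration only; not taken from GOT_data
toy <- data.frame(
  character_killed = c("Ned Stark", "Khal Drogo", "Hodor", "Aerys II Targaryen"),
  importance = c(4, 4, 4, 4)
)

# Default separate(): split on non-alphanumeric characters, keep the first two
# pieces, fill missing pieces with NA (warnings silenced for the demo)
split_names <- suppressWarnings(
  separate(toy, character_killed, c("Name", "House"))
)

# na.omit() removes the single-word name because its House is NA
kept <- na.omit(split_names)
print(kept)
```

In this toy example the single-word name is dropped, the three-word name loses its last piece, and Khal Drogo ends up with the House value Drogo. The House counts are therefore best read as counts by the second word of the name.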
GOT_maincast %>%
  group_by(House) %>%
  summarise(count_house = n()) %>%
  # reorder() sorts the bars by count; arrange() alone does not change the axis order
  ggplot(aes(x = reorder(House, -count_house), y = count_house)) +
  geom_col() +
  scale_x_discrete(guide = guide_axis(n.dodge = 4)) +
  theme_bw() +
  labs(title = "Death Count of House Leads", x = "House")
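Sorting a table with arrange() does not carry over to a discrete axis, because ggplot2 orders character values alphabetically. One common way to get bars sorted by height is base R's reorder(), sketched here on hypothetical counts that are not taken from the GOT data.

```r
# Hypothetical counts, not taken from the GOT data
house_counts <- data.frame(
  House = c("Baratheon", "Lannister", "Stark"),
  count_house = c(3, 4, 5)
)

# reorder() sets the factor levels by the second argument in ascending order,
# so negating the counts puts the largest count first
ordered_house <- reorder(house_counts$House, -house_counts$count_house)
print(levels(ordered_house))
```

Using the reordered factor as the x aesthetic should place the house with the most deaths on the left.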
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Prabhu (2022, May 19). Data Analytics and Computational Social Science: 601 HW 6. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomsshelke901520/
BibTeX citation
@misc{prabhu2022601,
  author = {Prabhu, Shruti Shelke and Snehal},
  title = {Data Analytics and Computational Social Science: 601 HW 6},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomsshelke901520/},
  year = {2022}
}