challenge_1
instruction
Game of Thrones
Erika_Nagai
Loading Data and Creating a Network
Author

Erika Nagai

Published

February 15, 2023

Code
library(igraph)
Warning: package 'igraph' was built under R version 4.2.2

Attaching package: 'igraph'
The following objects are masked from 'package:stats':

    decompose, spectrum
The following object is masked from 'package:base':

    union
Code
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:igraph':

    as_data_frame, groups, union
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
Code
library(tidyr)

Attaching package: 'tidyr'
The following object is masked from 'package:igraph':

    crossing
Code
library(statnet)
Warning: package 'statnet' was built under R version 4.2.2
Loading required package: tergm
Warning: package 'tergm' was built under R version 4.2.2
Loading required package: ergm
Warning: package 'ergm' was built under R version 4.2.2
Loading required package: network

'network' 1.18.0 (2022-10-05), part of the Statnet Project
* 'news(package="network")' for changes since last version
* 'citation("network")' for citation information
* 'https://statnet.org' for help, support, and other information

Attaching package: 'network'
The following objects are masked from 'package:igraph':

    %c%, %s%, add.edges, add.vertices, delete.edges, delete.vertices,
    get.edge.attribute, get.edges, get.vertex.attribute, is.bipartite,
    is.directed, list.edge.attributes, list.vertex.attributes,
    set.edge.attribute, set.vertex.attribute

'ergm' 4.4.0 (2023-01-26), part of the Statnet Project
* 'news(package="ergm")' for changes since last version
* 'citation("ergm")' for citation information
* 'https://statnet.org' for help, support, and other information
'ergm' 4 is a major update that introduces some backwards-incompatible
changes. Please type 'news(package="ergm")' for a list of major
changes.
Loading required package: networkDynamic
Warning: package 'networkDynamic' was built under R version 4.2.2

'networkDynamic' 0.11.2 (2022-05-04), part of the Statnet Project
* 'news(package="networkDynamic")' for changes since last version
* 'citation("networkDynamic")' for citation information
* 'https://statnet.org' for help, support, and other information
Registered S3 method overwritten by 'tergm':
  method                   from
  simulate_formula.network ergm

'tergm' 4.1.1 (2022-11-07), part of the Statnet Project
* 'news(package="tergm")' for changes since last version
* 'citation("tergm")' for citation information
* 'https://statnet.org' for help, support, and other information

Attaching package: 'tergm'
The following object is masked from 'package:ergm':

    snctrl
Loading required package: ergm.count
Warning: package 'ergm.count' was built under R version 4.2.2

'ergm.count' 4.1.1 (2022-05-24), part of the Statnet Project
* 'news(package="ergm.count")' for changes since last version
* 'citation("ergm.count")' for citation information
* 'https://statnet.org' for help, support, and other information
Loading required package: sna
Loading required package: statnet.common
Warning: package 'statnet.common' was built under R version 4.2.2

Attaching package: 'statnet.common'
The following object is masked from 'package:ergm':

    snctrl
The following objects are masked from 'package:base':

    attr, order
sna: Tools for Social Network Analysis
Version 2.7 created on 2022-05-09.
copyright (c) 2005, Carter T. Butts, University of California-Irvine
 For citation information, type citation("sna").
 Type help(package="sna") to get started.

Attaching package: 'sna'
The following objects are masked from 'package:igraph':

    betweenness, bonpow, closeness, components, degree, dyad.census,
    evcent, hierarchy, is.connected, neighborhood, triad.census
Loading required package: tsna
Warning: package 'tsna' was built under R version 4.2.2

'statnet' 2019.6 (2019-06-13), part of the Statnet Project
* 'news(package="statnet")' for changes since last version
* 'citation("statnet")' for citation information
* 'https://statnet.org' for help, support, and other information
unable to reach CRAN
Code
library(readr)
Warning: package 'readr' was built under R version 4.2.2

Challenge Overview

Today’s challenge is to

  1. read in a dataset, and

  2. create a network object

Load the Data

I read in got_marriages.csv. This dataset shows the network of romantic relationships (marriage, engagement, affair) between families in the show “Game of Thrones”.

Game of Thrones Marriage data

The got_marriages file looks like an edge list, where each row represents one edge (the From and To columns).

Code
got_marriage <- read_csv("../posts/_data/got/got_marriages.csv")
Rows: 255 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): From, To, Type, Notes, Generation

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Code
head(got_marriage)
# A tibble: 6 × 5
  From      To      Type    Notes  Generation
  <chr>     <chr>   <chr>   <chr>  <chr>     
1 Targaryen Stark   Married R+L=J  Current   
2 Baratheon Martell Engaged died   Current   
3 Baratheon Stark   Engaged broken Current   
4 Martell   Essos   Married <NA>   Current   
5 Martell   Reach   Affair  <NA>   Current   
6 Martell   Essos   Affair  <NA>   Current   

There are multiple edges between the same pair of families; for example, there are four marriages between the Arryn family and the Vale family. The number of marriages (or other types of relationship) between each pair should be recorded as an edge weight.

Code
got_marriage <- got_marriage %>%
  group_by(From, To, Type) %>%
  summarize(weight=n())
`summarise()` has grouped output by 'From', 'To'. You can override using the
`.groups` argument.
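
As a minimal sketch of the same aggregation on a toy edge list (the family names "A", "B", "C" and the column names fam1 and fam2 are made up for illustration, not taken from the data), dplyr's count() tallies rows per group and returns an ungrouped tibble, so the grouped-output message does not appear. The second step is optional and rests on an assumption: because the network is undirected, a row A to B and a row B to A may describe the same pair, so each pair is sorted before counting.

```r
library(dplyr)

# Toy edge list with made-up family names (not rows from got_marriages.csv)
toy_edges <- tibble(
  From = c("A", "A", "B", "A"),
  To   = c("B", "B", "A", "C"),
  Type = c("Married", "Married", "Married", "Affair")
)

# count() groups by the listed columns, tallies rows, and returns an ungrouped tibble
by_order <- toy_edges %>%
  count(From, To, Type, name = "weight")

# Optional: sort each pair alphabetically so A-B and B-A fall into the same group
by_pair <- toy_edges %>%
  mutate(fam1 = pmin(From, To), fam2 = pmax(From, To)) %>%
  count(fam1, fam2, Type, name = "weight")

by_order
by_pair
```

In this toy example, by_order keeps A-B and B-A as separate rows, while by_pair merges them into one edge with a larger weight. Whether that merge is wanted depends on how the From and To columns were recorded.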

Create a Network

Load the igraph package and create an igraph object (i.e. a graph or network) from the edge list.

Code
library(igraph)

got.marriage.net <- graph_from_data_frame(got_marriage, directed=FALSE) # marriage is a symmetric relationship, so the network should be undirected

This network is weighted and undirected. Its edges have two attributes: Type (married/affair/engaged) and weight (the number of marriages/affairs/engagements).

Code
edge_attr_names(got.marriage.net)
[1] "Type"   "weight"
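
The properties described above can also be checked programmatically. The sketch below uses a small made-up edge list (toy_edges and toy_net are hypothetical names) rather than the real data. Because statnet is loaded after igraph and masks several igraph functions, the calls are written with the igraph:: prefix as a precaution.

```r
library(igraph)

# Toy edge list: the first two columns are the endpoints,
# all remaining columns become edge attributes
toy_edges <- data.frame(
  From   = c("A", "A", "B"),
  To     = c("B", "C", "C"),
  Type   = c("Married", "Affair", "Married"),
  weight = c(2, 1, 1)
)
toy_net <- graph_from_data_frame(toy_edges, directed = FALSE)

igraph::is_directed(toy_net)      # FALSE: built with directed = FALSE
igraph::is_weighted(toy_net)      # TRUE: an edge attribute named "weight" exists
igraph::vcount(toy_net)           # number of vertices (families)
igraph::ecount(toy_net)           # number of edges
igraph::edge_attr_names(toy_net)  # names of the edge attributes
```

The same calls can be run on got.marriage.net to confirm that it is undirected and weighted.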

Plot a Network

The simplest network

This is a simple plot of the Game of Thrones marriage network.

Code
# The vertices only carry a name attribute (no `type`), so no label color is set here
plot.igraph(got.marriage.net)

Network with the relationship types and the weight of edges

I visualized the type of each edge with color and its weight with line width.

The Septa and Beyond Wall families sit on the periphery of the network: each connects to only one other family (Martell and North, respectively) through a single affair. The Targaryen, Tyrell, Vale, and North families have many marriages within the same family.

Code
# Assign colors to the type

colors <- c(Married = "darkcyan",
            Engaged = "cyan",
            Affair = "brown3")

E(got.marriage.net)$color <- colors[match(E(got.marriage.net)$Type, names(colors))]

# Plot
plot(got.marriage.net,
     vertex.color = "light grey",
     edge.color=E(got.marriage.net)$color,
     edge.width=E(got.marriage.net)$weight*0.4,
     main = "Game of Thrones: Marriage Network")
legend(x = "bottomleft",
       legend = c("Married", "Engaged", "Affair"), fill=colors, title = "Type of relationship")
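
Observations about peripheral families can be backed up numerically. The following is a minimal sketch on a made-up network (toy_net and its family names are hypothetical): degree counts the edges attached to each vertex, and strength sums their weights.

```r
library(igraph)

# Toy weighted edge list with made-up family names
toy_edges <- data.frame(
  From   = c("A", "A", "B", "D"),
  To     = c("B", "C", "C", "A"),
  weight = c(3, 1, 2, 1)
)
toy_net <- graph_from_data_frame(toy_edges, directed = FALSE)

# igraph:: prefix because sna (loaded via statnet) masks degree()
deg   <- igraph::degree(toy_net)    # number of edges per family
stren <- igraph::strength(toy_net)  # sum of edge weights per family

# Families attached to the rest of the network by a single edge
names(deg)[deg == 1]

deg
stren
```

In this toy network only "D" has degree 1. Note that degree() counts a self-loop twice by default; passing loops = FALSE excludes within-family edges from the count.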

Cleaned Network

There are marriages within the same family; for example, there are several marriages within the North family. These within-family edges (self-loops) make the network visualization more cluttered, so I removed them.

This visualization shows that the Tyrell and Reach families, the Lannister and Westerlands families, and the Frey and Vale families are strongly connected through many marriages.

Code
got_marriage1 <- got_marriage %>% 
  filter(From != To)

got.marriage1.net <- graph_from_data_frame(got_marriage1, directed=FALSE)

E(got.marriage1.net)$color <- colors[match(E(got.marriage1.net)$Type, names(colors))]

# Plot
plot(got.marriage1.net,
     vertex.color = "light grey",
     edge.color=E(got.marriage1.net)$color,
     edge.width=E(got.marriage1.net)$weight*0.4,
     main = "Game of Thrones: Marriage Network")
legend(x = "bottomleft",
       legend = c("Married", "Engaged", "Affair"), fill=colors, title = "Type of relationship")
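
An alternative to filtering the edge list is to drop the self-loops from the graph object itself. This is a sketch on made-up data (toy_net and toy_clean are hypothetical names), not the approach used above. One difference to be aware of: filtering the data frame drops any family whose only ties are within-family, whereas deleting loop edges from the graph keeps that family as an isolated vertex.

```r
library(igraph)

# Toy edge list: A-A and D-D are within-family edges (self-loops)
toy_edges <- data.frame(
  From   = c("A", "A", "B", "D"),
  To     = c("A", "B", "C", "D"),
  Type   = c("Married", "Married", "Affair", "Married"),
  weight = c(4, 2, 1, 1)
)
toy_net <- graph_from_data_frame(toy_edges, directed = FALSE)

# which_loop() flags edges whose two endpoints are the same vertex
loops <- igraph::which_loop(toy_net)
sum(loops)

# Delete only the loop edges; vertices and the remaining edge attributes are kept
toy_clean <- igraph::delete_edges(toy_net, igraph::E(toy_net)[loops])

igraph::ecount(toy_clean)  # edges left after removing loops
igraph::vcount(toy_clean)  # "D" is still present, now as an isolate
```

igraph also provides simplify() for removing loops and merging multiple edges; the explicit delete_edges() call is used here only because it makes the removed edges easy to inspect.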