An exploration of the Sampson’s Monks dataset.
In this assignment, I explore the Sampson’s Monks dataset.
First, I used the import script provided in Google Classroom to import the data and create the objects needed to work with igraph and statnet.
#This script imports the sampson monk dataset from the ergm package.
#Let's load the libraries you need (install them first if you need to)
if("statnet" %in% rownames(installed.packages()) == FALSE) {install.packages("statnet")}
if("igraph" %in% rownames(installed.packages()) == FALSE) {install.packages("igraph")}
if("intergraph" %in% rownames(installed.packages()) == FALSE) {install.packages("intergraph")}
library(statnet)
library(igraph)
library(intergraph)
#Let's read the data into the environment. This will import it as a network object (samplike)
data("sampson", package = "ergm")
network_statnet <- samplike
rm(samplike)
#Let's create an edgelist version
network_edgelist <- as.data.frame(as.edgelist(network_statnet))
network_edgelist$nominations <- network_statnet %e% 'nominations'
#Let's create a dataframe of node attributes
network_nodes <- data.frame(cloisterville = network_statnet%v%'cloisterville',
group = network_statnet%v%'group',
names = network_statnet%v%'vertex.names'
)
#Finally, let's make an igraph version
network_igraph <- asIgraph(network_statnet)
Information about the network data can be accessed by the command: “?sampson”
First, using base R and igraph:
dim(network_edgelist)
[1] 88 3
The dim() command tells us that network_edgelist is a data frame with 88 observations (rows) of 3 variables. As its name in the import script suggests, this is an edgelist (one row per tie: sender, receiver, and number of nominations) and not an adjacency matrix, which would be square (18 x 18 for this network).
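To make the difference between the two shapes concrete, here is a minimal base R sketch. The toy edgelist and the object names (toy_edgelist, toy_adjacency) are made up for illustration and are not the monk data.

```r
# Hypothetical toy edgelist: 4 directed ties among 3 nodes (not the monk data)
toy_edgelist <- data.frame(from = c(1, 1, 2, 3),
                           to   = c(2, 3, 3, 1))
n_nodes <- 3

# Build the matching adjacency matrix: a 1 in row i, column j means i -> j
toy_adjacency <- matrix(0, nrow = n_nodes, ncol = n_nodes)
toy_adjacency[as.matrix(toy_edgelist)] <- 1

dim(toy_edgelist)    # one row per tie
dim(toy_adjacency)   # one row and one column per node, so always square
```

The edgelist grows with the number of ties, while the adjacency matrix grows with the number of nodes, which is why dim() is a quick way to tell them apart.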
is_bipartite(network_igraph)
[1] FALSE
is_directed(network_igraph)
[1] TRUE
is_weighted(network_igraph)
[1] FALSE
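From these commands, we learn that this network is not bipartite, is directed, and is not weighted. Note that is_weighted() only checks whether an edge attribute named weight exists, so the nominations counts are not treated as weights here.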
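The "not weighted" result depends on how igraph looks for weights: is_weighted() only checks for an edge attribute called weight. The sketch below shows this on a small made-up graph (toy_edges and toy_graph are hypothetical names, not part of the import script). Whether nominations should be used as a weight in the monk network is a separate modelling choice.

```r
library(igraph)

# Hypothetical toy edges with a 'nominations' column (not the monk data)
toy_edges <- data.frame(from = c("A", "B", "C"),
                        to   = c("B", "C", "A"),
                        nominations = c(1, 3, 2))
toy_graph <- graph_from_data_frame(toy_edges, directed = TRUE)

# No edge attribute named "weight" yet, so igraph reports unweighted
weighted_before <- is_weighted(toy_graph)

# Copy the nominations into an attribute named "weight"
E(toy_graph)$weight <- E(toy_graph)$nominations
weighted_after <- is_weighted(toy_graph)

c(before = weighted_before, after = weighted_after)
```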
vertex_attr_names(network_igraph)
[1] "cloisterville" "group" "na" "vertex.names"
From here we learn that our nodes have the following attributes (additional information available about each node): cloisterville, group, na and vertex.names.
Note: we have also created a nodes data frame (network_nodes), which has three columns: cloisterville, group and names. The na attribute is not described in ?sampson. It appears to be the missing-data flag that the network package keeps on every vertex and edge, carried over by asIgraph().
edge_attr_names(network_igraph)
[1] "na" "nominations"
Our edge attributes are na and nominations. In this case, the nominations value is “the number of times (out of 3) that monk A nominated monk B.”
We can also utilize the statnet package to learn about our network:
summary(network_statnet)
Network attributes:
vertices = 18
directed = TRUE
hyper = FALSE
loops = FALSE
multiple = FALSE
total edges = 88
missing edges = 0
non-missing edges = 88
density = 0.2875817
Vertex attributes:
cloisterville:
logical valued attribute
attribute summary:
Mode FALSE TRUE
logical 12 6
group:
character valued attribute
attribute summary:
Loyal Outcasts Turks
7 4 7
vertex.names:
character valued attribute
18 valid vertex names
Edge attributes:
nominations:
numeric valued attribute
attribute summary:
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.000 1.000 2.000 1.909 3.000 3.000
Network edgelist matrix:
[,1] [,2]
[1,] 5 1
[2,] 7 1
[3,] 1 2
[4,] 3 2
[5,] 12 2
[6,] 15 2
[7,] 1 3
[8,] 5 4
[9,] 1 5
[10,] 4 5
[11,] 6 5
[12,] 13 7
[13,] 9 8
[14,] 10 8
[15,] 11 8
[16,] 8 9
[17,] 10 9
[18,] 8 10
[19,] 14 12
[20,] 10 13
[21,] 18 13
[22,] 2 15
[23,] 16 15
[24,] 9 16
[25,] 18 17
[26,] 17 18
[27,] 2 1
[28,] 3 1
[29,] 6 1
[30,] 8 1
[31,] 12 1
[32,] 14 1
[33,] 15 1
[34,] 16 1
[35,] 18 1
[36,] 7 2
[37,] 8 2
[38,] 14 2
[39,] 16 2
[40,] 17 2
[41,] 18 2
[42,] 17 3
[43,] 18 3
[44,] 6 4
[45,] 8 4
[46,] 10 4
[47,] 11 4
[48,] 9 5
[49,] 10 5
[50,] 11 5
[51,] 13 5
[52,] 15 5
[53,] 4 6
[54,] 8 6
[55,] 2 7
[56,] 12 7
[57,] 15 7
[58,] 16 7
[59,] 18 7
[60,] 1 8
[61,] 7 8
[62,] 5 9
[63,] 6 9
[64,] 4 10
[65,] 4 11
[66,] 5 11
[67,] 14 11
[68,] 1 12
[69,] 2 12
[70,] 7 12
[71,] 9 12
[72,] 15 12
[73,] 16 12
[74,] 3 13
[75,] 5 13
[76,] 17 13
[77,] 1 14
[78,] 2 14
[79,] 10 14
[80,] 11 14
[81,] 12 14
[82,] 15 14
[83,] 14 15
[84,] 7 16
[85,] 11 16
[86,] 3 17
[87,] 3 18
[88,] 13 18
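The density reported in the summary can be reproduced by hand. This is a small sketch using only the vertex and edge counts printed above. It assumes the usual definition for a directed network without loops, where each of the n vertices can send a tie to each of the other n - 1.

```r
n_vertices <- 18   # from summary(network_statnet)
n_edges    <- 88   # from summary(network_statnet)

# Ordered pairs of distinct vertices: the maximum number of directed ties
possible_ties <- n_vertices * (n_vertices - 1)

density <- n_edges / possible_ties
round(density, 7)
```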
From here, we’ll run some assessments based on our Week 2 tutorial.
First, we’ll run a dyad census:
igraph::dyad.census(network_igraph)
$mut
[1] 28
$asym
[1] 32
$null
[1] 93
There are 153 possible dyads in a group of 18 people. The census tells us that of those 153 dyads, only 28 are mutual (A chooses B and B chooses A). Another 32 are asymmetric, meaning only one member of the dyad chooses the other, and 93, or more than 60%, are null (neither chooses the other).
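The numbers in the dyad census can be cross-checked with a little arithmetic. The sketch below uses only the counts printed above. The variable names are my own.

```r
n_monks    <- 18
mutual     <- 28
asymmetric <- 32
null_dyads <- 93

# Number of unordered pairs of monks
possible_dyads <- choose(n_monks, 2)

# The three categories should account for every dyad
census_total <- mutual + asymmetric + null_dyads

# A mutual dyad contains two directed ties and an asymmetric dyad contains one,
# so this should match the 88 edges reported for the network
implied_edges <- 2 * mutual + asymmetric

# Share of dyads with no tie in either direction
null_share <- null_dyads / possible_dyads

c(possible_dyads = possible_dyads, census_total = census_total,
  implied_edges = implied_edges, null_share = round(null_share, 3))
```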
Next, we’ll look at a triad census (note: there are 816 possible triads in this network). We’ll confirm this:
sum(sna::triad.census(network_statnet, mode="graph"))
[1] 816
#Classify all triads in the network: statnet
#note: omit the 'mode' option for a directed network
sna::triad.census(network_statnet)
003 012 102 021D 021U 021C 111D 111U 030T 030C 201 120D 120U
[1,] 167 205 190 12 24 24 68 34 5 0 35 15 6
120C 210 300
[1,] 5 18 8
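As a sanity check, the 16 triad types should add up to the number of possible triads among 18 monks. This sketch re-enters the counts printed above by hand (triad_counts is my own name for the vector) and compares their sum with choose(18, 3).

```r
# Directed triad census counts copied from the output above
triad_counts <- c("003" = 167, "012" = 205, "102" = 190, "021D" = 12,
                  "021U" = 24, "021C" = 24, "111D" = 68, "111U" = 34,
                  "030T" = 5,  "030C" = 0,  "201" = 35,  "120D" = 15,
                  "120U" = 6,  "120C" = 5,  "210" = 18,  "300" = 8)

# Number of unordered triples of monks
possible_triads <- choose(18, 3)

sum(triad_counts) == possible_triads

# Share of triads with no ties at all (type 003)
empty_share <- unname(triad_counts["003"] / possible_triads)
round(empty_share, 3)
```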
#get network transitivity: igraph
transitivity(network_igraph)
[1] 0.4646739
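This measure states that about 46.5% of the connected triples in our network are closed, that is, they form a triangle. However, igraph's transitivity() treats the graph as undirected, and this is a directed network, so we also compute transitivity with statnet's gtrans(), which takes edge direction into account: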
gtrans(network_statnet)
[1] 0.4074074
We can also compare global transitivity with the average of the local (per-node) transitivity scores.
transitivity(network_igraph, type="global")
[1] 0.4646739
transitivity(network_igraph, type="average")
[1] 0.4925926
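To see why the global and average values can differ, here is a minimal base R sketch on a made-up undirected graph (a triangle with one extra node attached). It follows the textbook definitions: global transitivity is closed triples over connected triples, and local transitivity is the share of a node's neighbor pairs that are themselves tied. The graph and all names are hypothetical, and the way nodes with fewer than two neighbors are dropped is my own choice here, so igraph's handling of those nodes may differ depending on its options.

```r
# Hypothetical toy graph: triangle 1-2-3 plus a pendant node 4 attached to 3
A <- matrix(0, nrow = 4, ncol = 4)
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4))
A[edges] <- 1
A[edges[, 2:1]] <- 1   # mirror each tie so the matrix is symmetric

A2 <- A %*% A          # A2[i, j] counts two-step paths from i to j
A3 <- A2 %*% A         # diag(A3)[i] is twice the number of triangles at i

# Global: closed two-paths divided by all two-paths between distinct nodes
global_transitivity <- sum(diag(A3)) / (sum(A2) - sum(diag(A2)))

# Local: per node, tied neighbor pairs divided by all neighbor pairs
deg <- rowSums(A)
local_transitivity <- diag(A3) / (deg * (deg - 1))   # NaN when degree < 2

# Average local transitivity over nodes where it is defined
average_local <- mean(local_transitivity, na.rm = TRUE)

c(global = global_transitivity, average_local = average_local)
```

In this toy graph the well-connected node 3 pulls the global value down because it sits at the center of most connected triples, while the simple average gives every node the same say.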
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Bates-Haus, Lissie, Ph.D. (2022, Feb. 17). Data Analytics and Computational Social Science: Week 2 Assignment. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomlbateshaus862879/
BibTeX citation
@misc{ph.d.2022week, author = {Bates-Haus, Lissie}, title = {Data Analytics and Computational Social Science: Week 2 Assignment}, url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomlbateshaus862879/}, year = {2022} }