Describing the Network Data

From raw data to network data

Santiago Virgüez
2022-02-02

As mentioned in the previous post, I’m working with a new database on the IAcHR rulings compiled by the PluriCourts project of the University of Oslo (Stiansen, Naurin, and Bøyum 2020). This dataset of amici actors has 425 observations of 2 variables (“Name of Amicus” and “Case ID”), indicating the cases in which each amicus actor participated:

A view of the dataset on amici

From raw data to an adjacency matrix

In order to work with this dataset, I need to put the data into a format that is suitable for network analysis. So, after cleaning the data, I create an n × n adjacency matrix that shows how many times each pair of interveners has filed an amicus brief in the same case. I assigned “0” to the diagonal of the matrix to ignore ties from a node to itself.

data <- read.csv("CleanedData.csv")

#New column count
data$count <- 1

library(tidyr)

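#Reshape to one row per case and one column per amicus (cell = number of briefs filed)
lev_data <- pivot_wider(data, id_cols = CaseID, names_from = `Name.of.Amicus`, values_from = count, values_fn = list(count = length), values_fill = list(count = 0))

#Create the adjacency matrix: drop the CaseID column, then multiply the
#transposed case-by-amicus matrix by itself to count shared cases per pair
mat <- as.matrix(lev_data[-1])
ad_matrix <- t(mat) %*% mat
#Ignore ties from a node to itself
diag(ad_matrix) <- 0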
Adjacency matrix
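
To make the matrix step easier to follow, here is a minimal sketch on toy data. The three cases and the amici "A", "B" and "C" are hypothetical and do not come from the PluriCourts dataset; the point is only to show how multiplying the transposed case-by-amicus matrix by itself counts shared cases.

```r
# Toy case-by-amicus matrix (hypothetical values, not from the dataset):
# rows are cases, columns are amici, 1 = the amicus filed a brief in that case
inc <- matrix(
  c(1, 1, 0,
    1, 1, 1,
    0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("case1", "case2", "case3"), c("A", "B", "C"))
)

# Each off-diagonal cell counts the cases shared by a pair of amici;
# before zeroing, the diagonal holds each amicus's own number of cases
co <- t(inc) %*% inc
diag(co) <- 0
co
```

In this toy example A and B share two cases, while C shares one case with each of them, so the resulting matrix is symmetric with a zero diagonal.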

Describing the network dataset

Before describing the network dataset, I first need to create a network object from the adjacency matrix built above:

library(igraph)
#graph_from_adjacency_matrix() is the current name of the deprecated graph.adjacency()
amici_network <- graph_from_adjacency_matrix(ad_matrix, mode = "undirected", weighted = TRUE)

is_directed(amici_network)
[1] FALSE
is_weighted(amici_network)
[1] TRUE
is_bipartite(amici_network)
[1] FALSE

Now I can describe the content of the nodes and ties and identify the format of the dataset. From the previous steps we know that the data are stored as a symmetric, weighted matrix. The nodes are amici, and a tie between two amici means that both filed a brief before the court in the same case, so the relation is symmetric (undirected). Each tie is valued according to how many times the two amici have intervened in the same cases.
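
As a rough illustration of what "symmetric and weighted" means here, the sketch below turns a small co-participation matrix into a weighted edge list using base R only. The matrix values and the names "A", "B" and "C" are hypothetical; with the real data the same steps would presumably be applied to ad_matrix.

```r
# Hypothetical 3 x 3 co-participation matrix (toy values, not from the dataset)
m <- matrix(
  c(0, 2, 1,
    2, 0, 0,
    1, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), c("A", "B", "C"))
)

# Symmetric: the tie from A to B is the same as the tie from B to A
isSymmetric(m)

# Because of the symmetry, the upper triangle is enough to list every tie once
idx <- which(upper.tri(m) & m > 0, arr.ind = TRUE)
edges <- data.frame(
  from   = rownames(m)[idx[, "row"]],
  to     = colnames(m)[idx[, "col"]],
  weight = m[idx]
)
edges
```

Each row of the edge list is one undirected tie, and the weight column plays the same role as the "weight" edge attribute of the network object.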

We can also identify the network attributes:

#size
vcount(amici_network)
[1] 403
ecount(amici_network)
[1] 3615
#attribute names and content
vertex_attr_names(amici_network)
[1] "name"
edge_attr_names(amici_network)
[1] "weight"
head(V(amici_network)$name)
[1] "Fernando Linares Beltranena"                                                     
[2] "Amnesty International"                                                           
[3] "Association of the Bar of the City of New York"                                  
[4] "Lawyers Committee for Human Rights"                                              
[5] " The Central American Associaion of Families of Detained and Disappeared Persons"
[6] "Minnesota Lawyers International Human Rights Committee"                          
head(E(amici_network)$weight)
[1] 2 2 1 3 1 1

It is also possible to describe the network structure:

#Dyad census: because the ties are undirected, we expect no asymmetric dyads
#(dyad_census() is the current name of the deprecated dyad.census())
igraph::dyad_census(amici_network)
$mut
[1] 3615

$asym
[1] 0

$null
[1] 77388
#Triad census
igraph::triad_census(amici_network)
 [1] 9455588       0 1331612       0       0       0       0       0
 [9]       0       0    2600       0       0       0       0   37601
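
As a sanity check on the two censuses, the sketch below re-enters the printed counts by hand and compares them with the number of possible pairs and triples among 403 nodes. The triad labels are the standard MAN codes that I assume correspond to positions 1, 3, 11 and 16 of the census vector; the density figure is my own derived summary, not part of the output above.

```r
# Values copied from the census output above
n_nodes <- 403
dyads <- c(mut = 3615, asym = 0, null = 77388)

# In an undirected graph only four triad types can occur:
# 003 = no ties, 102 = one tie, 201 = two ties, 300 = three ties
triads <- c("003" = 9455588, "102" = 1331612, "201" = 2600, "300" = 37601)

# Every pair of nodes is either connected or not
sum(dyads) == choose(n_nodes, 2)

# Every set of three nodes falls into exactly one triad type
sum(triads) == choose(n_nodes, 3)

# Density: share of possible ties that are actually present
density <- unname(dyads["mut"] / choose(n_nodes, 2))
round(density, 4)
```

Both checks hold, and the density works out to roughly 4.5% of all possible ties.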

In terms of transitivity, or global clustering, the proportion of connected triads in the amici network that are complete is high (about 0.98), meaning that almost all connected triads are transitive. Likewise, the average local clustering coefficient, which puts more emphasis on low-degree nodes, confirms the high transitivity of the network (about 0.99):

#global clustering
transitivity(amici_network)
[1] 0.9774703
#average local clustering coefficient
transitivity(amici_network, type="average")
[1] 0.9920321
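
The global value can also be recovered by hand from the triad census, which helps to see what the coefficient measures. This is a minimal sketch assuming the usual definition of global transitivity (closed connected triples over all connected triples) and using the counts for triad types 201 and 300 reported earlier.

```r
# Triad counts copied from the triad census output
open_triads   <- 2600    # type 201: two ties present, one missing
closed_triads <- 37601   # type 300: all three ties present

# A triangle contains three connected triples, an open triad contains one
closed_triples <- 3 * closed_triads
global_cc <- closed_triples / (closed_triples + open_triads)
round(global_cc, 7)
```

The result agrees with the value returned by transitivity(amici_network): only 2,600 connected triads are left open, against 37,601 triangles.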

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Virgüez (2022, Feb. 17). Data Analytics and Computational Social Science: Describing the Network Data. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpssvirguezgithubioichrnetworksposts2022-02-02-describing-the-network-data/

BibTeX citation

@misc{virgüez2022describing,
  author = {Virgüez, Santiago},
  title = {Data Analytics and Computational Social Science: Describing the Network Data},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpssvirguezgithubioichrnetworksposts2022-02-02-describing-the-network-data/},
  year = {2022}
}