Describing Network Data
I am working with the Florentine Families dataset (flo), accessed from the network package. The dataset is already in a format suitable for network analysis (an adjacency matrix) and is ready to work with. The matrix shows the marriage links between different families in Florence: a 1 indicates the presence of a marriage link, while a 0 indicates its absence.
library(network)
library(igraph)
library(statnet)
data("flo")
network_adjacencey <- flo
network_adjacencey
Acciaiuoli Albizzi Barbadori Bischeri Castellani Ginori
Acciaiuoli 0 0 0 0 0 0
Albizzi 0 0 0 0 0 1
Barbadori 0 0 0 0 1 0
Bischeri 0 0 0 0 0 0
Castellani 0 0 1 0 0 0
Ginori 0 1 0 0 0 0
Guadagni 0 1 0 1 0 0
Lamberteschi 0 0 0 0 0 0
Medici 1 1 1 0 0 0
Pazzi 0 0 0 0 0 0
Peruzzi 0 0 0 1 1 0
Pucci 0 0 0 0 0 0
Ridolfi 0 0 0 0 0 0
Salviati 0 0 0 0 0 0
Strozzi 0 0 0 1 1 0
Tornabuoni 0 0 0 0 0 0
Guadagni Lamberteschi Medici Pazzi Peruzzi Pucci Ridolfi
Acciaiuoli 0 0 1 0 0 0 0
Albizzi 1 0 1 0 0 0 0
Barbadori 0 0 1 0 0 0 0
Bischeri 1 0 0 0 1 0 0
Castellani 0 0 0 0 1 0 0
Ginori 0 0 0 0 0 0 0
Guadagni 0 1 0 0 0 0 0
Lamberteschi 1 0 0 0 0 0 0
Medici 0 0 0 0 0 0 1
Pazzi 0 0 0 0 0 0 0
Peruzzi 0 0 0 0 0 0 0
Pucci 0 0 0 0 0 0 0
Ridolfi 0 0 1 0 0 0 0
Salviati 0 0 1 1 0 0 0
Strozzi 0 0 0 0 1 0 1
Tornabuoni 1 0 1 0 0 0 1
Salviati Strozzi Tornabuoni
Acciaiuoli 0 0 0
Albizzi 0 0 0
Barbadori 0 0 0
Bischeri 0 1 0
Castellani 0 1 0
Ginori 0 0 0
Guadagni 0 0 1
Lamberteschi 0 0 0
Medici 1 0 1
Pazzi 1 0 0
Peruzzi 0 1 0
Pucci 0 0 0
Ridolfi 0 1 1
Salviati 0 0 0
Strozzi 0 0 0
Tornabuoni 0 0 0
Following the script, I've created both a statnet and an igraph network object from the dataset (adjacency matrix).
network_statnet <- network(network_adjacencey, directed = FALSE)
network_igraph <- graph_from_adjacency_matrix(network_adjacencey, mode = "upper", weighted = NULL)
List of the objects available:
ls()
[1] "flo" "network_adjacencey" "network_igraph"
[4] "network_statnet"
We already know that the Florentine Families dataset is in the format of a matrix. In terms of network size, the network has 16 vertices or nodes (16 families) connected by 20 edges (representing ties of marriage, in this case).
print(network_statnet)
Network attributes:
vertices = 16
directed = FALSE
hyper = FALSE
loops = FALSE
multiple = FALSE
bipartite = FALSE
total edges= 20
missing edges= 0
non-missing edges= 20
Vertex attribute names:
vertex.names
No edge attributes
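As a cross-check, the same size figures can be read straight off the adjacency matrix with base R. This is only a minimal sketch, assuming flo is symmetric with a zero diagonal (as the printed matrix suggests); the names n_nodes, n_edges, possible and density are introduced here for illustration.

```r
# Load only the dataset; no network object is needed for this check.
data(flo, package = "network")

n_nodes  <- nrow(flo)             # one row per family
n_edges  <- sum(flo) / 2          # each undirected tie appears twice in the matrix
possible <- choose(n_nodes, 2)    # number of distinct pairs of families
density  <- n_edges / possible    # share of possible ties that are present

c(nodes = n_nodes, edges = n_edges, possible = possible, density = density)
```

Dividing the matrix sum by two only works because every marriage is recorded in both directions; for a directed network the sum itself would be the edge count.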
When it comes to network features, running the code below shows that the network is unweighted/binary (with 0 and 1 indicating the absence and presence of a marriage tie, respectively), undirected (the relationship between nodes is inherently symmetric, as marriage relationships are), and single-mode (not bipartite).
is_weighted(network_igraph)
[1] FALSE
is_directed(network_igraph)
[1] FALSE
is_bipartite(network_igraph)
[1] FALSE
Vertex and edge attributes:
network::list.vertex.attributes(network_statnet)
[1] "na" "vertex.names"
network::list.edge.attributes(network_statnet)
[1] "na"
Network structure comments
sna::dyad.census(network_statnet)
Mut Asym Null
[1,] 20 0 100
Since the ties are undirected, the dyad census confirms what we expect: there are no asymmetric dyads. In other words, all 20 edges are reciprocal/mutual, and the remaining 100 of the 120 possible dyads are null.
sna::triad.census(network_statnet, mode = "graph")
0 1 2 3
[1,] 324 195 38 3
sum(sna::triad.census(network_statnet, mode = "graph"))
[1] 560
The statnet command allowed me to indicate that the network is undirected with the option mode = "graph". For an undirected network there are 4 triad types (triads with 0, 1, 2 or 3 edges), and the census counts 560 triads in total.
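The census totals can be sanity-checked with a little arithmetic: with 16 families there are choose(16, 2) possible dyads and choose(16, 3) possible triads. The sketch below copies the counts from the output above into two illustrative vectors (dyads and triads are names made up for this check, not objects from the analysis).

```r
# Counts copied from the dyad and triad census output above.
dyads  <- c(Mut = 20, Asym = 0, Null = 100)
triads <- c(`0` = 324, `1` = 195, `2` = 38, `3` = 3)

n <- 16  # number of families

# Both comparisons should be TRUE if the censuses cover every pair and triple.
sum(dyads)  == choose(n, 2)
sum(triads) == choose(n, 3)
```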
How transitive are the relationships? What proportion of the connected triads are complete?
Local clustering (below)
transitivity(network_igraph, type = "average")
[1] 0.2181818
Global clustering (below)
transitivity(network_igraph, type = "global")
[1] 0.1914894
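The average local clustering coefficient (0.218) is slightly higher than the global one (0.191), but both are low: only about 19% of connected triads are complete. My reading of this for the marriage dataset is that when one family is tied by marriage to two other families, those two families are themselves linked by marriage in only about one case in five.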
gtrans(network_statnet)
[1] 0.1914894
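The global figure can be reproduced by hand from the triad census, which may help with interpreting it. The sketch below uses the standard definition of global transitivity (closed connected triples divided by all connected triples) and the census counts reported earlier; the variable names are illustrative.

```r
# Counts taken from the undirected triad census reported earlier.
two_edge_triads   <- 38  # open triads: a path A-B-C with no A-C tie
three_edge_triads <- 3   # closed triads (triangles)

# Each triangle contains three connected triples, one centred on each vertex.
closed_triples    <- 3 * three_edge_triads
connected_triples <- two_edge_triads + closed_triples

global_transitivity <- closed_triples / connected_triples
global_transitivity
```

So the 0.19 comes from 9 closed triples out of 47 connected triples: in most cases where a family has marriage ties to two others, those two are not themselves tied.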
Average path length in whole network:
average.path.length(network_igraph, directed = FALSE)
[1] 2.485714
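Because the network turns out not to be fully connected, it is worth being explicit about what this average covers. As far as I understand igraph's default behaviour, pairs of nodes with no path between them are left out of the average. The sketch below recomputes the figure by hand under that assumption, using distances() and keeping only finite entries; g, d, pair_dist and reachable are illustrative names.

```r
library(igraph)
data(flo, package = "network")

# Rebuild the igraph object the same way as earlier in the post.
g <- graph_from_adjacency_matrix(flo, mode = "upper")

d         <- distances(g)          # shortest-path lengths; Inf for unreachable pairs
pair_dist <- d[upper.tri(d)]       # keep each unordered pair once
reachable <- is.finite(pair_dist)  # drop pairs with no path between them

n_reachable <- sum(reachable)
avg_path    <- mean(pair_dist[reachable])
c(reachable_pairs = n_reachable, average_path_length = avg_path)
```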
names(igraph::components(network_igraph))
[1] "membership" "csize" "no"
igraph::components(network_igraph)$no
[1] 2
The network has 2 components…
igraph::components(network_igraph)$csize
[1] 15 1
… one of the components is large, with 15 members, and the other has a single member.
isolates(network_statnet)
[1] 12
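isolates() returns the index of each isolate rather than a count, so the output above means there is a single isolate, vertex 12: one node that has no link to the rest of the network. This matches the component of size 1 found above.
Retrieving the name of vertex 12 shows that it is the Pucci family, which is not linked by marriage to any of the other Florentine families.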
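One way to retrieve the isolate's name without relying on either network object is to look for rows of the adjacency matrix that sum to zero. This is a minimal base R sketch; degree, isolate_idx and isolate_names are illustrative names, not part of the original script.

```r
data(flo, package = "network")

degree        <- rowSums(flo)          # number of marriage ties per family
isolate_idx   <- which(degree == 0)    # positions of families with no ties
isolate_names <- rownames(flo)[isolate_idx]

isolate_idx
isolate_names
```

With the statnet object created earlier, something like network.vertex.names(network_statnet)[isolates(network_statnet)] should give the same name.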
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Silva (2022, Feb. 17). Data Analytics and Computational Social Science: Short Assignment 2. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomecaetanoesil865861/
BibTeX citation
@misc{silva2022short, author = {Silva, Eunice C.}, title = {Data Analytics and Computational Social Science: Short Assignment 2}, url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomecaetanoesil865861/}, year = {2022} }