Data Analytics and Computational Social Science: Community Detection

Santiago Virgüez

Recap affiliation and one-mode network

Let’s organize the data one more time. I will again work with the one-mode matrix (actor x actor). As you may remember, this one-mode matrix is a projection of the affiliation network (actor x case), which means that the actors (the nodes) are tied by virtue of their participation in the same cases.

#Affiliation network data
library(igraph)
library(tidyr)
data <- read.csv("Cleaned_Data.csv")
##New column count
data$count <- 1

##Gather the data at case level
lev_data <- pivot_wider(data,id_cols = CaseID, names_from = `Name`, 
                        values_from = count, values_fn = list(count = length), 
                        values_fill = list(count = 0))

##transpose lev_data to have amici grouped by case
library(data.table)
T_lev_data <- transpose(lev_data)
rownames(T_lev_data) <- colnames(lev_data)
colnames(T_lev_data) <- lev_data$CaseID
T_lev_data <- T_lev_data[-c(1),]

##create affiliation network graph using 'igraph'
Aff_network <- graph.incidence(T_lev_data)

############

#One-Mode matrix (actor x actor)

##extracting the one-mode projection
Aff_network.pr <- bipartite.projection(Aff_network)

##Actor x actor adjacency matrix

amici_net <- Aff_network.pr$proj1

amici_ad <- graph.adjacency(get.adjacency(amici_net, sparse = FALSE,attr = "weight"))
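To make the projection step concrete, here is a minimal sketch in base R on a made-up affiliation matrix (the actors `A` to `D` and the two cases are hypothetical, not taken from `Cleaned_Data.csv`). It assumes the projection weight is simply the number of shared cases, which is how I understand the default behaviour of `bipartite.projection`.

```r
# Toy affiliation matrix: rows are actors, columns are cases (hypothetical data).
aff <- matrix(c(1, 0,
                1, 0,
                1, 1,
                0, 1),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("A", "B", "C", "D"), c("case1", "case2")))

# Actor x actor projection: entry [i, j] counts the cases actors i and j share.
proj <- aff %*% t(aff)
diag(proj) <- 0   # drop self-ties (an actor's own case count)
proj
```

Actors `A`, `B` and `C` all appear in `case1`, so they form a fully connected group; `C` also appears in `case2` and therefore bridges to `D`. This is the pattern we should expect at a larger scale in the amici network.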

Before testing the different community detection methods, recall that the actor x actor one-mode matrix is a projection of an affiliation network in which amici are tied to one another by virtue of participating in the same case. We therefore expect the communities among amici to correspond largely to the cases in which they participate together, with some exceptions for actors that participate in several cases.

Substantively, the community detection algorithms are unlikely to reveal much new information about clustering, but it is still interesting to see their outcomes. Because of this case-based clustering, we may expect each algorithm to produce similar communities within the actor x actor network.

#Node data frame
amici.nodes<-data.frame(name=V(amici_ad)$name,
                              degree=igraph::degree(amici_ad),
                              degree.wt=strength(amici_ad),
                              betweenness=igraph::betweenness(amici_ad, directed=FALSE),
                              close=igraph::closeness(amici_ad),
                              constraint=constraint(amici_ad))

temp<-centr_eigen(amici_ad,directed=FALSE)
amici.nodes$eigen<-temp$vector

Fast and Greedy Community Detection

comm.fg<-cluster_fast_greedy(as.undirected(amici_ad))
comm.fg

IGRAPH clustering fast greedy, groups: 46, mod: 0.74
+ groups:
  $`1`
   [1] "Rights International"                                                               
   [2] "The International Foundation for the Protection of Human Rights Defenders"          
   [3] "World Organisation Against Torture"                                                 
   [4] "Corporacion Colectivo de Abogados Jose Alvear Restrepo"                             
   [5] "Movimiento Nacional de\nDerechos Humanos"                                           
   [6] "Una Ventana a la Libertad"                                                          
   [7] "Comite de Familiares de Detenidos Desaparecidos"                                    
   [8] "Robert F. Kennedy Memorial Center for Human Rights"                                 
   [9] "Centro de Derechos Economicos y Sociales"                                           
  + ... omitted several groups/vertices

amici.nodes$comm.fg<-comm.fg$membership

plot(comm.fg,amici_ad, vertex.label=NA)
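The `mod` value in the printed output above is the modularity of the partition, the quantity that the fast greedy algorithm tries to maximize. As a rough sketch of what that number measures, the example below computes modularity on a small made-up graph (two triangles joined by one edge, not the amici data) both with `modularity()` and by hand. The object names `g`, `memb`, `q_igraph` and `q_hand` are only illustrative.

```r
library(igraph)

# Toy graph (hypothetical): two triangles, 1-2-3 and 4-5-6, joined by edge 3-4.
g <- make_graph(c(1,2, 2,3, 1,3, 4,5, 5,6, 4,6, 3,4), directed = FALSE)
memb <- c(1, 1, 1, 2, 2, 2)

# Modularity as computed by igraph.
q_igraph <- modularity(g, memb)

# Same quantity by hand: for each community, the share of edges inside it
# minus the share expected from its total degree.
m <- ecount(g)
deg <- degree(g)
q_hand <- sum(sapply(unique(memb), function(k) {
  inside <- ecount(induced_subgraph(g, which(memb == k)))
  inside / m - (sum(deg[memb == k]) / (2 * m))^2
}))

c(igraph = q_igraph, by_hand = q_hand)
```

Both values should equal 5/14 (about 0.357). Values closer to 1, such as the 0.74 reported above, indicate that far more ties fall inside communities than would be expected by chance.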

Walktrap Clustering

comm.wt<-walktrap.community(amici_ad)
comm.wt

IGRAPH clustering walktrap, groups: 59, mod: 0.75
+ groups:
  $`1`
  [1] "Human Rights Clinic of the Universidad de Palermo"                         
  [2] "Universidad Carlos III"                                                    
  [3] "World Press Freedom Committee"                                             
  [4] "Equal Rights Trust"                                                        
  [5] "Asylum and Human Rights Clinic of the Boston University School of Law"     
  [6] "Consejo Latinoamericano de Estudiosos de Derecho Internacional y Comparado"
  
  $`2`
   [1] "Amnesty International"                                              
  + ... omitted several groups/vertices

amici.nodes$comm.wt<-comm.wt$membership

plot(comm.wt,amici_ad, vertex.label=NA)

Label Propagation Community Detection

comm.lab<-label.propagation.community(as.undirected(amici_ad))
comm.lab

IGRAPH clustering label propagation, groups: 57, mod: 0.75
+ groups:
  $`1`
  [1] "Fernando Linares"
  
  $`2`
   [1] "Amnesty International"                                              
   [2] "Legal Research Institute UNAM"                                      
   [3] "International Reproductive and Sexual Health Law Program"           
   [4] "University of Toronto Law School"                                   
   [5] "Women's Link Worldwide"                                             
   [6] "World Organization Against Torture"                                 
  + ... omitted several groups/vertices

amici.nodes$comm.lab<-comm.lab$membership

plot(comm.lab,amici_ad, vertex.label=NA)
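One caveat worth keeping in mind: label propagation relies on random updates, so repeated runs on the same network can return slightly different partitions. A minimal sketch of how one might make a run reproducible, using a made-up toy graph and `cluster_label_prop()` (the current igraph name for `label.propagation.community()`), and assuming igraph draws from R's random number generator:

```r
library(igraph)

# Toy graph (hypothetical): two triangles joined by a single edge.
g <- make_graph(c(1,2, 2,3, 1,3, 4,5, 5,6, 4,6, 3,4), directed = FALSE)

# Fixing the seed before each call makes the two runs use the same random draws.
set.seed(123)
run1 <- as.vector(membership(cluster_label_prop(g)))
set.seed(123)
run2 <- as.vector(membership(cluster_label_prop(g)))

identical(run1, run2)
```

The seed value `123` is arbitrary. Without a fixed seed, the number of groups reported for the amici network may change from one run to the next.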

Comparing Community Partitions

From the figures, all three partition methods appear to detect similar communities. To get a better sense of their differences, we need to compare the outcomes directly.

mods<-c(fastgreedy=modularity(comm.fg), walktrap=modularity(comm.wt), labelprop=modularity(comm.lab))
mods

fastgreedy   walktrap  labelprop 
 0.7357490  0.7469115  0.7505719

As expected, the three methods yield very similar modularity values (about 0.74 to 0.75). Comparing the partitions directly with several similarity measures shows that walktrap and label propagation agree almost perfectly, while fast greedy differs somewhat more from label propagation:

compare.algs<-function(alg.a,alg.b,compare.meth=c("vi", "nmi", "split.join", "rand", "adjusted.rand")){
  #create list of community objects and methods
  comm.compare<-expand.grid(alg.a=alg.a, alg.b=alg.b, meth=compare.meth, result=NA, stringsAsFactors = FALSE)
  #compare community partitions using a loop
  for(i in 1:nrow(comm.compare)){
    comm1<-get(comm.compare$alg.a[i])
    comm2<-get(comm.compare$alg.b[i])
    method<-comm.compare$meth[i]
    comm.compare$result[i]<-compare(comm1, comm2, method)
  }
  return(comm.compare)
}

compare.algs(alg.a=c("comm.fg","comm.wt"),alg.b="comm.lab")

     alg.a    alg.b          meth     result
1  comm.fg comm.lab            vi  0.6598215
2  comm.wt comm.lab            vi  0.1560076
3  comm.fg comm.lab           nmi  0.8899333
4  comm.wt comm.lab           nmi  0.9761064
5  comm.fg comm.lab    split.join 95.0000000
6  comm.wt comm.lab    split.join 21.0000000
7  comm.fg comm.lab          rand  0.9529987
8  comm.wt comm.lab          rand  0.9943683
9  comm.fg comm.lab adjusted.rand  0.6931815
10 comm.wt comm.lab adjusted.rand  0.9520560
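When reading the table above, it helps to know which direction each measure runs. `vi` and `split.join` are distances (0 means identical partitions, larger means more different), whereas `nmi`, `rand` and `adjusted.rand` are similarities (1 means identical partitions). The sketch below illustrates this with made-up membership vectors `a`, `b` and `d` for six nodes; these are hypothetical and unrelated to the amici data.

```r
library(igraph)

# Hypothetical membership vectors for six nodes.
a <- c(1, 1, 1, 2, 2, 2)
b <- c(2, 2, 2, 1, 1, 1)   # same grouping as `a`, labels swapped
d <- c(1, 1, 2, 2, 3, 3)   # a different grouping

meths <- c("vi", "nmi", "split.join", "rand", "adjusted.rand")

# Identical groupings: distances (vi, split.join) are 0, similarities are 1.
same <- sapply(meths, function(m) compare(a, b, method = m))
# Different groupings: distances rise above 0, similarities fall below 1.
differ <- sapply(meths, function(m) compare(a, d, method = m))

round(rbind(same, differ), 3)
```

Note that swapping the group labels, as in `b`, does not matter: the measures only look at which nodes are grouped together.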
