challenge_3
Mekhala Kumar
input output data
Degree and Density of a Network
Author

Mekhala Kumar

Published

March 25, 2023

Reading in data

Code
data<-read_xlsx("_data/got/US_input_output_table_2007_and_2012.xlsx",sheet="2012",skip=3)
New names:
• `` -> `...2`
Code
data<-data%>%
      rename(Code=`Commodity / Industry`,`Commodity Description` =`...2`)
data<-data %>%
  filter(!row_number() %in% c(1))
head(data)
# A tibble: 6 × 407
  Code   Commo…¹ Oilse…² Grain…³ Veget…⁴ Fruit…⁵ Green…⁶ Other…⁷ Dairy…⁸ Beef …⁹
  <chr>  <chr>   <chr>   <chr>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <chr>  
1 1111A0 Oilsee… 5.2417… 1.3503… 2.37e-4 0       0       4.9 e-6  0      0      
2 1111B0 Grain … 0       6.6339… 0       0       0       1.53e-2  0.0450 7.0709…
3 111200 Vegeta… 0       0       3.03e-2 1.66e-4 0       0        0      0      
4 111300 Fruit … 0       0       0       1.15e-2 0       0        0      0      
5 111400 Greenh… 0       0       0       0       1.46e-1 0        0      0      
6 111900 Other … 6.3566… 7.1383… 0       0       2.52e-5 2.69e-2  0.0247 2.0728…
# … with 397 more variables: `Poultry and egg production` <dbl>,
#   `Animal production, except cattle and poultry and eggs` <chr>,
#   `Forestry and logging` <dbl>, `Fishing, hunting and trapping` <dbl>,
#   `Support activities for agriculture and forestry` <dbl>,
#   `Oil and gas extraction` <dbl>, `Coal mining` <dbl>,
#   `Copper, nickel, lead, and zinc mining` <dbl>,
#   `Iron, gold, silver, and other metal ore mining` <chr>, …

Converting into Network data format

I noticed that some of the columns did not match the observations under the Commodity Description column. I looked for the differences manually but was wondering if there is a method in R that can be used to do the same.

Code
adj<-data %>%
  filter(!row_number() %in% c(402:409))
adj<-adj%>%
  select(!(`State and local government electric utilities`))%>%
  select(!(`State and local government passenger transit`))%>%
  select(!(`Federal electric utilities`))%>%
  select(!(`Secondary smelting and alloying of aluminum`))
#state and govt electric utilities, state and local govt passenger transit, federal electric utilities, Secondary smelting and alloying of aluminum
adj<-subset(adj[3:403])
adj_mat <- data.matrix(adj)
table(adj_mat != 0)

FALSE  TRUE 
91289 69512 
Code
io_ig<-graph_from_adjacency_matrix(adj_mat, mode = "directed", weighted = TRUE, diag = FALSE)
io_stat<- network(adj_mat,matrix.type="adjacency", directed=TRUE,loops=FALSE)

Evaluation of the structure of the network

In this challenge, I have ignored the trade occuring within an industry (that is, the loops).

Code
vcount(io_ig)
[1] 401
Code
ecount(io_ig)
[1] 69148
Code
is_bipartite(io_ig)
[1] FALSE
Code
is_directed(io_ig)
[1] TRUE
Code
is_weighted(io_ig)
[1] TRUE
Code
print(io_stat)
 Network attributes:
  vertices = 401 
  directed = TRUE 
  hyper = FALSE 
  loops = FALSE 
  multiple = FALSE 
  bipartite = FALSE 
  total edges= 69148 
    missing edges= 0 
    non-missing edges= 69148 

 Vertex attribute names: 
    vertex.names 

 Edge attribute names not shown 
Code
igraph::dyad.census(io_ig)
$mut
[1] 16448

$asym
[1] 36252

$null
[1] 27500
Code
sna::dyad.census(io_stat)
       Mut  Asym  Null
[1,] 16448 36252 27500
Code
igraph::triad.census(io_ig)
 [1] 1458720 1171405  143979 1183985 1221064  136654  371468  571853 1127598
[10]    4079  480548  489333  749868  212507  967969  375570
Code
sna::triad.census(io_stat)
         003     012    102    021D    021U   021C   111D   111U    030T 030C
[1,] 1458720 1171405 143979 1183985 1221064 136654 371468 571853 1127598 4079
        201   120D   120U   120C    210    300
[1,] 480548 489333 749868 212507 967969 375570
Code
#Global
transitivity(io_ig, type="global")
[1] 0.7481592
Code
##Average local clustering coefficient
transitivity(io_ig, type="average")
[1] 0.8429528
Code
gtrans(io_stat)
[1] 0.7225683
Code
names(igraph::components(io_ig))
[1] "membership" "csize"      "no"        
Code
igraph::components(io_ig)$no 
[1] 1
Code
igraph::components(io_ig)$csize
[1] 401
Code
isolates(io_stat)
integer(0)

Degree

The average degree of the network is 344.9. The mean for both the indegree and outdegree are the same but the median is slightly lower for the outdegree than for the indegree. This means the nodes receive more links than send links. However, since the difference in the medians for the indegree and outdegree is only 15, it shows more of a mutual connection between the industries rather than a particular industry dominating.

Code
#igraph::degree(io_ig)
#sna::degree(io_stat)

nodes_ig<-data.frame(name=V(io_ig)$name, degree=igraph::degree(io_ig,loops=FALSE))
nodes_ig<-nodes_ig %>%
    mutate(indegree=igraph::degree(io_ig, mode="in", loops=FALSE),
           outdegree=igraph::degree(io_ig, mode="out", loops=FALSE))
head(nodes_ig)
                                                                                             name
Oilseed farming                                                                   Oilseed farming
Grain farming                                                                       Grain farming
Vegetable and melon farming                                           Vegetable and melon farming
Fruit and tree nut farming                                             Fruit and tree nut farming
Greenhouse, nursery, and floriculture production Greenhouse, nursery, and floriculture production
Other crop farming                                                             Other crop farming
                                                 degree indegree outdegree
Oilseed farming                                     484      400        84
Grain farming                                       492      400        92
Vegetable and melon farming                         183      102        81
Fruit and tree nut farming                          184       96        88
Greenhouse, nursery, and floriculture production    189       91        98
Other crop farming                                  224      117       107
Code
nodes_stat<-data.frame(name=io_stat%v%"vertex.names", degree=sna::degree(io_stat))
nodes_stat<-nodes_stat %>%
    mutate(indegree=sna::degree(io_stat, cmode="indegree"),
          outdegree=sna::degree(io_stat, cmode="outdegree"))

head(nodes_stat)
                                              name degree indegree outdegree
1                                  Oilseed farming    484      400        84
2                                    Grain farming    492      400        92
3                      Vegetable and melon farming    183      102        81
4                       Fruit and tree nut farming    184       96        88
5 Greenhouse, nursery, and floriculture production    189       91        98
6                               Other crop farming    224      117       107
Code
summary(nodes_ig)
     name               degree         indegree       outdegree    
 Length:401         Min.   : 68.0   Min.   :  0.0   Min.   : 67.0  
 Class :character   1st Qu.:208.0   1st Qu.:110.0   1st Qu.: 79.0  
 Mode  :character   Median :279.0   Median :129.0   Median :114.0  
                    Mean   :344.9   Mean   :172.4   Mean   :172.4  
                    3rd Qu.:485.0   3rd Qu.:164.0   3rd Qu.:251.0  
                    Max.   :799.0   Max.   :400.0   Max.   :399.0  
Code
summary(nodes_stat)
     name               degree         indegree       outdegree    
 Length:401         Min.   : 68.0   Min.   :  0.0   Min.   : 67.0  
 Class :character   1st Qu.:208.0   1st Qu.:110.0   1st Qu.: 79.0  
 Mode  :character   Median :279.0   Median :129.0   Median :114.0  
                    Mean   :344.9   Mean   :172.4   Mean   :172.4  
                    3rd Qu.:485.0   3rd Qu.:164.0   3rd Qu.:251.0  
                    Max.   :799.0   Max.   :400.0   Max.   :399.0  

Density

The density of the network is a global measure.

Code
graph.density(io_ig)
[1] 0.4310973
Code
network.density(io_stat)
[1] 0.4310973

Random Network

The dyad and triad census as well as the distributions differ from the input output network. However, the density of the network is the same.

Code
erdos_renyi_net <- erdos.renyi.game(401, 69148, type = "gnm", directed = TRUE)

Density of random network

Code
graph.density(erdos_renyi_net)
[1] 0.4310973

Dyad and Triad Census of Random network

Code
igraph::dyad.census(erdos_renyi_net)
$mut
[1] 14900

$asym
[1] 39348

$null
[1] 25952
Code
igraph::triad.census(erdos_renyi_net)
 [1]  361185 1644004  623689  623753  621886 1244376  943714  943402  946000
[10]  315548  358776  357667  357171  716768  540846   67815

Degree distributions of Random network

Code
nodes_ern<-data.frame(degree=igraph::degree(erdos_renyi_net))
nodes_ern<-nodes_ern %>%
    mutate(indegree=igraph::degree(erdos_renyi_net, mode="in", loops=FALSE),
           outdegree=igraph::degree(erdos_renyi_net, mode="out", loops=FALSE))
head(nodes_ern)
  degree indegree outdegree
1    335      166       169
2    376      180       196
3    348      173       175
4    327      158       169
5    334      169       165
6    344      174       170
Code
summary(nodes_ern)
     degree         indegree       outdegree    
 Min.   :304.0   Min.   :142.0   Min.   :148.0  
 1st Qu.:335.0   1st Qu.:165.0   1st Qu.:166.0  
 Median :345.0   Median :173.0   Median :173.0  
 Mean   :344.9   Mean   :172.4   Mean   :172.4  
 3rd Qu.:355.0   3rd Qu.:179.0   3rd Qu.:179.0  
 Max.   :380.0   Max.   :199.0   Max.   :202.0