Week2_Challenge_Niharika Pola

challenge_1
instructions
Describing the Basic Structure of a Network
Author

Niharika Pola

Published

February 22, 2023

Challenge Overview

Describe the basic structure of a network following the steps in the week 2 tutorial, this time using a dataset of your choice: for instance, you could use Marriages in Game of Thrones or Like/Dislike from week 1.

Another, more complex option is the newly added dataset of the US input-output table of direct requirements by industry, available from the Bureau of Economic Analysis. Input-output tables show the economic transactions between the industries of an economy and thus can be understood as a directed adjacency matrix. The data is provided as an `XLSX` file, so using `read_xlsx` from package `readxl` is recommended, including the `sheet` as an argument (`2012` for instance).
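To make the adjacency-matrix reading concrete, here is a minimal sketch of turning a small matrix into a directed, weighted `igraph` network. The three industries and all cell values are made up for illustration and are not BEA data; treating rows as suppliers and columns as users is also an assumption.

```r
library(igraph)

# Hypothetical 3-industry direct-requirements matrix (made-up values, not BEA data).
# Assumed orientation: rows are supplying industries, columns are using industries.
io <- matrix(
  c(0.10, 0.20, 0.00,
    0.05, 0.00, 0.30,
    0.00, 0.15, 0.25),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Agriculture", "Manufacturing", "Services"),
                  c("Agriculture", "Manufacturing", "Services"))
)

# Each nonzero cell becomes a directed edge; the cell value is stored as the edge weight.
io_net <- graph_from_adjacency_matrix(io, mode = "directed", weighted = TRUE)

vcount(io_net)       # number of industries
ecount(io_net)       # number of nonzero cells, including self-loops on the diagonal
is_weighted(io_net)  # TRUE because a "weight" edge attribute was created
E(io_net)$weight
```

With a real table, the matrix would come from the sheet read by `read_xlsx`, after dropping label rows and columns so that only the numeric block remains.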

Identify and describe the content of nodes and links, and identify the format of the data set (i.e., matrix or edgelist, directed or not, weighted or not), and whether attribute data are present. Be sure to provide information about network size (e.g., information obtained from the network description using the basic network commands from the week 1 tutorial).

Explore the dataset using commands from the week 2 tutorial. Comment on the highlighted aspects of network structure such as:

- Geodesic and Path Distances; Path Length

- Dyads and Dyad Census

- Triads and Triad Census

- Network Transitivity and Clustering

- Component Structure and Membership

Be sure to both provide the relevant statistics calculated in `R`, as well as your own interpretation of these statistics.

Describe the Network Data

1. *List and inspect* List the objects to make sure the datafiles are working properly.

2. *Network Size* What is the size of the network? You may use `vcount` and `ecount`.

3. *Network features* Are these networks weighted, directed, and bipartite?

4. *Network Attributes* Now, using commands from either `statnet` or `igraph`, list the vertex and edge attributes.

Dyad and Triad Census

Now try a full dyad census. This gives us the number of dyads where the relationship is:

- Reciprocal (mutual), or `mut`

- Asymmetric (non-mutual), or `asym`, and

- Absent, or `null`

Now use `triad.census` in order to do a triad census.

Global and Local Transitivity or Clustering

Compute global transitivity using `transitivity` on `igraph` or `gtrans` on `statnet` and local transitivity of specific nodes of your choice, in addition to the average clustering coefficient. What is the distribution of node degree and how does it compare with the distribution of local transitivity?
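As a rough illustration of how these quantities relate, the sketch below computes degree and the three transitivity variants on a small toy graph (a triangle with one pendant node). The graph is hypothetical and only meant to show the calls side by side.

```r
library(igraph)

# Toy undirected graph: a triangle (A, B, C) with a pendant node D attached to C.
toy <- graph_from_literal(A - B, B - C, A - C, C - D)

deg       <- degree(toy)                          # number of ties per node
local_cc  <- transitivity(toy, type = "local")    # NaN for nodes with fewer than 2 neighbors
global_cc <- transitivity(toy, type = "global")   # closed triples / all connected triples
avg_cc    <- transitivity(toy, type = "average")  # mean of the local coefficients

# Side-by-side view of degree and local transitivity for each node.
data.frame(node = V(toy)$name, degree = deg, local_transitivity = local_cc)
```

In this toy case the highest-degree node (C) has the lowest defined local transitivity, because only one of its three neighbor pairs is connected.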

Path Length and Component Structure

Can you compute the average path length and the _diameter_ of the network? Can you find the component structure of the network and identify the cluster membership of each node?

Code
#loading required libraries
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.2     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Code
library(dplyr)
library(igraph)

Attaching package: 'igraph'

The following objects are masked from 'package:lubridate':

    %--%, union

The following objects are masked from 'package:dplyr':

    as_data_frame, groups, union

The following objects are masked from 'package:purrr':

    compose, simplify

The following object is masked from 'package:tidyr':

    crossing

The following object is masked from 'package:tibble':

    as_data_frame

The following objects are masked from 'package:stats':

    decompose, spectrum

The following object is masked from 'package:base':

    union
Code
#Loading dataset
library(readr)
got_distances <- read_csv("_data/got/got_distances.csv")
Rows: 200 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): Region From, From, To, Mode, Notes
dbl (1): Miles

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Code
head(got_distances)
# A tibble: 6 × 6
  `Region From` From          To               Miles Mode  Notes   
  <chr>         <chr>         <chr>            <dbl> <chr> <chr>   
1 Westerlands   Casterly Rock the Golden Tooth   240 land  <NA>    
2 Westerlands   Casterly Rock Lannisport          40 land  <NA>    
3 Westerlands   Casterly Rock Kayce              100 land  <NA>    
4 Westerlands   Casterly Rock Kayce               12 water <NA>    
5 Westerlands   Casterly Rock Deep Den           240 land  Goldroad
6 Westerlands   Deep Den      King’s Landing     590 land  Goldroad
Code
ls(got_distances)
[1] "From"        "Miles"       "Mode"        "Notes"       "Region From"
[6] "To"         
Code
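#Network size can be determined by vcount and ecount
#Network features
#Note: graph_from_data_frame() reads edge endpoints from the first two columns,
#here `Region From` and `From`; the remaining columns become edge attributes
net <- graph_from_data_frame(got_distances, directed = TRUE)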
vcount(net)
[1] 103
Code
ecount(net)
[1] 200
Code
is_bipartite(net)
[1] FALSE
Code
is_weighted(net) 
[1] FALSE
Code
is_directed(net) 
[1] TRUE

There are 103 vertices and 200 edges. The network is directed, not bipartite, and unweighted.

Code
# Network attributes
igraph::vertex_attr_names(net)
[1] "name"
Code
igraph::edge_attr_names(net)
[1] "To"    "Miles" "Mode"  "Notes"
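The presence of `To` among the edge attributes suggests that the edges here run from `Region From` to `From`, since `graph_from_data_frame` always takes its endpoints from the first two columns. If the intended network is place-to-place, one possible approach is sketched below on the six rows shown by `head(got_distances)`, re-typed by hand. The object names `edges` and `place_net` are hypothetical, and using `Miles` as the weight is an assumption about what the analysis needs.

```r
library(igraph)

# The six rows printed by head(got_distances), re-typed by hand for illustration.
# From and To are placed first so that they are read as the edge endpoints.
edges <- data.frame(
  From  = c("Casterly Rock", "Casterly Rock", "Casterly Rock",
            "Casterly Rock", "Casterly Rock", "Deep Den"),
  To    = c("the Golden Tooth", "Lannisport", "Kayce",
            "Kayce", "Deep Den", "King’s Landing"),
  Miles = c(240, 40, 100, 12, 240, 590),
  Mode  = c("land", "land", "land", "water", "land", "land")
)

# Columns 1 and 2 are the endpoints; the remaining columns become edge attributes.
place_net <- graph_from_data_frame(edges, directed = TRUE)

# igraph treats a graph as weighted when an edge attribute named "weight" exists.
E(place_net)$weight <- E(place_net)$Miles

edge_attr_names(place_net)
is_weighted(place_net)
any_multiple(place_net)  # TRUE: Casterly Rock to Kayce appears by land and by water
```

On the full data, the same reordering could be done with `dplyr::select(got_distances, From, To, everything())` before building the graph. Results from this place-to-place network would differ from the statistics reported in this post.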
Code
#Dyad census
igraph::dyad.census(net)
$mut
[1] 0

$asym
[1] 93

$null
[1] 5160

Reciprocal (mutual) dyads: 0

Asymmetric dyads: 93

Null dyads: 5160

Code
#Triad Census
igraph::triad.census(net)
 [1] 167960   3917   4472    502      0      0      0      0      0      0
[11]      0      0      0      0      0      0
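The sixteen counts are easier to read with their triad-type labels attached. The sketch below copies the counts from the output above and names them in the order igraph reports triad types; the object names are hypothetical.

```r
# Triad type labels (MAN notation) in the order igraph reports them.
triad_types <- c("003", "012", "102", "021D", "021U", "021C", "111D", "111U",
                 "030T", "030C", "201", "120D", "120U", "120C", "210", "300")

# Counts copied by hand from the triad census output above.
counts <- c(167960, 3917, 4472, 502, rep(0, 12))
names(counts) <- triad_types

# Show only the triad types that actually occur.
counts[counts > 0]

# Sanity check: the counts should add up to the number of possible triads, choose(n, 3).
sum(counts) == choose(103, 3)
```

One thing that may be worth checking: type 102 contains a mutual tie by definition, yet the dyad census reports zero mutual dyads. Repeated edges between the same pair of nodes could be the cause, so rerunning the census on `simplify(net)` is a reasonable check.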
Code
# Global transitivity and clustering
transitivity(net)
[1] 0
Code
transitivity(net, type = 'global')
[1] 0
Code
transitivity(net, type = 'average')
[1] 0
Code
# Pathlength and component structure
average.path.length(net, directed = TRUE)
[1] 1
Code
igraph::components(net)$no
[1] 10
Code
igraph::components(net)$csize
 [1]  9 12  7 13  6 10 12 13  1 20
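The challenge also asks for the diameter and for the component membership of each node. A minimal sketch on a toy directed graph shows the relevant calls; the same functions could be applied to `net`. The toy graph and object names are hypothetical.

```r
library(igraph)

# Toy directed graph with two weakly connected components: A -> B -> C and D -> E.
toy <- graph_from_literal(A -+ B, B -+ C, D -+ E)

avg_len <- mean_distance(toy, directed = TRUE)  # mean over reachable ordered pairs
diam    <- diameter(toy, directed = TRUE)       # longest finite shortest path

# Weak components ignore edge direction when deciding who is connected to whom.
comp <- components(toy, mode = "weak")
comp$no          # number of components
comp$csize       # size of each component
comp$membership  # named vector: component id for every node
```

For `net`, `components(net)$membership` would return the component of each of the 103 vertices, complementing the component count and sizes reported above.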