KMuhammad_HW5 Pt.2

DACSS 601 - HW5: Update and Refinement

Kalimah Muhammad
2022-05-04

Update Part Two

Research Background and Questions
During the COVID-19 pandemic, many businesses, institutions, and organizations across the globe reduced occupancy or closed their doors in an attempt to control the spread of the novel coronavirus. Likewise, schools around the world had to decide whether to continue in-person learning or adopt distance learning practices. This project reviews the status of school closures by country during the 2020-2021 COVID-19 pandemic and the characteristics that distinguish countries that closed schools from those that did not.

My research questions include:
Q1. How did the practice of school closures and re-openings unfold over the pandemic years of 2020 - 2021?
Q2. What characteristics, if any, by geography, country income level, student to teacher ratio, or access to distance learning modalities could be predictors of adopting similar measures in future events?

The data for this project come from the UNESCO Institute for Statistics data on COVID-19 Education Response (sourced below). The data set contains the daily school closure status for 210 countries/territories from 2/16/2020 to 3/31/2022.

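#Tidy Data
#assumes the `unesco` data frame has already been read in
library(stringr) #str_split_fixed()
#split the "type: value" strings in Region 2 and Region 3 into two columns each
unesco[c('Reg2 Type', 'Regional Name')]<- str_split_fixed(unesco$`Region 2`, ': ', 2)
unesco[c('Reg3 Type', 'Income Level')]<- str_split_fixed(unesco$`Region 3`, ': ', 2)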
#remove unnecessary columns and rearrange columns
unesco_fin<-unesco[c('Date','Country ID', 'Country', 'Regional Name', 'Income Level', 
                     'Status', 'Enrolment (Pre-Primary to Upper Secondary)',
                     'Teachers (Pre-Primary to Upper Secondary)', 
                     'School Age Population (Pre-Primary to Upper Secondary)',
                     'Distance learning modalities (TV)', 
                     'Distance learning modalities (Radio)',
                     'Distance learning modalities (Global)',
                     'Distance learning modalities (Online)', 'Weeks fully closed', 
                     'Weeks partially open')]
#add ratio of enrolled students to teachers
unesco_fin$Enrol_Teacher_Ratio <- unesco_fin$`Enrolment (Pre-Primary to Upper Secondary)`/ unesco_fin$`Teachers (Pre-Primary to Upper Secondary)`
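
The tidy step above splits each "type: value" string once at the first ": " and then divides enrolment by teachers. Here is a minimal sketch of the same two operations in base R, so it runs without stringr. The rows are invented for illustration (the labels and counts are not taken from the UNESCO file), and the lower-case column names are hypothetical.

```r
# Toy rows only: the labels and counts below are invented for illustration
# and are not taken from the UNESCO file.
toy <- data.frame(
  region3   = c("Income group: High income", "Income group: Low income"),
  enrolment = c(1200, 900),
  teachers  = c(80, 30),
  stringsAsFactors = FALSE
)

# Split each "<type>: <value>" string once, at the first ": ".
# regexpr() returns the position of the first match in each string.
pos <- regexpr(": ", toy$region3, fixed = TRUE)
toy$reg3_type    <- substr(toy$region3, 1, pos - 1)
toy$income_level <- substr(toy$region3, pos + 2, nchar(toy$region3))

# Enrolled students per teacher, mirroring Enrol_Teacher_Ratio.
toy$enrol_teacher_ratio <- toy$enrolment / toy$teachers

print(toy[, c("reg3_type", "income_level", "enrol_teacher_ratio")])
```

Splitting only at the first separator keeps any later ": " inside the value intact, which is also what `str_split_fixed(..., 2)` does.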

Characteristics of Closed Locations

Q2: What characteristics, if any, by geographic location, country income level, student-to-teacher ratio, or access to distance learning modalities were present in countries with high levels of school closure? In the last post, we saw that geography played a role in the type and length of school closures. But is that relationship affected by country income level, student-to-teacher ratio, or access to distance learning technology? This section attempts to answer that question.

#Sankey Network Plot
library(readxl)    #read_xlsx()
library(networkD3) #sankeyNetwork() and JS()
#read in sankey file
links2<-read_xlsx("sankey.xlsx", "links")
nodes2<-read_xlsx("sankey.xlsx", "nodes")

#create sankey diagram
sankeyNetwork(Links= links2, Nodes= nodes2, Source= "Source", Target= "Target", Value="Value", NodeID= "name",
  LinkGroup = NULL, units = "",
  colourScale = JS("d3.scaleOrdinal(d3.schemeCategory20);"), fontSize = 10,
  fontFamily = NULL, nodeWidth = 15, nodePadding = 10, margin = NULL,
  height = 600, width = 900, iterations = 32, sinksRight = TRUE)

Observation: The diagram groups countries by level of full school closure: high, 20 weeks or more (82); average, 16 to under 20 weeks (31); low, under 16 weeks (85); and none, never fully closed (12). Low and high closure levels are nearly tied (85 versus 82 countries). Among the 82 countries with a high level of closure, the following characteristics are present: all three modes of distance learning (TV + Online + Radio) for 34 of 55 respondents, and Online and TV modes for 27 of 75 respondents.
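
The contents of `sankey.xlsx` are not shown in this post. As a rough illustration of the input that `sankeyNetwork()` expects (a nodes table with a name column, and a links table whose source and target are zero-based row indices into the nodes table), the sketch below builds both from the closure level by income level counts reported later in this post. The column names match the arguments used in the call above; whether the original workbook was built this way is an assumption.

```r
# Counts of countries by closure level (rows) and income level (columns),
# copied from the closure-level by income-level cross-tabulation in this post.
tab <- matrix(
  c( 2,  1,  3,  5,
    10, 21, 12, 38,
     6,  7,  8, 10,
    11, 21, 31, 18),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    closure = c("None", "Low", "Average", "High"),
    income  = c("Low income", "Lower middle income",
                "Upper middle income", "High income")
  )
)

# One node per income level, followed by one node per closure level.
nodes <- data.frame(name = c(colnames(tab), rownames(tab)),
                    stringsAsFactors = FALSE)

# One link per table cell. networkD3 expects zero-based node indices,
# hence the "- 1" after match().
links <- expand.grid(closure = rownames(tab), income = colnames(tab),
                     stringsAsFactors = FALSE)
links$Source <- match(links$income, nodes$name) - 1
links$Target <- match(links$closure, nodes$name) - 1
links$Value  <- tab[cbind(links$closure, links$income)]
links <- links[links$Value > 0, c("Source", "Target", "Value")]

head(links)
```

These two data frames could then be passed as `Links = links, Nodes = nodes` with the same `Source`, `Target`, `Value`, and `NodeID = "name"` arguments used above.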

Method of Characterizing Data

Average number of weeks fully closed

#Descriptive stats
#average number of weeks fully closed and partially open, by region
unesco_short %>%
  group_by(`Regional Name`) %>%
  summarise(across(starts_with("Weeks"), ~ mean(.x, na.rm = TRUE)))
# A tibble: 7 x 3
  `Regional Name`                  `Weeks fully clo~` `Weeks partial~`
  <chr>                                         <dbl>            <dbl>
1 Africa (Sub-Saharan)                          18.1             13.3 
2 Asia (Central and Southern)                   24.4             27.8 
3 Asia (Eastern and South-eastern)              24.4             30.6 
4 Latin America and the Caribbean               29.6             32.3 
5 Northern America and Europe                   12.4             18   
6 Oceania                                        7.12             6.24
7 Western Asia and Northern Africa              24.6             22.0 

#descriptive stats of weeks fully closed
unesco_fin%>%
  summarise(
    mean.closed= mean(`Weeks fully closed`, na.rm=TRUE),
    median.closed= median(`Weeks fully closed`, na.rm=TRUE),
    IQR.closed= IQR(`Weeks fully closed`, na.rm=TRUE),
    sd.closed= sd(`Weeks fully closed`, na.rm=TRUE),
    var.closed= var(`Weeks fully closed`, na.rm=TRUE))
# A tibble: 1 x 5
  mean.closed median.closed IQR.closed sd.closed var.closed
        <dbl>         <dbl>      <dbl>     <dbl>      <dbl>
1        19.7            16         17      14.6       214.

Next, I compared each country's weeks fully closed to the mean (19.7) and median (16) to assign a closure level:
* High levels are greater than or equal to 20 weeks (higher than mean value)
* Average levels are from 16 to under 20 weeks (between median and mean value)
* Low levels are more than 0 but less than 16 weeks (less than median value)
* None means 0 weeks (schools never fully closed)

Count of School Closures by Level

#copy weeks fully closed into a new column to recode
unesco_short2<-unesco_short%>%
  mutate(`Closure_Level`=`Weeks fully closed`)

#recode weeks fully closed into a closure level
#case_when() uses the first matching condition, so "None" only catches values of 0 or less
unesco_short2<-unesco_short2%>%
  mutate(Closure_Level = case_when(
    Closure_Level >= 20 ~ "High",
    Closure_Level >= 16 & Closure_Level < 20 ~ "Average",
    Closure_Level < 16 & Closure_Level > 0~ "Low",
    Closure_Level < 1 ~ "None"))

#Check work
table(select(unesco_short2, Closure_Level))

Average    High     Low    None 
     31      82      85      12 

#cross-tabulation of region by closure level
xtabs(~`Regional Name` + `Closure_Level`, unesco_short2)
                                  Closure_Level
Regional Name                      Average High Low None
  Africa (Sub-Saharan)                   7   17  23    1
  Asia (Central and Southern)            0    8   4    2
  Asia (Eastern and South-eastern)       1    8   7    0
  Latin America and the Caribbean        4   29   7    1
  Northern America and Europe           10    9  26    5
  Oceania                                0    1  13    3
  Western Asia and Northern Africa       9   10   5    0

Observation: As mentioned previously, Oceania and Northern America and Europe experienced mostly low levels of school closure. Some regions, such as Asia (Central and Southern) and Asia (Eastern and South-eastern), are split between high and low numbers of weeks closed. Others, such as Latin America and the Caribbean, had primarily long closures. Next, I look at country income level to see whether it may account for the variation within a region.

Country Income Level
The next table shows the distribution of countries by income level, based on the World Bank country income groups. Note that no income data was captured for 6 countries or territories: Anguilla, Cook Islands, Montserrat, Niue, Svalbard, and Tokelau. The cross-tabulation below displays country counts by closure level and income level.

The graph below plots the distribution of school closures by income and region for each country.

#closure level by income level, one point per country, colored by region
unesco_short2 %>%
  ggplot(aes(`Closure_Level`,`Income Level`, color= `Regional Name`))+
  geom_point(position="jitter")+
  #points are mapped to color, so a color scale (not a fill scale) is needed for the palette to apply
  scale_color_brewer(palette = "Blues")+
  labs(title = "Level of School Closure by Income Level and Region")

#cross-tabulation of closure level by country income level
xtabs(~`Closure_Level` + `Income Level`, unesco_short2)
             Income Level
Closure_Level Low income Lower middle income Upper middle income High income
      None             2                   1                   3           5
      Low             10                  21                  12          38
      Average          6                   7                   8          10
      High            11                  21                  31          18

Observation: A majority of high income countries (38 of 71) had low levels of closure, while most upper middle income countries (31 of 54) had high levels. Low and lower middle income countries had comparable numbers of high and low closures. Higher income may be associated with less time closed, but the pattern is not definitive because closure levels in the lower income groups are split fairly evenly between high and low.
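
The "38 of 71" and "31 of 54" figures can be checked by converting the cross-tabulation into within-income-group shares. This is a small sketch that re-enters the counts from the table above by hand, rather than recomputing them from `unesco_short2`.

```r
# Counts copied from the closure-level by income-level cross-tabulation above.
tab <- matrix(
  c( 2,  1,  3,  5,
    10, 21, 12, 38,
     6,  7,  8, 10,
    11, 21, 31, 18),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    closure = c("None", "Low", "Average", "High"),
    income  = c("Low income", "Lower middle income",
                "Upper middle income", "High income")
  )
)

# Number of countries in each income group.
income_totals <- colSums(tab)

# Share of each closure level within an income group (each column sums to 1).
shares <- prop.table(tab, margin = 2)

income_totals
round(100 * shares, 1)
```

With these counts, low closures make up about 53.5% of high income countries and high closures about 57.4% of upper middle income countries, while the lower middle income group is split evenly at 42% high and 42% low.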

Enrolled Student to Teacher Ratio
Below is a summary of the ratio of enrolled students to teachers for the entire data set. The average is approximately 20 students per teacher across the globe. During the pandemic, social distancing and limited occupancy were adopted to slow the spread of the virus. I therefore examine whether this ratio is related to the amount of time schools were fully closed or partially open. Second, I will review how this ratio may be related to country income level as well as distance learning modality.

#summary of enrollment to teacher ratio
summary(unesco_fin$Enrol_Teacher_Ratio)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  6.437  12.895  16.951  20.043  25.268  62.971    2325 

#student to teacher ratio by closure level
unesco_short2 %>%
  ggplot(aes(`Closure_Level`, `Enrol_Teacher_Ratio`, fill= `Closure_Level`))+
  geom_violin()+
  theme_minimal()+
  labs(title = "School Closure Level by Student to Teacher Ratio")

Observation: The student-to-teacher ratio does not appear to be a strong determinant of school closure level.

Access to Distance Learning Technology

This section of plots summarizes the students’ access to distance learning modalities by country income level and number of weeks fully closed.

Summary of distance learning technology access by country.

The plot shows the country income level distribution by types of distance learning modes. The aim is to uncover if there are trends in income and the type and number of modalities available.

#bar chart of countries by distance learning mods
unesco_short2 %>%
  ggplot(aes(`Distance learning modalities (Global)`, fill= `Income Level`))+
  geom_bar(position = "fill")+
  scale_fill_brewer(palette = "Paired")+
  #position = "fill" stacks each bar to 1, so the y axis shows proportions rather than counts
  labs(y= "Proportion of Countries", title = "Distance Learning Modalities by Country Income Level")+
  guides(x = guide_axis(n.dodge = 2))

#cross-tabulation of learning mods by country income level
xtabs(~`Distance learning modalities (Global)` + `Income Level`, unesco_short)
                                     Income Level
Distance learning modalities (Global) High income Low income Lower middle income Upper middle income
                  None                         14          8                   2                   4
                  Online                       17          0                   2                   4
                  Online + Radio                0          1                   3                   2
                  Online + TV                  33          3                  16                  23
                  Radio                         0          5                   0                   0
                  TV                            1          1                   2                   1
                  TV + Online + Radio           6          9                  20                  20
                  TV + Radio                    0          2                   5                   0

Observation: Low and lower middle income countries rely more on radio and TV, while upper middle income countries gravitate toward the Online + TV and TV + Online + Radio combinations. These two combinations are the most common overall. Interestingly, high income countries account for the most countries using no modality or online only, at 19.7% and 23.9% of high income countries, respectively. Radio appears to be a tool used more in low and lower middle income countries.
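
The percentages in this observation can be reproduced from the cross-tabulation. The sketch below re-enters the counts by hand (in the same column order as the R output) and computes two things: the share of high income countries with no modality or online only, and the share of each income group using any radio-based modality. Treating every label that contains "Radio" as radio-based is my own grouping for illustration.

```r
# Counts copied from the modality by income-level cross-tabulation above.
mods <- matrix(
  c(14, 8,  2,  4,
    17, 0,  2,  4,
     0, 1,  3,  2,
    33, 3, 16, 23,
     0, 5,  0,  0,
     1, 1,  2,  1,
     6, 9, 20, 20,
     0, 2,  5,  0),
  nrow = 8, byrow = TRUE,
  dimnames = list(
    modality = c("None", "Online", "Online + Radio", "Online + TV",
                 "Radio", "TV", "TV + Online + Radio", "TV + Radio"),
    income   = c("High income", "Low income",
                 "Lower middle income", "Upper middle income")
  )
)

# Percentage of each income group using each modality (columns sum to 100).
share <- 100 * prop.table(mods, margin = 2)
round(share[c("None", "Online"), "High income"], 1)

# Percentage of each income group using any modality that includes radio.
uses_radio  <- grepl("Radio", rownames(mods), fixed = TRUE)
radio_share <- 100 * colSums(mods[uses_radio, ]) / colSums(mods)
round(radio_share, 1)
```

With these counts, 14 of 71 (19.7%) high income countries report no modality and 17 of 71 (23.9%) report online only. Radio-based modalities appear in roughly 58.6% of low income and 56% of lower middle income countries, compared with about 8.5% of high income countries.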

Distribution of Distance Learning Access by Weeks Fully Closed

The goal here is to see if there is a relationship between types of access and the amount of weeks a school system is closed.

#Weeks fully closed by distance learning modality
#set a baseline order for the modalities
#(the boxplot below re-orders them by mean weeks fully closed via fct_reorder())
unesco_short2$`Distance learning modalities (Global)` <- factor(
  unesco_short2$`Distance learning modalities (Global)`,
  levels=c("None", "Radio", "Online","TV", "TV + Radio", "Online + TV","Online + Radio", "TV + Online + Radio"))

unesco_short2 %>%
  mutate(`Distance learning modalities (Global)` = fct_reorder(`Distance learning modalities (Global)`, `Weeks fully closed`, .fun='mean')) %>%
  ggplot(aes(`Distance learning modalities (Global)`, `Weeks fully closed`))+
  geom_boxplot()+
  theme_minimal()+
  labs(title = "No. of Weeks Fully Closed by Distance Learning Modalities")+
   guides(x = guide_axis(n.dodge = 2))

Observation: Overall, most of the distribution falls below about 40 weeks regardless of technology, with the average around 20 weeks. Locations with no modality had the fewest weeks fully closed, followed by those with Radio only. By contrast, those with TV + Online + Radio saw longer school closures (especially among the outliers). This could suggest that countries with two or more modes, at least one of them online, experienced longer closures because distance learning technologies were available for teaching and learning remotely.
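
The observation groups countries by how many modes they offer and whether one of them is online. As a sketch of how those two attributes could be derived from the modality labels, the base R code below counts the parts of each label and flags the ones that include "Online". The names `n_modes` and `has_online` are hypothetical and do not appear in the original analysis.

```r
# The eight modality labels used in this post.
modalities <- c("None", "Radio", "Online", "TV", "TV + Radio",
                "Online + TV", "Online + Radio", "TV + Online + Radio")

# Split each label on " + " and count the parts; "None" counts as zero modes.
parts   <- strsplit(modalities, " + ", fixed = TRUE)
n_modes <- ifelse(modalities == "None", 0L, lengths(parts))

# TRUE when "Online" is one of the listed modes.
has_online <- vapply(parts, function(p) "Online" %in% p, logical(1))

data.frame(modality = modalities, n_modes, has_online)
```

A lookup table like this could be joined to the country-level data and used as grouping variables when comparing weeks fully closed, which would make the "two or more modes, one of them online" comparison explicit.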

Total number of weeks fully closed by region and country

The next graphic provides a detailed visual representation of the distribution of weeks closed by country and region. I’ve updated the charts to display in descending value to quickly identify outliers and trends within each region.

#Bar chart of number of weeks closed by country
allcountries_plot<-unesco_short %>%
  mutate(`Country` = fct_reorder(`Country`, `Weeks fully closed`)) %>%
  ggplot(aes(`Weeks fully closed`, Country, fill=`Regional Name`))+
  geom_col()+
  scale_fill_brewer(palette = "Set2")+
  labs(title = "School Closure by Country and Region")

#save ggplot as a tall image (30 in high, 17.5 in wide) to fit one bar per country
aspect_ratio <- 2.5
base_width <- 7
ggsave("country_plot.jpeg", allcountries_plot, device= "jpeg", height = 30 , width = base_width * aspect_ratio)

knitr::include_graphics("country_plot.jpeg")
Bar chart of number of weeks closed by country

Q2 Conclusion: Geography and access to two or more distance learning modalities, particularly when one is online, appear to be factors associated with longer school closures during the COVID-19 pandemic. Yet across the global community, countries were able to limit school closures regardless of country income level or student-to-teacher ratio.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Muhammad (2022, May 4). Data Analytics and Computational Social Science: KMuhammad_HW5 Pt.2. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomkmuhamma898545/

BibTeX citation

@misc{muhammad2022kmuhammad_hw5,
  author = {Muhammad, Kalimah},
  title = {Data Analytics and Computational Social Science: KMuhammad_HW5 Pt.2},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomkmuhamma898545/},
  year = {2022}
}