DACSS 603 Final Project Pt 1

Author

Karen Kimble

Published

October 7, 2022

# Setup

library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.3.6      ✔ purrr   0.3.5 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.2.1      ✔ stringr 1.4.1 
✔ readr   2.1.3      ✔ forcats 0.5.2 
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
library(readr)
library(scales)

Attaching package: 'scales'

The following object is masked from 'package:purrr':

    discard

The following object is masked from 'package:readr':

    col_factor
# Importing datasets

NYC_2019 <- read_csv(
  "_data/2018-2019_School_Demographic_Snapshot.csv",
  col_types = cols(
    `Grade PK (Half Day & Full Day)` = col_skip(),
    `# Multiple Race Categories Not Represented` = col_skip(),
    `% Multiple Race Categories Not Represented` = col_skip()
  )
)

NYC_2019$`% Poverty` <- percent(NYC_2019$`% Poverty`, accuracy=0.1)

NYC_2021 <- read_csv(
  "_data/2020-2021_Demographic_Snapshot_School.csv",
  col_types = cols(
    `Grade 3K+PK (Half Day & Full Day)` = col_skip(),
    `# Multi-Racial` = col_skip(),
    `% Multi-Racial` = col_skip(),
    `# Native American` = col_skip(),
    `% Native American` = col_skip(),
    `# Missing Race/Ethnicity Data` = col_skip(),
    `% Missing Race/Ethnicity Data` = col_skip()
  )
)

# In order to bind the data, I had to remove columns that were not present in the other spreadsheet: Grade PK or 3K, Native American, the different multi-racial categories, and Missing Data

school_data <- rbind(NYC_2019, NYC_2021)

# Treating values coded as "Above 95%" as 95% and values coded as "Below 5%" as 5% for the purposes of this analysis

school_data$`% Poverty` <- recode(school_data$`% Poverty`, "Above 95%" = "95%", "Below 5%" = "5%")

# Re-coding variables as numeric

school_data$`% Poverty` <- gsub("%", "", school_data$`% Poverty`, fixed = TRUE)

school_data$`% Poverty` <- as.numeric(school_data$`% Poverty`)

school_data$`Economic Need Index` <- as.numeric(school_data$`Economic Need Index`)
Warning: NAs introduced by coercion
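
The coercion warning is worth inspecting rather than accepting, since it means some Economic Need Index entries were not numeric. The following is a minimal sketch of one way to list the offending values before converting. The toy vector `raw` and its contents are invented for illustration and are not taken from the dataset.

```r
# Toy stand-in for a character column that mixes numbers and text
# (these values are invented, not taken from the NYC files)
raw <- c("0.93", "0.889", "Above 95%", "No Data", NA)

# Convert while silencing the warning, so failures can be inspected explicitly
num <- suppressWarnings(as.numeric(raw))

# Entries that were present in the raw data but could not be parsed as numbers
failed <- raw[is.na(num) & !is.na(raw)]

# Tabulate the distinct non-numeric codes
print(table(failed))
```

Applied to the Economic Need Index column before the as.numeric() step, the same pattern would show whether the missing values come from a text code or from something unexpected.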

Research Question

My research question is whether child poverty increased in schools with predominantly non-White student populations between the 2014-2015 and 2020-2021 school years. This question matters because of the pandemic’s impact not only on children’s learning but also on families’ economic resources. According to the Columbia University Center on Poverty and Social Policy, “nearly a quarter of children ages 0-3 live in poverty and nearly half of the city’s young children live in lower-opportunity neighborhoods where the poverty rate is at least 20 percent” (“Poverty”). Research also shows that poverty falls disproportionately along lines of race and ethnicity. In New York State, as of 2021, child poverty among children of color is almost 30%, with Black or African American children more than twice as likely to live in poverty as White, Non-Hispanic children (“New York State”, 2021). Given this disproportionate level of economic need among children of color, it is important to investigate whether poverty levels in predominantly non-White New York City schools have increased significantly more than in predominantly White schools. In searching the UMass Libraries databases and other sources, I found few studies that used this data in this way. Understanding whether poverty is increasing within an already vulnerable group is therefore worthwhile.

Hypothesis

I hypothesize that the poverty rate in NYC schools whose students are predominantly children of color increased more between the 2014-2015 and 2020-2021 school years than the poverty rate in predominantly White schools. Because I have not found many previous studies on this topic, it is hard to know whether this hypothesis has been tested before. However, the data is fairly recent and bears on the pandemic’s economic effects, so I think testing this hypothesis is still a meaningful contribution.
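
One way to make this hypothesis testable is to label each school by its racial composition and compare the change in poverty rate between the first and last years. The sketch below uses invented data and assumes a 50% cutoff on `% White` in 2014-15 to define "predominantly White". That cutoff and the names `toy`, `change`, and `by_group` are illustrative choices, not part of the project so far.

```r
library(dplyr)

# Invented data that mimics the relevant columns of school_data
# (`% White` is a proportion from 0 to 1, `% Poverty` is on a 0-100 scale)
toy <- data.frame(
  DBN = c("A", "A", "B", "B"),
  Year = c("2014-15", "2020-21", "2014-15", "2020-21"),
  `% White` = c(0.10, 0.12, 0.70, 0.68),
  `% Poverty` = c(80, 88, 30, 31),
  check.names = FALSE
)

# One row per school: its group and its change in poverty rate
change <- toy %>%
  group_by(DBN) %>%
  summarise(
    # Assumption: under 50% White in 2014-15 means "predominantly non-White"
    group = if_else(`% White`[Year == "2014-15"] < 0.5,
                    "Predominantly non-White", "Predominantly White"),
    poverty_change = `% Poverty`[Year == "2020-21"] - `% Poverty`[Year == "2014-15"]
  )

# Average change in poverty rate within each group
by_group <- change %>%
  group_by(group) %>%
  summarise(mean_change = mean(poverty_change))

print(by_group)
```

On the real data, schools that are missing either year would need to be filtered out first, and a formal comparison of the two groups would still be required before drawing conclusions.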

Descriptive Statistics

A description and summary of your data. How was your data collected by its original collectors? What are the important variables of interest for your research question? Use functions like glimpse() and summary() to present your data.

The data was collected by New York City and published on its Open Data portal. It covers NYC schools for the academic years 2014-2015 through 2020-2021. The important variables of interest are:

  • Academic year

  • Number and percentage of Asian, Black, Hispanic, and White students

  • Number and percentage of students in poverty

  • Economic need index, which is the average of students’ “Economic Need Values”

    • The Economic Need Index (ENI) estimates the percentage of students facing economic hardship

The other variables included are: DBN (district, borough, school number), school name, total enrollment, enrollment numbers for K-12, number and percentage of female and male students, number and percentage of students with disabilities, and number and percentage of English-Language Learner (ELL) students.
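
Because the two source files are multi-year snapshots, they may report some of the same school years, in which case `rbind()` would leave duplicate school-year rows in `school_data`. Whether that happens here is an assumption to verify, not something confirmed by the import step. A minimal sketch of the check on invented data, with `snap_a` and `snap_b` as hypothetical stand-ins for the two files:

```r
library(dplyr)

# Invented stand-ins for two snapshots that overlap in one school year
snap_a <- data.frame(
  DBN = c("01M015", "01M015"),
  Year = c("2016-17", "2017-18"),
  poverty = c(85.4, 84.7)
)
snap_b <- data.frame(
  DBN = c("01M015", "01M015"),
  Year = c("2017-18", "2018-19"),
  poverty = c(84.7, 83.3)
)

combined <- bind_rows(snap_a, snap_b)

# School-year pairs that appear more than once after binding
dupes <- combined %>%
  count(DBN, Year) %>%
  filter(n > 1)

# Keep the first row for each school-year pair
deduped <- combined %>%
  distinct(DBN, Year, .keep_all = TRUE)

print(dupes)
print(nrow(deduped))
```

If `dupes` turned out to be non-empty on the real data, the summary statistics would weight the overlapping years more heavily than the others, so de-duplicating before summarising would be the safer choice.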

glimpse(school_data)
Rows: 18,142
Columns: 36
$ DBN                            <chr> "01M015", "01M015", "01M015", "01M015",…
$ `School Name`                  <chr> "P.S. 015 Roberto Clemente", "P.S. 015 …
$ Year                           <chr> "2014-15", "2015-16", "2016-17", "2017-…
$ `Total Enrollment`             <dbl> 183, 176, 178, 190, 174, 270, 270, 271,…
$ `Grade K`                      <dbl> 27, 32, 28, 28, 20, 44, 47, 37, 34, 30,…
$ `Grade 1`                      <dbl> 47, 33, 33, 32, 33, 40, 43, 46, 38, 39,…
$ `Grade 2`                      <dbl> 31, 39, 27, 33, 30, 39, 41, 47, 42, 43,…
$ `Grade 3`                      <dbl> 19, 23, 31, 23, 30, 35, 43, 40, 46, 41,…
$ `Grade 4`                      <dbl> 17, 17, 24, 31, 20, 40, 35, 43, 42, 44,…
$ `Grade 5`                      <dbl> 24, 18, 18, 26, 28, 42, 40, 34, 42, 42,…
$ `Grade 6`                      <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ `Grade 7`                      <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ `Grade 8`                      <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ `Grade 9`                      <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ `Grade 10`                     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ `Grade 11`                     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ `Grade 12`                     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ `# Female`                     <dbl> 84, 83, 83, 99, 85, 132, 125, 127, 114,…
$ `% Female`                     <dbl> 0.459, 0.472, 0.466, 0.521, 0.489, 0.48…
$ `# Male`                       <dbl> 99, 93, 95, 91, 89, 138, 145, 144, 143,…
$ `% Male`                       <dbl> 0.541, 0.528, 0.534, 0.479, 0.511, 0.51…
$ `# Asian`                      <dbl> 8, 9, 14, 20, 24, 30, 27, 24, 23, 14, 2…
$ `% Asian`                      <dbl> 0.044, 0.051, 0.079, 0.105, 0.138, 0.11…
$ `# Black`                      <dbl> 65, 57, 51, 52, 48, 47, 55, 51, 49, 52,…
$ `% Black`                      <dbl> 0.355, 0.324, 0.287, 0.274, 0.276, 0.17…
$ `# Hispanic`                   <dbl> 107, 105, 105, 110, 95, 158, 169, 180, …
$ `% Hispanic`                   <dbl> 0.585, 0.597, 0.590, 0.579, 0.546, 0.58…
$ `# White`                      <dbl> 2, 2, 4, 6, 6, 27, 16, 15, 16, 18, 25, …
$ `% White`                      <dbl> 0.011, 0.011, 0.022, 0.032, 0.034, 0.10…
$ `# Students with Disabilities` <dbl> 64, 60, 51, 49, 38, 82, 82, 88, 90, 92,…
$ `% Students with Disabilities` <dbl> 0.350, 0.341, 0.287, 0.258, 0.218, 0.30…
$ `# English Language Learners`  <dbl> 17, 16, 12, 8, 8, 18, 13, 9, 8, 8, 120,…
$ `% English Language Learners`  <dbl> 0.093, 0.091, 0.067, 0.042, 0.046, 0.06…
$ `# Poverty`                    <chr> "169", "149", "152", "161", "145", "200…
$ `% Poverty`                    <dbl> 92.3, 84.7, 85.4, 84.7, 83.3, 74.1, 80.…
$ `Economic Need Index`          <dbl> 0.930, 0.889, 0.882, 0.890, 0.880, 0.60…
summary(school_data)
     DBN            School Name            Year           Total Enrollment
 Length:18142       Length:18142       Length:18142       Min.   :   7.0  
 Class :character   Class :character   Class :character   1st Qu.: 323.0  
 Mode  :character   Mode  :character   Mode  :character   Median : 477.0  
                                                          Mean   : 592.3  
                                                          3rd Qu.: 695.0  
                                                          Max.   :6040.0  
                                                                          
    Grade K          Grade 1          Grade 2          Grade 3      
 Min.   :  0.00   Min.   :  0.00   Min.   :  0.00   Min.   :  0.00  
 1st Qu.:  0.00   1st Qu.:  0.00   1st Qu.:  0.00   1st Qu.:  0.00  
 Median : 32.00   Median : 33.00   Median : 32.00   Median : 28.00  
 Mean   : 44.25   Mean   : 45.79   Mean   : 45.73   Mean   : 45.33  
 3rd Qu.: 78.00   3rd Qu.: 81.00   3rd Qu.: 82.00   3rd Qu.: 81.00  
 Max.   :393.00   Max.   :383.00   Max.   :349.00   Max.   :369.00  
                                                                    
    Grade 4         Grade 5          Grade 6          Grade 7      
 Min.   :  0.0   Min.   :  0.00   Min.   :  0.00   Min.   :  0.00  
 1st Qu.:  0.0   1st Qu.:  0.00   1st Qu.:  0.00   1st Qu.:  0.00  
 Median : 22.0   Median : 19.00   Median :  0.00   Median :  0.00  
 Mean   : 44.8   Mean   : 44.18   Mean   : 43.15   Mean   : 42.37  
 3rd Qu.: 80.0   3rd Qu.: 80.00   3rd Qu.: 64.00   3rd Qu.: 62.00  
 Max.   :376.0   Max.   :351.00   Max.   :771.00   Max.   :796.00  
                                                                   
    Grade 8          Grade 9           Grade 10         Grade 11      
 Min.   :  0.00   Min.   :   0.00   Min.   :   0.0   Min.   :   0.00  
 1st Qu.:  0.00   1st Qu.:   0.00   1st Qu.:   0.0   1st Qu.:   0.00  
 Median :  0.00   Median :   0.00   Median :   0.0   Median :   0.00  
 Mean   : 41.88   Mean   :  49.34   Mean   :  48.7   Mean   :  39.85  
 3rd Qu.: 60.00   3rd Qu.:  68.00   3rd Qu.:  69.0   3rd Qu.:  54.00  
 Max.   :784.00   Max.   :1555.00   Max.   :3832.0   Max.   :1529.00  
                                                                      
    Grade 12          # Female         % Female          # Male      
 Min.   :   0.00   Min.   :   0.0   Min.   :0.0000   Min.   :   0.0  
 1st Qu.:   0.00   1st Qu.: 146.0   1st Qu.:0.4620   1st Qu.: 163.0  
 Median :   0.00   Median : 232.0   Median :0.4880   Median : 248.0  
 Mean   :  39.58   Mean   : 287.4   Mean   :0.4827   Mean   : 304.9  
 3rd Qu.:  53.00   3rd Qu.: 347.0   3rd Qu.:0.5130   3rd Qu.: 364.0  
 Max.   :1566.00   Max.   :2405.0   Max.   :1.0000   Max.   :3635.0  
                                                                     
     % Male          # Asian           % Asian          # Black      
 Min.   :0.0000   Min.   :   0.00   Min.   :0.0000   Min.   :   0.0  
 1st Qu.:0.4870   1st Qu.:   5.00   1st Qu.:0.0130   1st Qu.:  42.0  
 Median :0.5120   Median :  17.00   Median :0.0400   Median : 105.0  
 Mean   :0.5173   Mean   :  95.38   Mean   :0.1136   Mean   : 154.1  
 3rd Qu.:0.5380   3rd Qu.:  79.00   3rd Qu.:0.1400   3rd Qu.: 198.0  
 Max.   :1.0000   Max.   :3671.00   Max.   :0.9470   Max.   :1493.0  
                                                                     
    % Black        # Hispanic     % Hispanic        # White       
 Min.   :0.000   Min.   :   1   Min.   :0.0060   Min.   :   0.00  
 1st Qu.:0.083   1st Qu.:  89   1st Qu.:0.1980   1st Qu.:   6.00  
 Median :0.251   Median : 180   Median :0.3990   Median :  15.00  
 Mean   :0.316   Mean   : 241   Mean   :0.4251   Mean   :  87.24  
 3rd Qu.:0.502   3rd Qu.: 313   3rd Qu.:0.6323   3rd Qu.:  78.00  
 Max.   :0.987   Max.   :2056   Max.   :1.0000   Max.   :3190.00  
                                                                  
    % White       # Students with Disabilities % Students with Disabilities
 Min.   :0.0000   Min.   :  0.0                Min.   :0.0000              
 1st Qu.:0.0140   1st Qu.: 66.0                1st Qu.:0.1570              
 Median :0.0330   Median : 98.0                Median :0.2030              
 Mean   :0.1205   Mean   :121.6                Mean   :0.2295              
 3rd Qu.:0.1440   3rd Qu.:146.0                3rd Qu.:0.2540              
 Max.   :0.9450   Max.   :925.0                Max.   :1.0000              
                                                                           
 # English Language Learners % English Language Learners  # Poverty        
 Min.   :   0.0              Min.   :0.0000              Length:18142      
 1st Qu.:  18.0              1st Qu.:0.0430              Class :character  
 Median :  43.0              Median :0.0950              Mode  :character  
 Mean   :  81.1              Mean   :0.1363                                
 3rd Qu.: 100.0              3rd Qu.:0.1800                                
 Max.   :1219.0              Max.   :1.0000                                
                                                                           
   % Poverty      Economic Need Index
 Min.   :  2.90   Min.   :0.030      
 1st Qu.: 69.30   1st Qu.:0.579      
 Median : 81.40   Median :0.743      
 Mean   : 75.89   Mean   :0.691      
 3rd Qu.: 89.90   3rd Qu.:0.846      
 Max.   :100.00   Max.   :0.998      
                  NA's   :9169       
# Note: the per-grade enrollment summaries are somewhat misleading (especially the minimums) because the data has no variable for type of school (e.g., middle versus high school). For example, an elementary school has an enrollment of 0 for grade 12, which shows up as the minimum.

As this summary shows, the median poverty percentage across NYC school records (81.4%) is higher than the mean (75.89%), which suggests a left-skewed distribution in which some schools have very low poverty percentages. The same holds for the Economic Need Index, whose mean (0.691) is lower than its median (0.743). It is troubling, however, that both the mean and the median school-level poverty percentages exceed three-fourths of students.
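
The mean-versus-median reasoning can be checked with a quick calculation. The sketch below uses an invented vector `x` containing one very low value and computes Pearson's second skewness coefficient, 3 * (mean - median) / sd, as one rough indicator. A negative value is consistent with a left-skewed distribution. The numbers are illustrative only and are not drawn from `school_data`.

```r
# Invented poverty percentages with one very low value
x <- c(5, 60, 75, 80, 85, 90, 95)

x_mean <- mean(x)      # 70
x_median <- median(x)  # 80

# Pearson's second skewness coefficient: negative when the mean sits below the median
skew <- 3 * (x_mean - x_median) / sd(x)

print(c(mean = x_mean, median = x_median, skew = skew))
```

On the project data, the same calculation on the poverty percentage column, or a histogram of it, would show whether the gap between mean and median reflects a few outliers or a long lower tail.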

References

New York State Child Poverty Facts. Schuyler Center for Analysis and Advocacy. (2021, February 18). Retrieved from https://scaany.org/wp-content/uploads/2021/02/NYS-Child-Poverty-Facts_Feb2021.pdf

Poverty in New York City. Columbia University Center on Poverty and Social Policy. (n.d.). Retrieved from https://www.povertycenter.columbia.edu/poverty-in-new-york-city#:~:text=Children%20and%20Families%20in%20New%20York%20City&text=Through%20surveys%2C%20we%20find%20that,is%20at%20least%2020%20percent.