Exploration of the Variables Measuring the Accountability of WA State Public K-12 Schools
The purpose of this research exploration is to understand the variables that affect attendance rates for PK-12 students. Research shows that attendance rate is a predictor of graduation rate. To ensure that students graduate, we first need to make sure they are coming to school. To do that, we need to understand which populations are attending school and which institutions have higher attendance rates. We will explore the possible effects of attendance rate on student academic performance and graduation rate in the state of Washington.
#In preparation for summary and descriptive data of the variables mentioned above, I will need to rename them to remove the spaces in their names, filter out the text values, and convert some variables from a mix of whole numbers and decimals to just decimals.
#Renaming Variables
library("tidyverse")
WA_Edu_Improvment_2019 <- rename(WA_Edu_Improvment_2019, Attendance_Rate2="RegularAttendance Rate", Four_Year_Grad_Rate="Grad FourYear Rate", School_Type="School Type")
WA_Edu_Improvment_2019 <- rename(WA_Edu_Improvment_2019, District_Name="District Name")
WA_Edu_Improvment_2019 <- rename(WA_Edu_Improvment_2019, Student_Group="Student Group")
WA_Edu_Improvment_2019 <- rename(WA_Edu_Improvment_2019, School_Year="WSIF Year")
WA_Edu_Improvment_2019 <- rename(WA_Edu_Improvment_2019, DistrictID="District Organization Id")
#Filter anything that contains "Suppress" or "N" from the variable
library("stringr")
WA_Edu_Improvment_2019<-WA_Edu_Improvment_2019%>%
filter(!str_detect(Attendance_Rate2,"Suppress|N"))%>%
filter(!str_detect(Four_Year_Grad_Rate,"Suppress|N"))
view(WA_Edu_Improvment_2019)
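To make the filtering step concrete, here is a minimal sketch of how the pattern "Suppress|N" behaves on a handful of made-up values. The vector `raw_rates` and its contents are hypothetical and are not taken from the data set. The point it illustrates is that the bare "N" alternative matches any string containing a capital N, not only suppressed entries.

```r
library(stringr)

# Hypothetical raw values, similar in shape to the text-typed rate columns
raw_rates <- c("0.85", "92.5", "Suppressed: N<10", "NULL", "N/A", "0.40")

# TRUE where the value contains "Suppress" or a capital "N"
flagged <- str_detect(raw_rates, "Suppress|N")

# Values that would survive filter(!str_detect(...))
kept <- raw_rates[!flagged]
print(kept)
```

Note that `filter()` also drops rows where the condition evaluates to `NA`, so rows with a missing rate are removed by the same step.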
#Convert Attendance Rate and Graduation Rate from a mix of whole numbers and decimals to just decimals
library("dplyr")
WA_Edu_Improvment_2019<-WA_Edu_Improvment_2019%>%
mutate(Attendance_Rate2 = parse_number(Attendance_Rate2),
Attendance_Rate2 = ifelse(Attendance_Rate2>1,Attendance_Rate2/100,Attendance_Rate2))
WA_Edu_Improvment_2019<-WA_Edu_Improvment_2019%>%
mutate(Four_Year_Grad_Rate = parse_number(Four_Year_Grad_Rate),
Four_Year_Grad_Rate = ifelse(Four_Year_Grad_Rate>1,Four_Year_Grad_Rate/100,Four_Year_Grad_Rate))
View(WA_Edu_Improvment_2019)
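The conversion rule above can be checked on a few made-up values. This is only a sketch: `raw_rates` is a hypothetical vector, and the rule assumes that any parsed value greater than 1 was recorded as a percentage.

```r
library(readr)

# Hypothetical raw values mixing decimals and whole-number percentages
raw_rates <- c("0.85", "92.5", "56", "1")

# Parse the text to numbers, then rescale anything above 1 to a proportion
parsed <- parse_number(raw_rates)
rescaled <- ifelse(parsed > 1, parsed / 100, parsed)
print(rescaled)
```

One consequence of this rule is that a raw value of exactly 1 is kept as 1 (that is, 100%), so an entry meant as 1% would be misread. Values at or below 1 are ambiguous under this approach and may be worth inspecting by hand.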
#summarizing with mean, min, and max
summarize(WA_Edu_Improvment_2019, mean_Attendance=mean(Attendance_Rate2), mean_Grad=mean(Four_Year_Grad_Rate),
min_Attendance=min(Attendance_Rate2), min_Grad=min(Four_Year_Grad_Rate),
max_Attendance=max(Attendance_Rate2), max_Grad=max(Four_Year_Grad_Rate))
# A tibble: 1 × 6
  mean_Attendance mean_Grad min_Attendance min_Grad max_Attendance
            <dbl>     <dbl>          <dbl>    <dbl>          <dbl>
1           0.682     0.735          0.053   0.0202          0.990
# … with 1 more variable: max_Grad <dbl>
Is there a relationship between attendance rate and graduation rate, and between attendance rate and student performance, in the state of Washington's accountability data?
Here we will use scatter plots to view the relationship between graduation rate and attendance rate. We have included the variable of school year to identify if the relationship changes over time.
WA_Edu_Improvment_2019%>%
ggplot(aes(Attendance_Rate2, Four_Year_Grad_Rate)) +
geom_point(aes(color = School_Year)) +
labs(title = "Attendance Rate and Graduation Rate Over Time", y = "Four-Year Graduation Rate", x = "Attendance Rate")
The scatter plot above suggests that schools with an attendance rate of about 56% or higher tend to have higher graduation rates.
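One way to put a number on the relationship seen in a scatter plot is a correlation coefficient together with a simple linear fit. The sketch below uses a small hypothetical data frame, `toy`, with illustrative values that are not drawn from the Washington data. The same calls could be pointed at `Attendance_Rate2` and `Four_Year_Grad_Rate` in the cleaned data.

```r
# Hypothetical school-level rates (illustrative values only)
toy <- data.frame(
  Attendance_Rate2    = c(0.30, 0.45, 0.56, 0.70, 0.85, 0.95),
  Four_Year_Grad_Rate = c(0.35, 0.40, 0.60, 0.75, 0.80, 0.92)
)

# Pearson correlation between the two rates
r <- cor(toy$Attendance_Rate2, toy$Four_Year_Grad_Rate)

# Simple linear fit: graduation rate as a function of attendance rate
fit <- lm(Four_Year_Grad_Rate ~ Attendance_Rate2, data = toy)

print(r)
print(coef(fit))
```

A positive correlation and slope would be consistent with the pattern described above, although neither establishes that attendance causes graduation.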
To identify the school types with the highest attendance rates, we can look at the column graph below.
WA_Edu_Improvment_2019 %>%
group_by(School_Type)%>%
summarise(Average_Attendance_Rate=mean(Attendance_Rate2)) %>%
ggplot(aes(School_Type, Average_Attendance_Rate))+
geom_col(aes(fill = School_Type))+
theme_light()+
theme(axis.text.x = element_text(angle = 45)) +
labs(title = "Attendance Rate Distribution by School Type", x="School Type", y="Attendance Rate")
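From the column graph above we can see that the school types with the highest attendance rates are:
Special
Q
Public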
To identify the school types with the highest graduation rates, we can use the column graph below.
WA_Edu_Improvment_2019 %>%
group_by(School_Type)%>%
summarise(Average_Graduation_Rate=mean(Four_Year_Grad_Rate)) %>%
ggplot(aes(School_Type, Average_Graduation_Rate))+
geom_col(aes(fill = School_Type))+
theme_light()+
ylim(0,1) +
theme(axis.text.x = element_text(angle = 45)) +
labs(title = "Graduation Rate Distribution by School Type", x="School Type", y="Graduation Rate")
From the column graph above we can see that the school types with the highest graduation rates are:
Q
Public
Tribal Compact
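Rankings like the one above can also be read from a sorted summary table rather than from bar heights. The following is a minimal sketch on a hypothetical tibble `toy`. The school type codes and rates in it are made up, and `n_records` is an added column showing how many rows feed each average.

```r
library(dplyr)

# Hypothetical records: two rows per (made-up) school type
toy <- tibble(
  School_Type         = c("P", "P", "Q", "Q", "S", "S"),
  Four_Year_Grad_Rate = c(0.80, 0.90, 0.85, 0.95, 0.40, 0.50)
)

# Average graduation rate per school type, highest first
ranked <- toy %>%
  group_by(School_Type) %>%
  summarise(
    Average_Graduation_Rate = mean(Four_Year_Grad_Rate),
    n_records = n()
  ) %>%
  arrange(desc(Average_Graduation_Rate))

print(ranked)
```

Keeping the record count next to each mean helps flag school types whose average rests on very few rows.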
The three visualizations above suggest that there is a relationship between graduation rate and attendance rate. We can also see that the Public and Q school types are among the highest in both graduation and attendance rates.
library(ggplot2)
WA_Edu_Improvment_2019 %>%
filter(!str_detect(Student_Group,"Low-Income|Students with Disabilities|All Students|English Language Learners"))%>%
ggplot(aes(fill=Student_Group, y=`School Code`, x=School_Type))+
geom_bar(position = "fill", stat = "identity")+
theme(axis.text.x = element_text(angle = 45),axis.text.y = element_text(angle = 45)) +
labs(title = "Student Demographic Distribution by School Type", x="School Type", y="Proportion", fill="Student Race/ Ethnicity")
Here we can see the distribution of students by demographic group across school types.
The column graph below shows the average attendance rate for each student demographic group.
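As written, the chart above maps `School Code` to the y aesthetic with stat = "identity", so each segment is sized by summed school code values. If the intent is the share of records per student group within each school type, counting rows is one alternative. The sketch below uses a hypothetical tibble `toy`, and it assumes each row is one school-by-group record, not a student headcount.

```r
library(dplyr)

# Hypothetical records: one row per school-by-student-group entry
toy <- tibble(
  School_Type   = c("P", "P", "P", "Q", "Q"),
  Student_Group = c("Asian", "White", "White", "Asian", "White")
)

# Count records per school type and group, then convert to within-type shares
shares <- toy %>%
  count(School_Type, Student_Group) %>%
  group_by(School_Type) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

print(shares)
```

The resulting `share` column could then be plotted with `geom_col()`, using `School_Type` on the x axis and `Student_Group` as the fill.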
WA_Edu_Improvment_2019%>%
filter(!str_detect(Student_Group,"Low-Income|Students with Disabilities|All Students|English Language Learners"))%>%
group_by(Student_Group)%>%
summarise(Average_Attendance_Rate=mean(Attendance_Rate2)) %>%
ggplot(aes(Student_Group, Average_Attendance_Rate)) +
geom_col(aes(fill = Student_Group))+
theme_light()+
theme(axis.text.x = element_text(angle = 45)) +
ylim(0,1)+
labs(title = "Attendance Rate Distribution by Student Demographic", x="Student Race/ Ethnicity", y="Attendance Rate")
This column graph shows us that students identifying as Asian tend to have a higher average attendance rate than their peers. The groups that follow are students identifying as White and students identifying as two or more races, then students identifying as Black/ African American or Hispanic/ Latino.
WA_Edu_Improvment_2019%>%
filter(!str_detect(Student_Group,"Low-Income|Students with Disabilities|All Students|English Language Learners"))%>%
group_by(Student_Group)%>%
summarise(Average_Graduation_Rate=mean(Four_Year_Grad_Rate)) %>%
ggplot(aes(Student_Group, Average_Graduation_Rate)) +
geom_col(aes(fill = Student_Group))+
theme_light()+
theme(axis.text.x = element_text(angle = 45)) +
ylim(0,1)+
labs(title = "Graduation Rate Distribution by Student Demographic", x="Student Race/ Ethnicity", y="Graduation Rate")
The column graph above shows about the same trend as attendance rate, with slight variation: students who identify as multiple races tend to have a higher average graduation rate than their White peers.
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Derian-Toth (2022, April 11). Data Analytics and Computational Social Science: MADT_Homework 5. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscommderiantothdacss601hw5mdt/
BibTeX citation
@misc{derian-toth2022madt_homework,
  author = {Derian-Toth, Meredith},
  title = {Data Analytics and Computational Social Science: MADT_Homework 5},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscommderiantothdacss601hw5mdt/},
  year = {2022}
}