library(tidyverse)
library(ggplot2)
library(lubridate)
library(readxl)
library(hrbrthemes)
library(viridis)
library(ggpubr)
library(purrr)
library(plotly)
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
Theresa Szczepanski
December 12, 2022
Massachusetts Comprehensive Assessment System (MCAS) tests were introduced as part of the Massachusetts Education Reform Act in 1993 with the goal of providing all students with the skills and knowledge to thrive in a “complex and changing society” (Papay et al., 2020, p. 1). The MCAS tests are a significant tool for educational equity. Scores on the Grade 10 Math MCAS test “predict longer-term educational attainments and labor market success, above and beyond typical markers of student advantage. For example, among demographically similar students who attended the same high school and have the same level of ultimate educational attainment, those with higher MCAS mathematics scores go on to have much higher average earnings than those with lower scores.” (Papay et al., 2020, pp. 7-10)
In this report, I will analyze the Spring 2022 MCAS Results for students completing the High School Introductory Physics MCAS at Rising Tide Charter Public School.
The MCAS_2022 data frame contains performance results from 495 students from Rising Tide on the Spring 2022 Massachusetts Comprehensive Assessment System (MCAS) tests. For each student, there are values reported for 256 different variables, which consist of information from a few broad categories:

Demographic characteristics of the students themselves (e.g., race, gender, date of birth, town, grade level, years in school, years in Massachusetts, and low income, title1, IEP, 504, and EL status).

Key assessment features, including subject, test format, and accommodations provided.

Performance metrics: this includes a student’s score on individual item strands, e.g., sitem1-sitem42.

See the MCAS_2022 data frame summary and codebook in the appendix for further details.
The second data set, SG9_Item, is \(42 \times 9\) and consists of 9 variables with information pertaining to the 42 questions on the 2022 HS Introductory Physics Item Report. The variables can be broken down into 2 categories:

Details about the content of a given test item: this includes the content Reporting Category (MF (motion and forces), WA (waves), and EN (energy)), the Standard from the 2016 STE Massachusetts Curriculum Framework, the Item Description providing the details of what specifically was asked of students, and the points available for a given question, item Possible Points.

Summary performance metrics: for each item, the state reports the percentage of points earned by students at Rising Tide, RT Percent Points, the percentage of available points earned by students in the state, State Percent Points, and the difference between the percentage of points earned by Rising Tide students and the percentage earned by students in the state, RT-State Diff.
Lastly, SG9_CU306Dis and SG9_CU306NonDis are \(3 \times 5\) data frames consisting of summary performance data by Reporting Category for students with and without disabilities; most importantly, they include RT Percent Points and State Percent Points by disability status.
When considering our student performance data, we hope to address the following broad questions:
What adjustments (if any) should be made at the Tier 1 level, i.e., curricular adjustments for all students in the General Education setting?
What would be the most beneficial areas of focus for a targeted intervention course for students struggling to meet or exceed performance expectations?
Are there notable differences in student performance for students with and without disabilities?
To read in, tidy, and join our data frames for each content area, we will use the functions in this library. I am also drafting some functions that I would use to scale up this project; there is still work to be done here.
#Item analysis Read-In Function: Input: sheet_name, subject, grade; return: item report for a given grade level and subject.
#subject must be: "math", "ela", or "science"
read_item<-function(sheet_name, subject, grade){
subject_item<-case_when(
subject == "science"~"sitem",
subject == "math"~"mitem",
subject == "ela"~"eitem"
)
if(subject == "science"){
read_excel("_data/2022MCASDepartmentalAnalysis.xlsx", sheet = sheet_name,
skip = 1, col_names= c(subject_item, "Type", "Reporting Category", "Standard", "item Desc", "delete", "item Possible Points","RT Percent Points", "State Percent Points", "RT-State Diff")) %>%
select(!contains("delete"))%>%
filter(!str_detect(sitem,"Legend|legend"))%>%
mutate(sitem= as.character(sitem))%>%
separate(c(1), c("sitem", "delete"))%>%
select(!contains("delete"))%>%
mutate(sitem =
str_c(subject_item, sitem))
}
else if(subject == "math" && grade < 10){
read_excel("_data/2022MCASDepartmentalAnalysis.xlsx", sheet = sheet_name,
skip = 1, col_names= c(subject_item, "Type", "Reporting Category", "Standard", "item Desc", "delete", "item Possible Points","delete","RT Percent Points", "State Percent Points", "RT-State Diff"))%>%
select(!contains("delete"))%>%
filter(!str_detect(mitem,"Legend|legend"))%>%
mutate(mitem = as.character(mitem))%>%
separate(c(1), c("mitem", "delete"))%>%
select(!contains("delete"))%>%
mutate(mitem =
str_c(subject_item, mitem))
}
else if(subject == "math" && grade == 10){
read_excel("_data/2022MCASDepartmentalAnalysis.xlsx", sheet = sheet_name,
skip = 1, col_names= c(subject_item, "Type", "Reporting Category", "Standard", "item Desc", "delete", "item Possible Points","RT Percent Points", "State Percent Points", "RT-State Diff"))%>%
select(!contains("delete"))%>%
filter(!str_detect(mitem,"Legend|legend"))%>%
mutate(mitem = as.character(mitem))%>%
separate(c(1), c("mitem", "delete"))%>%
select(!contains("delete"))%>%
mutate(mitem =
str_c(subject_item, mitem))
}
}
## MCAS Preliminary Results Read In
## Input file_path where the results csv file is stored, and the "year" the exam was administered
read_MCAS_Prelim<-function(file_path, year){read_csv(file_path,
skip=1)%>%
select(-c("sprp_dis", "sprp_sch", "sprp_dis_name", "sprp_sch_name", "sprp_orgtype",
"schtype", "testschoolname", "yrsindis", "conenr_dis"))%>%
#Recode all nominal variables as characters
mutate(testschoolcode = as.character(testschoolcode))%>%
#Include this line when using the non-private dataframe
# mutate(sasid = as.character(sasid))%>%
mutate(highneeds = as.character(highneeds))%>%
mutate(lowincome = as.character(lowincome))%>%
mutate(title1 = as.character(title1))%>%
mutate(ever_EL = as.character(ever_EL))%>%
mutate(EL = as.character(EL))%>%
mutate(EL_FormerEL = as.character(EL_FormerEL))%>%
mutate(FormerEL = as.character(FormerEL))%>%
mutate(ELfirstyear = as.character(ELfirstyear))%>%
mutate(IEP = as.character(IEP))%>%
mutate(plan504 = as.character(plan504))%>%
mutate(firstlanguage = as.character(firstlanguage))%>%
mutate(natureofdis = as.character(natureofdis))%>%
mutate(spedplacement = as.character(spedplacement))%>%
mutate(town = as.character(town))%>%
mutate(ssubject = as.character(ssubject))%>%
#Recode all ordinal variable as factors
mutate(grade = as.factor(grade))%>%
mutate(levelofneed = as.factor(levelofneed))%>%
mutate(eperf2 = recode_factor(eperf2,
"E" = "Exceeding",
"M" = "Meeting",
"PM" = "Partially Meeting",
"NM"= "Not Meeting",
.ordered = TRUE))%>%
mutate(eperflev = recode_factor(eperflev,
"E" = "E",
"M" = "M",
"PM" = "PM",
"NM"= "NM",
"DNT" = "DNT",
"ABS" = "ABS",
.ordered = TRUE))%>%
mutate(mperf2 = recode_factor(mperf2,
"E" = "Exceeding",
"M" = "Meeting",
"PM" = "Partially Meeting",
"NM"= "Not Meeting",
.ordered = TRUE))%>%
mutate(mperflev = recode_factor(mperflev,
"E" = "E",
"M" = "M",
"PM" = "PM",
"NM"= "NM",
"INV" = "INV",
"ABS" = "ABS",
.ordered = TRUE))%>%
# The science variables contain a mixture of legacy performance levels and
# next generation performance levels which needs to be addressed in the ordering
# of these factors.
mutate(sperf2 = recode_factor(sperflev,
"E" = "Exceeding",
"M" = "Meeting",
"PM" = "Partially Meeting",
"NM"= "Not Meeting",
.ordered = TRUE))%>%
mutate(sperflev = recode_factor(sperflev,
"E" = "E",
"M" = "M",
"PM" = "PM",
"NM"= "NM",
"INV" = "INV",
"ABS" = "ABS",
.ordered = TRUE))%>%
#recode DOB using lubridate
mutate(dob = mdy(dob,
quiet = FALSE,
tz = NULL,
locale = Sys.getlocale("LC_TIME"),
truncated = 0
))%>%
mutate(IEP = case_when(
IEP == "1" ~ "Disabled",
IEP == "0" ~ "NonDisabled"
))%>%
mutate(year = year)
}
##Function for number of items table and graph
##ToDo Should a Function Produce Table and Graph?
##ToDo, Adjust the caption for test and year?
##ToDo, the Data Files need to be Updated to Include ELA reports
Subject_Cat_Total<-function(subject, subjectItemDF){
if(subject == "science"){subjectItemDF%>%
select(`sitem`, `item Possible Points`, `Reporting Category`)%>%
group_by(`Reporting Category`)%>%
summarise(available_points = sum(`item Possible Points`, na.rm=TRUE))%>%
mutate(percent_available_points = available_points/(sum(available_points, na.rm = TRUE)))%>%
ggplot(aes(x='',fill = `Reporting Category`, y = `available_points`)) +
geom_bar(position="fill", stat = "identity") + coord_flip()+
labs(subtitle ="All Students" ,
y = "% Points Available",
x= "Reporting Category",
title = "Percentage of Exam Points Available by Reporting Category",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))
}
else if (subject == "math"){subjectItemDF%>%
select(`mitem`, `item Possible Points`, `Reporting Category`)%>%
group_by(`Reporting Category`)%>%
summarise(available_points = sum(`item Possible Points`, na.rm=TRUE))%>%
mutate(percent_available_points = available_points/(sum(available_points, na.rm = TRUE)))%>%
ggplot(aes(x='',fill = `Reporting Category`, y = `available_points`)) +
geom_bar(position="fill", stat = "identity") + coord_flip()+
labs(subtitle ="All Students" ,
y = "% Points Available",
x= "Reporting Category",
title = "Percentage of Exam Points Available by Reporting Category",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))
} else if (subject == "ELA"){subjectItemDF%>%
select(`eitem`, `item Possible Points`, `Reporting Category`)%>%
group_by(`Reporting Category`)%>%
summarise(available_points = sum(`item Possible Points`, na.rm=TRUE))%>%
mutate(percent_available_points = available_points/(sum(available_points, na.rm = TRUE)))%>%
ggplot(aes(x='',fill = `Reporting Category`, y = `available_points`)) +
geom_bar(position="fill", stat = "identity") + coord_flip()+
labs(subtitle ="All Students" ,
y = "% Points Available",
x= "Reporting Category",
title = "Percentage of Exam Points Available by Reporting Category",
caption = "2022 ELA MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))
}
}
# testDF<-read_item("SG9Physics", "science")
# #view(testDF)
# Subject_Cat_Total("science", testDF)
Student_Item<-function(subject, gradeLevel, subjectItemDF, studentPerfDF){
if(subject == "science"){
select( studentPerfDF, contains("sitem"), gender, grade, yrsinsch,
race, IEP, `plan504`, sattempt, sperflev, sperf2, sscaleds)%>%
filter((grade == gradeLevel) & sattempt != "N")%>%
pivot_longer(contains("sitem"), names_to = "sitem", values_to = "sitem_score")%>%
left_join(subjectItemDF, "sitem")
}
if(subject == "math"){
select( studentPerfDF, contains("mitem"), gender, grade, yrsinsch,
race, IEP, `plan504`, mattempt, mperflev, mperf2, mscaleds)%>%
filter((grade == gradeLevel) & mattempt != "N")%>%
pivot_longer(contains("mitem"), names_to = "mitem", values_to = "mitem_score")%>%
left_join(subjectItemDF, "mitem")
}
else if(subject == "ela"){
####ToDo, update departmental analysis data to include ELA item reports
select( studentPerfDF, contains("eitem"), gender, grade, yrsinsch,
race, IEP, `plan504`, eattempt, eperflev, eperf2, escaleds)%>%
filter((grade == gradeLevel) & eattempt != "N")%>%
pivot_longer(contains("eitem"), names_to = "eitem", values_to = "eitem_score")%>%
left_join(subjectItemDF, "eitem")
}
}
# TestMCAS_2022<-read_MCAS_Prelim("_data/PrivateSpring2022_MCAS_full_preliminary_results_04830305.csv",2022)
# SG5_Item<-read_item("SG5", "science", 5)
# SG5_Student_Item<-Student_Item("science", 5, SG5_Item, TestMCAS_2022)
# SG5_Student_Item
# TestMCAS_2022<-read_MCAS_Prelim("_data/PrivateSpring2022_MCAS_full_preliminary_results_04830305.csv",2022)
# MG5_Item<-read_item("MG5", "math", 5)
# MG5_Student_Item<-Student_Item("math", 5, MG5_Item, TestMCAS_2022)
# MG5_Student_Item
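## keyWord: flag items whose description contains a given key word.
## Input: an item report data frame, the subject, and a key word; returns the
## selected item columns with a key_word column distinguishing matching items
## from "Non-<keyword>" items.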
keyWord<-function(subjectItemDF, subject, keyWord){
keyWord<-str_to_lower(keyWord)
keyWordFirst<-str_to_upper(str_sub(keyWord, 1L,1L))
keyWordEnd<-str_sub(keyWord, 1L+1, -1L)
keyWordCap<-str_c(keyWordFirst, keyWordEnd)
if (subject == "science"){
select(subjectItemDF,`sitem`, `item Desc`,`item Possible Points`, `Reporting Category`, `State Percent Points`, `RT-State Diff`)%>%
mutate( key_word = case_when(
!(str_detect(`item Desc`, keyWord)|str_detect(`item Desc`,keyWordCap)) ~ str_c("Non-", keyWordCap),
str_detect(`item Desc`, keyWord)|str_detect(`item Desc`,keyWordCap) ~ keyWordCap))
}
else if (subject == "math"){
select(subjectItemDF, `mitem`, `item Desc`,`item Possible Points`, `Reporting Category`, `State Percent Points`, `RT-State Diff`)%>%
mutate( key_word = case_when(
!(str_detect(`item Desc`, keyWord)|str_detect(`item Desc`,keyWordCap)) ~ str_c("Non-", keyWordCap),
str_detect(`item Desc`, keyWord)|str_detect(`item Desc`,keyWordCap) ~ keyWordCap))
}
}
#view(SG9_Calc)
# MG8_Item<-read_item("MG8", "math", 8)
# MG5_Item
# MG8_Describe<-keyWord(MG8_Item, "math", "determine")
# MG8_Describe
# SG8_Item<-read_item("SG8", "science", 8)
# SG8_Item
# SG8_Calc<-keyWord(SG8_Item, "science", "calculate")
# SG8_Calc
Introductory Physics, SG9_Item Read-In
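The chunk that builds SG9_Item is not shown above; a minimal sketch of the call, assuming the sheet name used in the commented read_item example earlier:

# Hedged sketch (assumed sheet name and grade argument): build the HS
# Introductory Physics item report used throughout the analysis below.
SG9_Item<-read_item("SG9Physics", "science", 9)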
Introductory Physics, SG9_CU306Dis Read-In
SG9_CU306Dis<-read_excel("_data/MCAS CU306 2022/CU306MCAS2022PhysicsGrade9ByDisability.xlsm",
sheet = "Disabled Students",
col_names = c("Reporting Category", "Possible Points", "RT%Points",
"State%Points", "RT-State Diff"))%>%
filter(`Reporting Category` == "Energy"|`Reporting Category`== "Motion, Forces, and Interactions"| `Reporting Category` == "Waves" )
#view(SG9_CU306Dis)
SG9_CU306Dis
Introductory Physics, SG9_CU306NonDis Read-In
SG9_CU306NonDis<-read_excel("_data/MCAS CU306 2022/CU306MCAS2022PhysicsGrade9ByDisability.xlsm",
sheet = "Non-Disabled Students",
col_names = c("Reporting Category", "Possible Points", "RT%Points",
"State%Points", "RT-State Diff"))%>%
filter(`Reporting Category` == "Energy"|`Reporting Category`== "Motion, Forces, and Interactions"| `Reporting Category` == "Waves" )
SG9_CU306NonDis
After examining the summary of MCAS_2022 (see appendix), I chose to:

Filter:

SchoolID: There are several variables that identify our school; I removed all but one, testschoolcode.

Student privacy: I left the sasid variable, which is a student identifier number, but eliminated all values corresponding to students’ names.

dis: We are a charter school within our own unique district, therefore any “district level” data is identical to our “school level” data.

Rename:

I currently have not renamed variables, but there are some trends to note: e before most ELA MCAS student item performance metric variables, m before most Math MCAS student item performance metric variables, and s before most Science MCAS student item performance metric variables.

Mutate:

I left item scores and growth percentiles (e.g., mitem1, sgp) as doubles, recoded nominal variables such as town to character, refactored ordinal variables such as mperflev as ordered factors, and recoded dob to a date using lubridate.

I am interested in analyzing the 9th Grade Science performance. To do this, I will select a subset of our MCAS_2022 data frame which includes the science item scores, demographic variables, and science performance variables such as sperflev.

SG9_MCAS_2022 <- select(MCAS_2022, contains("sitem"), gender, grade, yrsinsch,
race, IEP, `plan504`, sattempt, sperflev, sperf2, sscaleds)%>%
filter((grade == 9) & sattempt != "N")
SG9_MCAS_2022<-select(SG9_MCAS_2022, !(contains("43")|contains("44")|contains("45")))
#view(SG9_MCAS_2022)
head(SG9_MCAS_2022)
When I compared this data frame to the state-reported analysis, I noticed a discrepancy: my data frame has 69 entries, while the state reports data on only 68 students. I will have to investigate this further.
Since I will join this data frame with SG9_Item, using sitem as the key, I need to pivot this data set longer, as sketched below.
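The chunk that performs this step is not shown above; a minimal sketch, assuming the SG9_MCAS_2022 data frame built above:

# Hedged sketch (not the echoed chunk): pivot the 42 sitem columns longer so
# each of the 69 students contributes one row per item.
SG9_MCAS_2022 %>%
  pivot_longer(contains("sitem"), names_to = "sitem", values_to = "sitem_score")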
As expected, we now have 42 x 69 = 2,898 rows.
Now we should be ready to join our data sets using sitem as the key. We should end up with a 2,898 by (10 + 8) = 2,898 by 18 data frame. We will also check our raw data against the performance data reported by the state in the item report by calculating the percent_earned by Rising Tide students for each item, comparing it to the reported RT Percent Points, and storing the difference in earned_diff; a sketch of this step follows.
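The chunk that does this is not shown above; a minimal sketch of the join and the check, assuming RT Percent Points is stored as a proportion (rescale if the item report uses a 0-100 scale):

# Hedged sketch (not the echoed chunk): attach item metadata to each student
# response, then compare the raw percent earned to the state-reported figure.
SG9_StudentItem <- SG9_MCAS_2022 %>%
  pivot_longer(contains("sitem"), names_to = "sitem", values_to = "sitem_score") %>%
  left_join(SG9_Item, by = "sitem")

SG9_Check <- SG9_StudentItem %>%
  group_by(sitem, `RT Percent Points`) %>%
  summarise(percent_earned = sum(sitem_score, na.rm = TRUE) /
              sum(`item Possible Points`, na.rm = TRUE), .groups = "drop") %>%
  mutate(earned_diff = percent_earned - `RT Percent Points`)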
As expected, we now have a 2,898 x 18 data frame, and the earned_diff values all round to 0.
Now we can examine the content of the exam itself and our students’ performance relative to the state.
What reporting categories were emphasized by the state?
We can see from our summary that 50% of the exam points (30 of the available 60) come from questions in the Motion and Forces Reporting Category, followed by 30% from Energy and 20% from Waves.
SG9_Cat_Total<-SG9_Item%>%
select(`sitem`, `item Possible Points`, `Reporting Category`)%>%
group_by(`Reporting Category`)%>%
summarise(available_points = sum(`item Possible Points`, na.rm=TRUE))%>%
mutate(percent_available_points = available_points/(sum(available_points, na.rm = TRUE)))
SG9_Cat_Total
ggplot(SG9_Cat_Total, aes(x='',fill = `Reporting Category`, y = `available_points`)) +
geom_bar(position="fill", stat = "identity") + coord_flip()+
labs(subtitle ="All Students" ,
y = "% Points Available",
x= "Reporting Category",
title = "Percentage of Exam Points Available by Reporting Category",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))
Where did Rising Tide students lose most of their points?
The proportion of points lost by Rising Tide students corresponds to the proportion of points available for each Reporting Category
of the the exam. This suggests that our students are prepared consistently across the units in the Reporting Categories
.
SG9_Cat_Loss<-SG9_StudentItem%>%
select(`sitem`, `Reporting Category`, `item Possible Points`, `sitem_score`)%>%
group_by(`Reporting Category`)%>%
summarise(sum_points_lost = sum(`item Possible Points`-`sitem_score`, na.rm=TRUE),
available_points = sum(`item Possible Points`, na.rm=TRUE))%>%
mutate(percent_points_lost = round(sum_points_lost/sum(sum_points_lost,na.rm=TRUE),2))%>%
mutate(percent_available_points = available_points/(sum(available_points, na.rm = TRUE)))
SG9_Cat_Loss<-SG9_Cat_Loss%>%
select(`Reporting Category`, `percent_available_points`, `percent_points_lost`)
SG9_Cat_Loss
SG9_Percent_Loss<-SG9_StudentItem%>%
select(`sitem`, `Reporting Category`, `item Possible Points`, `sitem_score`)%>%
mutate(`points_lost` = `item Possible Points` - `sitem_score`)%>%
#ggplot(df, aes(x='', fill=option)) + geom_bar(position = "fill")
ggplot( aes(x='',fill = `Reporting Category`, y = `points_lost`)) +
geom_bar(position="fill", stat = "identity") + coord_flip()+
labs(subtitle ="All Rising Tide Students" ,
y = "% Points Loints",
x= "Reporting Category",
title = "Percentage of Points Lost by Reporting Category",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))
SG9_Percent_Loss
Did Rising Tide students’ performance relative to the state vary by content reporting categories?
We can see from our table that on average our students earned between 4 and 5 percent fewer of the available points relative to their peers in the state for items in each of the three reporting Categories
.
SG9_Cat_RTState<-SG9_Item%>%
select(`sitem`, `item Possible Points`, `Reporting Category`, `State Percent Points`, `RT Percent Points`, `RT-State Diff`)%>%
group_by(`Reporting Category`)%>%
summarise(available_points = sum(`item Possible Points`, na.rm=TRUE),
RT_points = sum(`RT Percent Points`*`item Possible Points`, na.rm = TRUE),
RT_Percent_Points = 100*round(RT_points/available_points,2),
State_Percent_Points = 100*round(sum(`State Percent Points`*`item Possible Points`/available_points, na.rm = TRUE),2))%>%
mutate(`RT-State Diff` = round(RT_Percent_Points - State_Percent_Points, 2))%>%
ggplot( aes(fill = `Reporting Category`, y=`RT-State Diff`, x=`Reporting Category`)) +
geom_bar(position="dodge", stat="identity") +
labs(subtitle ="All Students" ,
y = "RT-State Diff",
x= "Reporting Category",
title = "Difference in RT vs State Percent Points Earned by Reporting Category",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))+
geom_text(aes(label = `RT-State Diff`), vjust = -1., colour = "white", position = position_dodge(.9))
SG9_Cat_RTState
Here we see the distribution of RT-State Diff (the difference between the percentage of points earned on a given item by Rising Tide students and the percentage earned on the same item by their peers in the State) by sitem and content Reporting Category. We can see that items in the Motion and Forces Reporting Category generally display the most concerning variability in student performance relative to the state. It would be worth looking at the specific question strands with the Physics teachers. (It would be helpful to add item labels to the dots using ggplotly; however, I did not find a way to have that render on the class blog. A sketch of the idea follows the boxplot below.)
SG9_Cat_Box <-SG9_Item%>%
select(`sitem`, `Reporting Category`, `State Percent Points`, `RT-State Diff`)%>%
group_by(`Reporting Category`)%>%
ggplot( aes(x=`Reporting Category`, y=`RT-State Diff`, fill=`Reporting Category`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.1, alpha=0.9) +
theme_ipsum() +
theme(
legend.position="none",
plot.title = element_text(size=11)
) +
ggtitle("G9 Introductory Physics School State Difference by Item") +
xlab("")
SG9_Cat_Box
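As noted above, item labels could be added interactively; a minimal sketch, assuming plotly output can be embedded in the rendered page (the object name and tooltip mapping are illustrative, not the chunk used above):

# Hedged sketch: hover labels for each sitem via plotly (may not render on the blog)
SG9_Cat_Box_Labeled <- SG9_Item%>%
  ggplot( aes(x=`Reporting Category`, y=`RT-State Diff`, fill=`Reporting Category`)) +
  geom_boxplot() +
  geom_jitter(aes(text = sitem), color="black", size=0.1, alpha=0.9)
ggplotly(SG9_Cat_Box_Labeled, tooltip = "text")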
Can differences in Rising Tide student performance on an item and State performance on an item be explained by the difficulty level of an item?
When considering RT-State Diff against State Percent Points for each sitem on the MCAS, this does not generally seem to be the case. Although the regression line suggests that RT-State Diff is more likely to be negative on items where students in the State earned fewer points, the p-value is not significant.
G9Sci_Diff_Dot<-SG9_Item%>%
select(`State Percent Points`, `RT-State Diff`, `Reporting Category`)%>%
ggplot( aes(x=`State Percent Points`, y=`RT-State Diff`)) +
geom_point(size = 1, color="#69b3a2")+
geom_smooth(method="lm",color="grey", size =.5 )+
labs(title = "RT-State Diff by Difficulty Level", y = "RT-State Diff",
x = "State Percent Points") +
stat_cor(method = "pearson")#+facet(vars(`Reporting Category`)) +#label.x = 450, label.y = 550)
G9Sci_Diff_Dot
How did students perform based on key words?
Among the item Desc entries in the SG9_Item data frame, there are several questions containing the word “Calculate” in their description. How much is calculation emphasized on this exam, and how did Rising Tide students perform relative to their peers in the state on items containing “calculate” in their description?
SG9_Calc<-SG9_Item%>%
select(`sitem`, `item Desc`,`item Possible Points`, `Reporting Category`, `State Percent Points`, `RT-State Diff`)%>%
mutate( key_word = case_when(
!str_detect(`item Desc`, "calculate|Calculate") ~ "Non-Calc",
str_detect(`item Desc`, "calculate|Calculate") ~ "Calc"))
#view(SG9_Calc)
SG9_Calc
Now we can see that in the Waves and Energy categories, half of the available points come from questions that ask students to “calculate” and half do not. In the Motion and Forces category, 40% of the points are associated with questions that ask students to “calculate”.
SG9_Calc%>%
group_by(`Reporting Category`, `key_word`)%>%
summarise(avg_RT_State_Diff = mean(`RT-State Diff`, na.rm=TRUE),
med_RT_State_Diff = median(`RT-State Diff`, na.rm =TRUE),
#sum_RT_State_Diff = sum(`RT-State Diff`, na.rm=TRUE),
sum_sitem_Possible_Points = sum(`item Possible Points`, na.rm = TRUE))
SG9_Calc_PointsAvail<-SG9_Calc%>%
group_by(`Reporting Category`, `key_word`)%>%
summarise(avg_RT_State_Diff = mean(`RT-State Diff`, na.rm=TRUE),
med_RT_State_Diff = median(`RT-State Diff`, na.rm =TRUE),
sum_RT_State_Diff = sum(`RT-State Diff`, na.rm=TRUE),
sum_item_Possible_Points = sum(`item Possible Points`, na.rm = TRUE))%>%
ggplot(aes(fill=`key_word`, y=sum_item_Possible_Points, x=`Reporting Category`)) + geom_bar(position="dodge", stat="identity")+
labs(subtitle ="Calculate" ,
y = "Available Points",
x= "Reporting Category",
title = "Available points by Key Word",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))+
geom_text(aes(label = `sum_item_Possible_Points`), vjust = 1.5, colour = "white", position = position_dodge(.9))
SG9_Calc_PointsAvail
When we compare the median RT-State Diff for items containing the word “calculate” in their description vs. items that do not, we can see that, across all of the Reporting Categories, Rising Tide students performed considerably worse relative to their peers in the state on questions that asked them to “calculate”.
SG9_Calc_MedDiffBar<-SG9_Calc%>%
group_by(`Reporting Category`, `key_word`)%>%
summarise(mean_RT_State_Diff = round(mean(`RT-State Diff`, na.rm=TRUE),2),
med_RT_State_Diff = median(`RT-State Diff`, na.rm =TRUE),
sum_RT_State_Diff = sum(`RT-State Diff`, na.rm=TRUE))%>%
ggplot(aes(fill=`key_word`, y=med_RT_State_Diff, x=`Reporting Category`)) + geom_bar(position="dodge", stat="identity") + coord_flip()+
labs(subtitle ="Calculate" ,
y = "Median RT-State-Diff",
x= "Reporting Category",
title = "Median RT-State-Diff by Key Word",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.y=element_text(angle=40,hjust=.5))+
geom_text(aes(label = `med_RT_State_Diff`), hjust = 1, vjust = .75, colour = "black", position = position_dodge(.8))
SG9_Calc_MedDiffBar
Here we can see the distribution of RT-State Diff by sitem and Reporting Category, and the disparity in RT-State Diff when we consider items asking students to “Calculate” vs. those that do not.
SG9_Calc_Box <-SG9_Calc%>%
group_by(`key_word`, `Reporting Category`)%>%
ggplot( aes(x=`key_word`, y=`RT-State Diff`, fill=`Reporting Category`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.1, alpha=0.9) +
theme_ipsum() +
theme(
#legend.position="none",
plot.title = element_text(size=11)
) + labs(subtitle ="Calculate" ,
y = "RT-State-Diff",
x= "Calculate vs. Non-Calculate",
title = "RT-State-Diff by Key Word",
caption = "2022 HS Introductory Physics MCAS")
# ggtitle("RT-State-Diff by Key Word") +
# xlab("")
SG9_Calc_Box
Did RT students perform worse relative to their peers in the state on more “challenging” calculation items?
If we consider the difficulty of items containing the word “calculate”, as reflected in the state-wide performance (State Percent Points) for a given item, the gap between Rising Tide students’ performance and that of their peers in the state (RT-State Diff) does not seem to widen significantly with item difficulty.
#view(SG9_Calc)
SG9_Calc_Dot<- SG9_Calc%>%
select(`State Percent Points`, `RT-State Diff`, `key_word`)%>%
filter(key_word == "Calc")%>%
ggplot( aes(x=`State Percent Points`, y=`RT-State Diff`)) +
geom_point(size = 1, color="#69b3a2")+
geom_smooth(method="lm",color="grey", size =.5 )+
labs(title = "RT State Diff vs. State Percent Points", y = "RT State Diff",
x = "State Percent Points")+
stat_cor(method = "pearson")
SG9_Calc_Dot
Is the “calculation gap” consistent across performance levels?
Here we can see that the higher a student’s performance level, the greater the percentage of their lost points came from items asking them to “calculate”. This suggests that, to raise student performance in the general classroom, students should spend a higher proportion of time on calculation-based activities.
# G9 Points Lost
G9Sci_StudentCalcPerflev<-SG9_StudentItem%>%
select(gender, sitem, sitem_score, `item Desc`, `item Possible Points`, `State Percent Points`, IEP, `RT-State Diff`, `Reporting Category`, `sperflev`)%>%
mutate( key_word = case_when(
!str_detect(`item Desc`, "calculate|Calculate") ~ "Non-Calc",
str_detect(`item Desc`, "calculate|Calculate") ~ "Calc"))%>%
group_by(`sperflev`, `key_word`)%>%
summarise(total_points_lost = sum(`item Possible Points`-`sitem_score`, na.rm = TRUE),
med_RT_State_Diff = median(`RT-State Diff`, na.rm=TRUE))
G9Sci_StudentCalcPerflev
#view(SG9_StudentItem)
G9Sci_StudentCalcPerflev%>%
ggplot(aes(fill=`key_word`, y=total_points_lost, x=`sperflev`)) + geom_bar(position="fill", stat="identity") +
labs(subtitle ="Calculate" ,
y = "Percentage Points Lost",
x= "Performance Level",
title = "Percentage of Points Lost by Key Word and Performance Level",
caption = "2022 HS Introductory Physics MCAS")
Are there differences in the performance of non-disabled and disabled students relative to their academic peers in the state?
We can see from our CU306 reports that our students with disabilities performed better relative to their peers in the state (RT-State Diff) across all Reporting Categories, while our non-disabled students performed worse relative to their peers in the state across all Reporting Categories. This suggests that more attention needs to be paid to the needs of the non-disabled students in the General Education setting. A sketch of the comparison is shown below.
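A minimal sketch of that side-by-side comparison, assuming the SG9_CU306Dis and SG9_CU306NonDis data frames read in above (the combined table name is illustrative):

# Hedged sketch: stack the two CU306 reports and compare RT-State Diff by status
SG9_CU306_Compare <- bind_rows(
  mutate(SG9_CU306Dis, Status = "Disabled"),
  mutate(SG9_CU306NonDis, Status = "NonDisabled")
)%>%
  select(Status, `Reporting Category`, `RT%Points`, `State%Points`, `RT-State Diff`)
SG9_CU306_Compare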
When we examine the points lost by Reporting Category and disability status, there does not seem to be a significant difference in performance between disabled and non-disabled students across Reporting Categories.
G9Sci_StudentCalcDis<-SG9_StudentItem%>%
select(gender, sitem, sitem_score, `item Desc`, `item Possible Points`, `State Percent Points`, IEP, `RT-State Diff`, `Reporting Category`, `sperflev`)%>%
mutate( key_word = case_when(
!str_detect(`item Desc`, "calculate|Calculate") ~ "Non-Calc",
str_detect(`item Desc`, "calculate|Calculate") ~ "Calc"))%>%
group_by(`Reporting Category`, `key_word`, `IEP`)%>%
summarise(total_points_lost = sum(`item Possible Points`-`sitem_score`, na.rm = TRUE))%>%
ggplot(aes(fill=`key_word`, y=total_points_lost, x=`Reporting Category`)) + geom_bar(position="dodge", stat="identity")+
facet_wrap(vars(IEP))+ coord_flip()+
labs(subtitle ="Calculate" ,
y = "Sum Points Lost",
x= "Reporting Category",
title = "Sum Points Lost by Key Word Non-Disabled vs. Disabled",
caption = "2022 HS Introductory Physics MCAS")+
geom_text(aes(label = `total_points_lost`), vjust = 1.5, colour = "black", position = position_dodge(.95))
#G9Sci_StudentCalcDis
G9Sci_StudentCalcDis<-SG9_StudentItem%>%
select(gender, sitem, sitem_score, `item Desc`, `item Possible Points`, `State Percent Points`, IEP, `RT-State Diff`, `Reporting Category`, `sperflev`)%>%
mutate( key_word = case_when(
!str_detect(`item Desc`, "calculate|Calculate") ~ "Non-Calc",
str_detect(`item Desc`, "calculate|Calculate") ~ "Calc"))%>%
group_by(`Reporting Category`, `key_word`, `IEP`)%>%
summarise(sum_points_lost = sum(`item Possible Points`-`sitem_score`, na.rm = TRUE))%>%
ggplot(aes(fill=`key_word`, y=sum_points_lost, x=`Reporting Category`)) + geom_bar(position="fill", stat="identity")+
facet_wrap(vars(IEP))+ coord_flip()+
labs(subtitle ="Calculate" ,
y = "Percent Points Lost",
x= "Reporting Category",
title = "Percent Points Lost by Key Word and Disability Status",
caption = "2022 HS Introductory Physics MCAS")
G9Sci_StudentCalcDis
A student’s performance on their 9th Grade Introductory Physics MCAS is strongly associated with their performance on their 8th Grade Math MCAS exam. This suggests that the use of prior Math MCAS and current STAR Math testing data can identify students in need of extra support.
SG9_Math<-MCAS_2022%>%
select(sscaleds, mscaleds2021,sscaleds_prior, grade, sattempt)%>%
filter((grade == 9) & sattempt != "N")%>%
ggplot(aes(x=`mscaleds2021`, y =`sscaleds`))+
geom_point(size = 1, color="#69b3a2")+
geom_smooth(method="lm",color="grey", size =.5 )+
labs(title = "2022 HS Introductory Physics vs. 2021 Math MCAS", y = "Physics Scaled Score",
x = "Math Scaled Score") +
stat_cor(method = "pearson", label.x = 450, label.y = 550)
SG9_Math
Rising Tide students as a whole performed slightly worse relative to the state in all content reporting areas; however, students classified as disabled performed better relative to their peers in the state. The performance gap between Rising Tide students and students in the state on the HS Introductory Physics exam is accounted for by the performance of the non-disabled students in the general classroom setting.
All Rising Tide students, regardless of disability status, performed considerably worse relative to students in the State on items including the key word “Calculate” in their item description. This suggests that we should dedicate more classroom instructional time to problem solving with calculation. Notably, the higher a student’s performance level, the higher the percentage of points the student lost on calculation items. The largest area of growth for students across all performance categories is on calculation-based items; evidence-based math interventions include small-group, differentiated problem sets.
The discrepancy in performance between Rising Tide students with and without disabilities relative to their associated academic peers in the state suggests that our non-disabled students would benefit from some of the practices and supports currently provided to our students on IEPs. Differentiated, tiered, small-group problem sets in the general classroom setting could potentially address the “calculation gap”.
I was inspired to work on this report after years of experience working at a public school. Public education is a sector that is filled with passion and positive intentions, but also divisive discussions. There exists a plethora of simplistic “one-trick fixes” that are marketed to students, teachers, and families. The use of data is the best tool we have to avoid pressing forward and investing our precious time and money in initiatives that do not improve student outcomes.
Over the years, I’ve noticed that teachers and leaders are given annual data reports, yet most lack the time, capacity, or resources to identify evidence-based, actionable measures to enact in the classroom or at the organizational level. When presented with all of the questions from an assessment individually and the performance of all of one’s students on paper, it is difficult to identify trends. Anecdotally, I have noticed every year that the majority of teachers gravitate to the scores and performance of individual students they previously taught, ascribing mistakes or successes to specific experiences with an individual or to one word in a question prompt. While relationship building and teaching to the child are hallmarks of student-teacher relationships, such a narrow lens will not allow a teacher to identify classroom-level or curriculum-level changes that could impact all students and future students. In one’s compassionate focus on individuals, a great opportunity to promote the learning of all students is lost.
With the use of R and the MCAS reports, I decided to focus on ways to identify trends at the classroom or curricular level. I found it challenging to limit the scope of my work for this project. I also struggled with discerning when to use sums vs. averages or medians. To improve a student’s performance on a test, we are concerned with total points lost and the relative weight of a content category; to identify curricular weaknesses, we are also interested in performance relative to the state by content area.
I only completed the analysis of the Introductory Physics exam for high school students. I have ELA, Math, and Science results for grades 5-8 as well as grade 10. I am still working on building a general function library to generate similar graphics and tables for other content areas and grade levels, and I would like to complete a similar report for each grade level and subject area assessment for teachers to use.
Given access to historical data, I think it would be beneficial to examine these trends over time to discern the performance gaps attributable to changes in the population of students (a factor which we cannot control or change) vs. those attributable to curriculum and teaching (an area where we can influence and effect change).
I also have access to reports that include the teacher a student had and the grades they earned from that teacher in the year they were assessed on the MCAS. I would like to examine the relationship between a student’s performance as measured by their teachers and their performance level as measured by the state. Are there patterns in the groups of students with the largest discrepancy between these two metrics? This would be important data to support teaching and learning at our school.
On a broader scale, I think that I need to develop a stronger sense of which summary statistics are most meaningful for a given variable to identify potential trends or insights, and subsequently which visualizations best convey those insights to a reader. I would also like to develop a tool-kit of best practices for “checking against my own biases”: what set of checks can I perform to best control for my potential mistakes as a human being with a limited perspective?
Chang, W. (2022). R Graphics Cookbook, 2nd Edition. O’Reilly Media.
Grolemund, G., & Wickham, H. (2016). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media.
High School Introductory Physics Item Report [Data]. https://profiles.doe.mass.edu/mcas/mcasitems2.aspx?grade=HS&subjectcode=PHY&linkid=23&orgcode=04830000&fycode=2022&orgtypecode=5&
H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2009.
Papay, J. P., Mantil, A., McDonough, A., Donahue, K., An, L., & Murnane, R. J. (2020). Lifting all boats? Accomplishments and Challenges from 20 Years of Education Reform in Massachusetts. Retrieved December 2, 2022, from https://annenberg.brown.edu/sites/default/files/LiftingAllBoats_FINAL.pdf
R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.r-project.org.
RStudio Team. (2019). RStudio: Integrated Development for R. RStudio, Inc., Boston, MA. https://www.rstudio.com.
Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686 https://doi.org/10.21105/joss.01686.
For more information about the MCAS, see the Department of Elementary and Secondary Education’s (DESE) page.
variable | Measurement Level | Values
---|---|---
gender | Nominal | The reported gender identity of the student. Female: F, Male: M, Non-binary: N
item Description | Nominal | Details of the assessment question
item Possible Points | Discrete | The number of points available for a given sitem
Reporting Category | Nominal | Content area of the sitem: Motion and Forces, Waves, or Energy
RT Percent Points | Continuous | Percent of points earned by Rising Tide students for a given sitem
RT-State Diff | Discrete | Difference between the percent of points earned by Rising Tide students and by students in the State for a given sitem
sitem | Nominal | The question number on the MCAS exam
sitem_score | Discrete | The number of points a student earned on a given sitem
sperflev | Ordinal | The student’s performance level: Exceeds Expectations, Meets Expectations, Partially Meets Expectations, or Does Not Meet Expectations
sscaleds | Discrete | The student’s scaled score by subject area (e: English, m: Math, s: Science)
ssgp | Continuous | The student’s growth percentile by subject area (e: English, m: Math, s: Science)
State Percent Points | Continuous | Percent of points earned by Massachusetts students for a given sitem
[Appendix: MCAS_2022 data frame summary table (columns: Variable, Stats / Values, Freqs (% of Valid), Graph, Missing). The full table does not reproduce here. Among the details that survive: adminyear and testschoolcode each take a single value; natureofdis and levelofneed are missing for 380 students (76.8%); most ELA and Math item variables (eitem and mitem columns) are missing for roughly 14-15% of students; most Science item variables (sitem1-sitem42) are missing for roughly 48% of students; esgp and msgp are missing for about 22% of students.]
sitem43 [numeric] |
|
|
487 (98.4%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sitem44 [numeric] |
|
|
488 (98.6%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sitem45 [numeric] |
|
|
488 (98.6%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
srawsc [numeric] |
|
43 distinct values | 239 (48.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
smcpts [numeric] |
|
26 distinct values | 239 (48.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sorpts [numeric] |
|
33 distinct values | 239 (48.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sperpospts [numeric] |
|
59 distinct values | 239 (48.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sscaleds [numeric] |
|
91 distinct values | 185 (37.4%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sperflev [ordered, factor] |
|
|
183 (37.0%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sperf2 [ordered, factor] |
|
|
183 (37.0%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
snumin [numeric] | 1 distinct value |
|
254 (51.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sassess [numeric] |
|
|
252 (50.9%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sattempt [character] |
|
|
183 (37.0%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ela_cd [numeric] |
|
|
363 (73.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
math_cd [numeric] |
|
|
363 (73.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sci_cd [numeric] |
|
|
363 (73.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
accom_e [numeric] | 1 distinct value |
|
419 (84.6%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
accom_m [numeric] | 1 distinct value |
|
417 (84.2%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
accom_s [numeric] | 1 distinct value |
|
448 (90.5%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
accom_readaloud [character] |
|
|
492 (99.4%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
accom_scribe [character] | 1. H |
|
493 (99.6%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
accom_calculator [numeric] | 1 distinct value |
|
493 (99.6%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
grade2018 [numeric] |
|
|
224 (45.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
grade2019 [numeric] |
|
|
134 (27.1%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
grade2021 [numeric] |
|
|
94 (19.0%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
escaleds2018 [numeric] |
|
61 distinct values | 229 (46.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
escaleds2019 [numeric] |
|
71 distinct values | 138 (27.9%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
escaleds2021 [numeric] |
|
83 distinct values | 96 (19.4%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
mscaleds2018 [numeric] |
|
71 distinct values | 229 (46.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
mscaleds2019 [numeric] |
|
77 distinct values | 138 (27.9%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
mscaleds2021 [numeric] |
|
83 distinct values | 95 (19.2%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
esgp2018 [numeric] |
|
81 distinct values | 316 (63.8%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
esgp2019 [numeric] |
|
91 distinct values | 231 (46.7%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
esgp2021 [numeric] |
|
88 distinct values | 201 (40.6%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
msgp2018 [numeric] |
|
85 distinct values | 316 (63.8%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
msgp2019 [numeric] |
|
92 distinct values | 231 (46.7%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
msgp2021 [numeric] |
|
82 distinct values | 200 (40.4%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
summarize [numeric] |
|
|
0 (0.0%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
amend [character] | 1. M |
|
494 (99.8%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
datachanged [numeric] | 1 distinct value |
|
494 (99.8%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
eScaleForm [numeric] | 1 distinct value |
|
69 (13.9%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
mScaleForm [numeric] | 1 distinct value |
|
69 (13.9%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sScaleForm [numeric] | 1 distinct value |
|
307 (62.0%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
eFormType [character] | 1. C |
|
69 (13.9%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
mFormType [character] | 1. C |
|
69 (13.9%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sFormType [character] |
|
|
183 (37.0%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
days_in_person [numeric] |
|
53 distinct values | 0 (0.0%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
member [numeric] |
|
22 distinct values | 0 (0.0%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ssubject_prior [numeric] |
|
|
435 (87.9%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sscaleds_prior [numeric] |
|
24 distinct values | 435 (87.9%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
escaleds.legacy.equivalent [numeric] |
|
14 distinct values | 433 (87.5%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
mscaleds.legacy.equivalent [numeric] |
|
24 distinct values | 432 (87.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sscaleds.legacy.equivalent [numeric] |
|
26 distinct values | 425 (85.9%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sscaleds.highest.on.legacy.scale [numeric] |
|
30 distinct values | 363 (73.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
scpi [numeric] |
|
|
432 (87.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sscaleds.highest.on.nextGen.scale [numeric] |
|
24 distinct values | 432 (87.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
sperf2.highest.on.nextGen.scale [character] |
|
|
432 (87.3%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
nature0fdis [character] |
|
|
380 (76.8%) | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
year [numeric] | 1 distinct value |
|
0 (0.0%) |
Generated by summarytools 1.0.1 (R version 4.2.1)
---
title: "Final Project"
author: "Theresa Szczepanski"
description: "MCAS G9 Science Analysis"
date: "12/12/2022"
format:
html:
df-print: paged
toc: true
code-fold: true
code-copy: true
code-tools: true
categories:
- Theresa_Szczepanski
- final_project
- MCAS_2022
- SG9_Item
always_allow_html: true
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
library(ggplot2)
library(lubridate)
library(readxl)
library(hrbrthemes)
library(viridis)
library(ggpubr)
library(purrr)
library(plotly)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Introduction
Massachusetts Comprehensive Assessment System (MCAS) tests were introduced as part
of the Massachusetts Education Reform Act in 1993 with the goal of providing all
students with the skills and knowledge to thrive in a "complex and changing society" (Papay et. al, 2020 pp, 1). The MCAS tests are a significant tool for educational equity. Scores on the Grade 10 Math MCAS test "predict longer-term educational attainments and labor market success, above and beyond typical markers of student advantage. For example, among
demographically similar students who attended the same high school and have the same level of ultimate educational attainment, those with higher MCAS mathematics scores go on to have much higher average earnings than those with lower scores." (Papay et. al, 2020 pp 7-10)
In this report, I will analyze the Spring 2022 MCAS Results for students completing the High School Introductory Physics MCAS at [Rising Tide Charter Public School](https://risingtide.org/).
The `MCAS_2022` data frame contains performance results from 495 students from
Rising Tide on the Spring 2022
[Massachusetts Comprehensive Assessment System (MCAS)](https://www.doe.mass.edu/mcas/default.html)
tests.
For each student, there are values reported for 256 different variables which
consist of information from four broad categories
- *Demographic characteristics* of
the students themselves (e.g., race, gender, date of birth, town, grade level,
years in school, years in Massachusetts, and low income, title1, IEP, 504,
and EL status ).
- *Key assessment features* including subject, test format, and
accommodations provided
- *Performance metrics*: This includes a student's score on individual item strands,
e.g.,`sitem1`-`sitem42`.
See the `MCAS_2022` data frame summary and __codebook__ in the __appendix__ for further details.
The second data set, `SG9_Item`, is $42 \times 9$ and consists of
9 variables with information pertaining to the 42 questions on the 2022 [HS Introductory Physics Item Report](https://profiles.doe.mass.edu/mcas/mcasitems2.aspx?grade=HS&subjectcode=PHY&linkid=23&orgcode=04830000&fycode=2022&orgtypecode=5&). The variables can be broken down into 2 categories:
Details about the content of a given test item:
This includes the content `Reporting Category` (MF (motion and forces)
WA (waves), and EN (energy)), the `Standard` from the [2016 STE Massachusetts Curriculum Framework](https://www.doe.mass.edu/frameworks/scitech/2016-04.pdf), the `Item Description` providing the details of what specifically was asked of students, and the points
available for a given question, `item Possible Points`.
Summary Performance Metrics:
- For each item, the state reports the percentage of points earned by students at
Rising Tide, `RT Percent Points`, the percentage of available points earned by students
in the state, `State Percent Points`, and the difference between the percentage of points earned by Rising Tide students and the percentage of points earned by students in the state, `RT-State Diff`.
- Lastly, `SG9_CU306Dis` and `SG9_CU306NonDis` are $3 \times 5$ dataframes consisting of summary performance data by `Reporting Category` for students with disabilities and without disabilities; most importantly including `RT Percent Points` and `State Percent Points`by disability status.
When considering our student performance data, we hope to address the following broad questions:
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
<div class = "blue">
- What adjustments (if any) should be made at the Tier 1 level, i.e., curricular adjustments
for all students in the General Education setting?
- What would be the most beneficial areas of focus for a targeted intervention course for
students struggling to meet or exceed performance expectations?
- Are there notable differences in student performance for students with and without disabilities?
</div>
## Function Library
To read in, tidy, and join our data frames for each content area, we will use functions. In this library, I am also drafting some functions that I would use to scale up this project; there is still work to be done here.
:::panel-tabset
### Item analysis Read in Function
```{r}
#Item analysis Read in Function: Input: sheet_name, subject, grade; return: student item report for a given grade level and subject.
#subject must be: "math", "ela", or "science"
read_item<-function(sheet_name, subject, grade){
subject_item<-case_when(
subject == "science"~"sitem",
subject == "math"~"mitem",
subject == "ela"~"eitem"
)
if(subject == "science"){
read_excel("_data/2022MCASDepartmentalAnalysis.xlsx", sheet = sheet_name,
skip = 1, col_names= c(subject_item, "Type", "Reporting Category", "Standard", "item Desc", "delete", "item Possible Points","RT Percent Points", "State Percent Points", "RT-State Diff")) %>%
select(!contains("delete"))%>%
filter(!str_detect(sitem,"Legend|legend"))%>%
mutate(sitem= as.character(sitem))%>%
separate(c(1), c("sitem", "delete"))%>%
select(!contains("delete"))%>%
mutate(sitem =
str_c(subject_item, sitem))
}
else if(subject == "math" && grade < 10){
read_excel("_data/2022MCASDepartmentalAnalysis.xlsx", sheet = sheet_name,
skip = 1, col_names= c(subject_item, "Type", "Reporting Category", "Standard", "item Desc", "delete", "item Possible Points","delete","RT Percent Points", "State Percent Points", "RT-State Diff"))%>%
select(!contains("delete"))%>%
filter(!str_detect(mitem,"Legend|legend"))%>%
mutate(mitem = as.character(mitem))%>%
separate(c(1), c("mitem", "delete"))%>%
select(!contains("delete"))%>%
mutate(mitem =
str_c(subject_item, mitem))
}
else if(subject == "math" && grade == 10){
read_excel("_data/2022MCASDepartmentalAnalysis.xlsx", sheet = sheet_name,
skip = 1, col_names= c(subject_item, "Type", "Reporting Category", "Standard", "item Desc", "delete", "item Possible Points","RT Percent Points", "State Percent Points", "RT-State Diff"))%>%
select(!contains("delete"))%>%
filter(!str_detect(mitem,"Legend|legend"))%>%
mutate(mitem = as.character(mitem))%>%
separate(c(1), c("mitem", "delete"))%>%
select(!contains("delete"))%>%
mutate(mitem =
str_c(subject_item, mitem))
}
}
```
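A hedged usage sketch (the calls are commented out, following the convention used for the test calls elsewhere in this library); the sheet names follow the pattern used later in this report, and the `grade` argument is only needed for the math branches:
```{r}
# Usage sketch (not run): read the Grade 9 physics item report and a
# middle school math item report from the departmental analysis workbook.
# SG9_Item_test <- read_item("SG9Physics", "science")
# MG8_Item_test <- read_item("MG8", "math", 8)
# head(SG9_Item_test)
```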
### Function to Read in MCAS Preliminary Results
```{r}
## MCAS Preliminary Results Read In
## Input file_path where the results csv file is stored, and the "year" the exam was administered
read_MCAS_Prelim<-function(file_path, year){read_csv(file_path,
skip=1)%>%
select(-c("sprp_dis", "sprp_sch", "sprp_dis_name", "sprp_sch_name", "sprp_orgtype",
"schtype", "testschoolname", "yrsindis", "conenr_dis"))%>%
#Recode all nominal variables as characters
mutate(testschoolcode = as.character(testschoolcode))%>%
#Include this line when using the non-private dataframe
# mutate(sasid = as.character(sasid))%>%
mutate(highneeds = as.character(highneeds))%>%
mutate(lowincome = as.character(lowincome))%>%
mutate(title1 = as.character(title1))%>%
mutate(ever_EL = as.character(ever_EL))%>%
mutate(EL = as.character(EL))%>%
mutate(EL_FormerEL = as.character(EL_FormerEL))%>%
mutate(FormerEL = as.character(FormerEL))%>%
mutate(ELfirstyear = as.character(ELfirstyear))%>%
mutate(IEP = as.character(IEP))%>%
mutate(plan504 = as.character(plan504))%>%
mutate(firstlanguage = as.character(firstlanguage))%>%
mutate(nature0fdis = as.character(natureofdis))%>%
mutate(spedplacement = as.character(spedplacement))%>%
mutate(town = as.character(town))%>%
mutate(ssubject = as.character(ssubject))%>%
#Recode all ordinal variable as factors
mutate(grade = as.factor(grade))%>%
mutate(levelofneed = as.factor(levelofneed))%>%
mutate(eperf2 = recode_factor(eperf2,
"E" = "Exceeding",
"M" = "Meeting",
"PM" = "Partially Meeting",
"NM"= "Not Meeting",
.ordered = TRUE))%>%
mutate(eperflev = recode_factor(eperflev,
"E" = "E",
"M" = "M",
"PM" = "PM",
"NM"= "NM",
"DNT" = "DNT",
"ABS" = "ABS",
.ordered = TRUE))%>%
mutate(mperf2 = recode_factor(mperf2,
"E" = "Exceeding",
"M" = "Meeting",
"PM" = "Partially Meeting",
"NM"= "Not Meeting",
.ordered = TRUE))%>%
mutate(mperflev = recode_factor(mperflev,
"E" = "E",
"M" = "M",
"PM" = "PM",
"NM"= "NM",
"INV" = "INV",
"ABS" = "ABS",
.ordered = TRUE))%>%
# The science variables contain a mixture of legacy performance levels and
# next generation performance levels which needs to be addressed in the ordering
# of these factors.
mutate(sperf2 = recode_factor(sperflev,
"E" = "Exceeding",
"M" = "Meeting",
"PM" = "Partially Meeting",
"NM"= "Not Meeting",
.ordered = TRUE))%>%
mutate(sperflev = recode_factor(sperf2,
"E" = "E",
"M" = "M",
"PM" = "PM",
"NM"= "NM",
"INV" = "INV",
"ABS" = "ABS",
.ordered = TRUE))%>%
#recode DOB using lubridate
mutate(dob = mdy(dob,
quiet = FALSE,
tz = NULL,
locale = Sys.getlocale("LC_TIME"),
truncated = 0
))%>%
mutate(IEP = case_when(
IEP == "1" ~ "Disabled",
IEP == "0" ~ "NonDisabled"
))%>%
mutate(year = year)
}
```
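As a possible refactor (a sketch only, not run here so the report's outputs are unchanged), the long chain of `as.character()` recodes above could be collapsed with `dplyr::across()`; the column names are taken directly from the mutate chain above:
```{r}
# Sketch (not run): collapse the nominal-variable recodes with across().
# read_csv(file_path, skip = 1) %>%
#   mutate(across(c(testschoolcode, highneeds, lowincome, title1, ever_EL, EL,
#                   EL_FormerEL, FormerEL, ELfirstyear, IEP, plan504,
#                   firstlanguage, natureofdis, spedplacement, town, ssubject),
#                 as.character))
```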
### Functions for Item Report/Exam Structure
```{r}
##Function for number of items table and graph
##ToDo Should a Function Produce Table and Graph?
##ToDo, Adjust the caption for test and year?
##ToDo, the Data Files need to be Updated to Include ELA reports
Subject_Cat_Total<-function(subject, subjectItemDF){
if(subject == "science"){subjectItemDF%>%
select(`sitem`, `item Possible Points`, `Reporting Category`)%>%
group_by(`Reporting Category`)%>%
summarise(available_points = sum(`item Possible Points`, na.rm=TRUE))%>%
mutate(percent_available_points = available_points/(sum(available_points, na.rm = TRUE)))%>%
ggplot(aes(x='',fill = `Reporting Category`, y = `available_points`)) +
geom_bar(position="fill", stat = "identity") + coord_flip()+
labs(subtitle ="All Students" ,
y = "% Points Available",
x= "Reporting Category",
title = "Percentage of Exam Points Available by Reporting Category",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))
}
else if (subject == "math"){subjectItemDF%>%
select(`mitem`, `item Possible Points`, `Reporting Category`)%>%
group_by(`Reporting Category`)%>%
summarise(available_points = sum(`item Possible Points`, na.rm=TRUE))%>%
mutate(percent_available_points = available_points/(sum(available_points, na.rm = TRUE)))%>%
ggplot(aes(x='',fill = `Reporting Category`, y = `available_points`)) +
geom_bar(position="fill", stat = "identity") + coord_flip()+
labs(subtitle ="All Students" ,
y = "% Points Available",
x= "Reporting Category",
title = "Percentage of Exam Points Available by Reporting Category",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))
} else if (subject == "ELA"){subjectItemDF%>%
select(`eitem`, `item Possible Points`, `Reporting Category`)%>%
group_by(`Reporting Category`)%>%
summarise(available_points = sum(`item Possible Points`, na.rm=TRUE))%>%
mutate(percent_available_points = available_points/(sum(available_points, na.rm = TRUE)))%>%
ggplot(aes(x='',fill = `Reporting Category`, y = `available_points`)) +
geom_bar(position="fill", stat = "identity") + coord_flip()+
labs(subtitle ="All Students" ,
y = "% Points Available",
x= "Reporting Category",
title = "Percentage of Exam Points Available by Reporting Category",
caption = "2022 ELA MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))
}
}
# testDF<-read_item("SG9Physics", "science")
# #view(testDF)
# Subject_Cat_Total("science", testDF)
```
### Function to Join Student Performance to Item Report
```{r}
Student_Item<-function(subject, gradeLevel, subjectItemDF, studentPerfDF){
if(subject == "science"){
select( studentPerfDF, contains("sitem"), gender, grade, yrsinsch,
race, IEP, `plan504`, sattempt, sperflev, sperf2, sscaleds)%>%
filter((grade == gradeLevel) & sattempt != "N")%>%
pivot_longer(contains("sitem"), names_to = "sitem", values_to = "sitem_score")%>%
left_join(subjectItemDF, "sitem")
} else if(subject == "math"){
# chained with else if so the matched branch's data frame is returned
# (a second bare if would make the function return NULL for science and math)
select( studentPerfDF, contains("mitem"), gender, grade, yrsinsch,
race, IEP, `plan504`, mattempt, mperflev, mperf2, mscaleds)%>%
filter((grade == gradeLevel) & mattempt != "N")%>%
pivot_longer(contains("mitem"), names_to = "mitem", values_to = "mitem_score")%>%
left_join(subjectItemDF, "mitem")
} else if(subject == "ela"){
####ToDo, update departmental analysis data to include ELA item reports
select( studentPerfDF, contains("eitem"), gender, grade, yrsinsch,
race, IEP, `plan504`, eattempt, eperflev, eperf2, escaleds)%>%
filter((grade == gradeLevel) & eattempt != "N")%>%
pivot_longer(contains("eitem"), names_to = "eitem", values_to = "eitem_score")%>%
left_join(subjectItemDF, "eitem")
}
}
# TestMCAS_2022<-read_MCAS_Prelim("_data/PrivateSpring2022_MCAS_full_preliminary_results_04830305.csv",2022)
# SG5_Item<-read_item("SG5", "science", 5)
# SG5_Student_Item<-Student_Item("science", 5, SG5_Item, TestMCAS_2022)
# SG5_Student_Item
# TestMCAS_2022<-read_MCAS_Prelim("_data/PrivateSpring2022_MCAS_full_preliminary_results_04830305.csv",2022)
# MG5_Item<-read_item("MG5", "math", 5)
# MG5_Student_Item<-Student_Item("math", 5, MG5_Item, TestMCAS_2022)
# MG5_Student_Item
```
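A hedged usage sketch for `Student_Item()` (commented out, like the test calls above): this should reproduce the student-by-item join that is built manually in the Tidy Data and Join tabs below, assuming `MCAS_2022` and `SG9_Item` have already been read in.
```{r}
# Usage sketch (not run): Grade 9 science student-by-item join via the helper.
# SG9_StudentItem_fn <- Student_Item("science", 9, SG9_Item, MCAS_2022)
# head(SG9_StudentItem_fn)
```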
### Function Student Performance by KeyWord
```{r}
keyWord<-function(subjectItemDF, subject, keyWord){
keyWord<-str_to_lower(keyWord)
keyWordFirst<-str_to_upper(str_sub(keyWord, 1L,1L))
keyWordEnd<-str_sub(keyWord, 1L+1, -1L)
keyWordCap<-str_c(keyWordFirst, keyWordEnd)
if (subject == "science"){
select(subjectItemDF,`sitem`, `item Desc`,`item Possible Points`, `Reporting Category`, `State Percent Points`, `RT-State Diff`)%>%
mutate( key_word = case_when(
!(str_detect(`item Desc`, keyWord)|str_detect(`item Desc`,keyWordCap)) ~ str_c("Non-", keyWordCap),
str_detect(`item Desc`, keyWord)|str_detect(`item Desc`,keyWordCap) ~ keyWordCap))
}
else if (subject == "math"){
select(subjectItemDF, `mitem`, `item Desc`,`item Possible Points`, `Reporting Category`, `State Percent Points`, `RT-State Diff`)%>%
mutate( key_word = case_when(
!(str_detect(`item Desc`, keyWord)|str_detect(`item Desc`,keyWordCap)) ~ str_c("Non-", keyWordCap),
str_detect(`item Desc`, keyWord)|str_detect(`item Desc`,keyWordCap) ~ keyWordCap))
}
}
#view(SG9_Calc)
# MG8_Item<-read_item("MG8", "math", 8)
# MG5_Item
# MG8_Describe<-keyWord(MG8_Item, "math", "determine")
# MG8_Describe
# SG8_Item<-read_item("SG8", "science", 8)
# SG8_Item
# SG8_Calc<-keyWord(SG8_Item, "science", "calculate")
# SG8_Calc
```
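A hedged usage sketch for `keyWord()` (commented out): flag the Grade 9 physics items whose descriptions mention "calculate", mirroring the manual `SG9_Calc` construction in the analysis section below.
```{r}
# Usage sketch (not run): assumes SG9_Item has been read in.
# SG9_Calc_fn <- keyWord(SG9_Item, "science", "calculate")
# SG9_Calc_fn
```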
:::
## Data Read-In Tidy
::: panel-tabset
### Read in Student Performance and Item Description Data
```{r}
#Filter, rename variables, and mutate values of variables on read-in
MCAS_2022<-read_MCAS_Prelim("_data/PrivateSpring2022_MCAS_full_preliminary_results_04830305.csv",2022)
#view(MCAS_2022)
head(MCAS_2022)
```
Introductory Physics, `SG9_Item` Read-In
```{r}
# G9 Science Item analysis
SG9_Item<-read_item("SG9Physics", "science")%>%
mutate(`Reporting Category` = case_when(
`Reporting Category` == "EN" ~ "Energy",
`Reporting Category` == "MF" ~ "Motion and Forces",
`Reporting Category` == "WA" ~ "Waves"
))
head(SG9_Item)
#view(SG9_Item)
```
Introductory Physics, `SG9_CU306Dis` Read-In
```{r}
SG9_CU306Dis<-read_excel("_data/MCAS CU306 2022/CU306MCAS2022PhysicsGrade9ByDisability.xlsm",
sheet = "Disabled Students",
col_names = c("Reporting Category", "Possible Points", "RT%Points",
"State%Points", "RT-State Diff"))%>%
filter(`Reporting Category` == "Energy"|`Reporting Category`== "Motion, Forces, and Interactions"| `Reporting Category` == "Waves" )
#view(SG9_CU306Dis)
SG9_CU306Dis
```
Introductory Physics, `SG9_CU306NonDis` Read-In
```{r}
SG9_CU306NonDis<-read_excel("_data/MCAS CU306 2022/CU306MCAS2022PhysicsGrade9ByDisability.xlsm",
sheet = "Non-Disabled Students",
col_names = c("Reporting Category", "Possible Points", "RT%Points",
"State%Points", "RT-State Diff"))%>%
filter(`Reporting Category` == "Energy"|`Reporting Category`== "Motion, Forces, and Interactions"| `Reporting Category` == "Waves" )
SG9_CU306NonDis
#view(SG9_CU306NonDis)
```
### Workflow Summary
After examining the summary of `MCAS_2022` (see appendix), I chose to
**Filter**:
- _SchoolID_: There are several variables that identify our school; I removed all
but one, `testschoolcode`.
- _StudentPrivacy_: I left the `sasid` variable which is a student identifier number,
but eliminated all values corresponding to students' names.
- `dis`: We are a charter school within our own unique district, therefore any
"district level" data is identical to our "school level" data.
__Rename__
I currently have not renamed variables, but there are some trends to note:
- an `e` before most `ELA` MCAS student item performance metric variables
- an `m` before most `Math` MCAS student item performance metric variables
- an `s` before most `Science` MCAS student item performance metric variables
__Mutate__
I left as __doubles__
- variables that measured scores on specific MCAS items e.g., `mitem1`
- variables that measured student growth percentiles (`sgp`)
- variables that counted a student's years in the school system or state.
Recode to __char__
- variables that are __nominal__ but have numeric values, e.g., `town`
Refactor as __ord__
- variables that are __ordinal__, e.g., `mperflev`.
Recode to __date__
- `dob` using lubridate.
### Tidy Data
I am interested in analyzing the 9th Grade Science Performance. To do this, I will
select a subset of our `MCAS_2022` data frame which includes:
- 9th Grade students who took the Introductory Physics test
- Scores on the 42 Science Items
- Points available for each item (joined in later from `SG9_Item`)
- Performance level on the test `sperflev`.
- Demographic characteristics of the students.
```{r}
SG9_MCAS_2022 <- select(MCAS_2022, contains("sitem"), gender, grade, yrsinsch,
race, IEP, `plan504`, sattempt, sperflev, sperf2, sscaleds)%>%
filter((grade == 9) & sattempt != "N")
SG9_MCAS_2022<-select(SG9_MCAS_2022, !(contains("43")|contains("44")|contains("45")))
#view(SG9_MCAS_2022)
head(SG9_MCAS_2022)
```
When I compared this data frame to the state-reported analysis, I found a small discrepancy: my data frame has 69 entries, while the state reports results for only 68 students. I will have to investigate this further.
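One possible first check (a sketch, using the `SG9_MCAS_2022` data frame defined above): count the rows and look at missing scaled scores and the `sattempt` values, either of which might account for the extra record.
```{r}
# Sketch: starting points for investigating the 69 vs. 68 student discrepancy
SG9_MCAS_2022 %>%
  summarise(n_students = n(),
            n_missing_scaled = sum(is.na(sscaleds)))
SG9_MCAS_2022 %>%
  count(sattempt)
```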
Since I will join this data frame with the `SG9_Item`, using `sitem` as the key, I need to pivot this data set longer.
```{r}
SG9_MCAS_2022<- pivot_longer(SG9_MCAS_2022, contains("sitem"), names_to = "sitem", values_to = "sitem_score")
#view(SG9_MCAS_2022)
head(SG9_MCAS_2022)
```
As expected, we now have 42 × 69 = 2,898 rows.
### Join and Sanity Checks
Now we should be ready to join our data sets using `sitem` as the key. The pivoted student data frame has 12 columns and `SG9_Item` contributes 8 more, so we should end up with a 2,898 by 20 data frame. We will also check our raw data against the performance data reported by the state in the item report by calculating the `percent_earned` by Rising Tide students, comparing it to the reported `RT Percent Points`, and storing the difference in `earned_diff`.
```{r}
SG9_StudentItem <- SG9_MCAS_2022 %>%
left_join(SG9_Item, "sitem")
head(SG9_StudentItem)
SG9_StudentItem
SG9_StudentItem%>%
group_by(sitem)%>%
summarise(percent_earned = round(sum(sitem_score, na.rm=TRUE)/sum(`item Possible Points`, na.rm=TRUE),2) )%>%
left_join(SG9_Item, "sitem")%>%
mutate(earned_diff = percent_earned-`RT Percent Points`)
```
As expected, we now have a 2,898 × 20 data frame and the `earned_diff` values all round to 0.
:::
## G9 Science Performance Analysis
Now we can examine the content of the exam itself and our students' performance relative to the state.
::: panel-tabset
### Structure of the Exam
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
<div class = "blue">
What reporting categories were emphasized by the state?
</div>
We can see from our summary that 50% of the exam points (30 of the available 60) come from questions from the Motion and Forces `Reporting Category`, followed by 30% from Energy, and 20% from Waves.
```{r}
SG9_Cat_Total<-SG9_Item%>%
select(`sitem`, `item Possible Points`, `Reporting Category`)%>%
group_by(`Reporting Category`)%>%
summarise(available_points = sum(`item Possible Points`, na.rm=TRUE))%>%
mutate(percent_available_points = available_points/(sum(available_points, na.rm = TRUE)))
SG9_Cat_Total
```
```{r}
ggplot(SG9_Cat_Total, aes(x='',fill = `Reporting Category`, y = `available_points`)) +
geom_bar(position="fill", stat = "identity") + coord_flip()+
labs(subtitle ="All Students" ,
y = "% Points Available",
x= "Reporting Category",
title = "Percentage of Exam Points Available by Reporting Category",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))
```
### Performance by Content Strands
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
<div class = "blue">
Where did Rising Tide students lose most of their points?
</div>
The proportion of points lost by Rising Tide students corresponds to the proportion of points available for each `Reporting Category` of the exam. This suggests that our students are prepared consistently across the units in the `Reporting Categories`.
```{r}
SG9_Cat_Loss<-SG9_StudentItem%>%
select(`sitem`, `Reporting Category`, `item Possible Points`, `sitem_score`)%>%
group_by(`Reporting Category`)%>%
summarise(sum_points_lost = sum(`item Possible Points`-`sitem_score`, na.rm=TRUE),
available_points = sum(`item Possible Points`, na.rm=TRUE))%>%
mutate(percent_points_lost = round(sum_points_lost/sum(sum_points_lost,na.rm=TRUE),2))%>%
mutate(percent_available_points = available_points/(sum(available_points, na.rm = TRUE)))
SG9_Cat_Loss<-SG9_Cat_Loss%>%
select(`Reporting Category`, `percent_available_points`, `percent_points_lost`)
SG9_Cat_Loss
```
```{r}
SG9_Percent_Loss<-SG9_StudentItem%>%
select(`sitem`, `Reporting Category`, `item Possible Points`, `sitem_score`)%>%
mutate(`points_lost` = `item Possible Points` - `sitem_score`)%>%
#ggplot(df, aes(x='', fill=option)) + geom_bar(position = "fill")
ggplot( aes(x='',fill = `Reporting Category`, y = `points_lost`)) +
geom_bar(position="fill", stat = "identity") + coord_flip()+
labs(subtitle ="All Rising Tide Students" ,
y = "% Points Loints",
x= "Reporting Category",
title = "Percentage of Points Lost by Reporting Category",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))
SG9_Percent_Loss
```
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
<div class = "blue">
Did Rising Tide students' performance relative to the state vary by content reporting categories?
</div>
We can see from the chart below that, on average, our students earned 4 to 5 percentage points less of the available points than their peers in the state for items in each of the three `Reporting Categories`.
```{r}
SG9_Cat_RTState<-SG9_Item%>%
select(`sitem`, `item Possible Points`, `Reporting Category`, `State Percent Points`, `RT Percent Points`, `RT-State Diff`)%>%
group_by(`Reporting Category`)%>%
summarise(available_points = sum(`item Possible Points`, na.rm=TRUE),
RT_points = sum(`RT Percent Points`*`item Possible Points`, na.rm = TRUE),
RT_Percent_Points = 100*round(RT_points/available_points,2),
State_Percent_Points = 100*round(sum(`State Percent Points`*`item Possible Points`/available_points, na.rm = TRUE),2))%>%
mutate(`RT-State Diff` = round(RT_Percent_Points - State_Percent_Points, 2))%>%
ggplot( aes(fill = `Reporting Category`, y=`RT-State Diff`, x=`Reporting Category`)) +
geom_bar(position="dodge", stat="identity") +
labs(subtitle ="All Students" ,
y = "RT-State Diff",
x= "Reporting Category",
title = "Difference in RT vs State Percent Points Earned by Reporting Category",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))+
geom_text(aes(label = `RT-State Diff`), vjust = -1, colour = "black", position = position_dodge(.9))
SG9_Cat_RTState
```
Here we see the distribution of `RT-State Diff` (the difference between the percentage of points earned on a given item by Rising Tide students and the percentage earned on the same item by their peers in the state) by `sitem` and content `Reporting Category`. Generally, items in the Motion and Forces `Reporting Category` seem to display the most concerning variability in student performance relative to the state. It would be worth looking at the specific question strands with the physics teachers. (It would be helpful to add item labels to the dots using `ggplotly`; however, I did not find a way to have that render on the class blog. A possible approach is sketched after the plot below.)
```{r}
SG9_Cat_Box <-SG9_Item%>%
select(`sitem`, `Reporting Category`, `State Percent Points`, `RT-State Diff`)%>%
group_by(`Reporting Category`)%>%
ggplot( aes(x=`Reporting Category`, y=`RT-State Diff`, fill=`Reporting Category`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.1, alpha=0.9) +
theme_ipsum() +
theme(
legend.position="none",
plot.title = element_text(size=11)
) +
ggtitle("G9 Introductory Physics School State Difference by Item") +
xlab("")
SG9_Cat_Box
#ggplotly(SG9_Cat_Box)
```
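Following up on the `ggplotly` note above, one possible approach (a sketch only, left commented out because interactive htmlwidgets did not render on the class blog) is to map `sitem` to a `text` aesthetic and pass `tooltip = "text"` to `ggplotly()`:
```{r}
# Sketch (not run): interactive box plot with item labels in the tooltips.
# SG9_Cat_Box_Labeled <- SG9_Item %>%
#   ggplot(aes(x = `Reporting Category`, y = `RT-State Diff`,
#              fill = `Reporting Category`, text = sitem)) +
#   geom_boxplot() +
#   geom_jitter(color = "black", size = 0.1, alpha = 0.9) +
#   theme(legend.position = "none") +
#   ggtitle("G9 Introductory Physics School State Difference by Item")
# ggplotly(SG9_Cat_Box_Labeled, tooltip = "text")
```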
### Student Performance by Item Difficulty
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
<div class = "blue">
Can differences in Rising Tide student performance on an item and State performance
on an item be explained by the difficulty level of an item?
</div>
When we plot `RT-State Diff` against `State Percent Points` for each `sitem` on the MCAS, this does not generally seem to be the case. Although the regression line suggests that `RT-State Diff` is more likely to be negative on items where students in the state earned fewer points, the p-value is not significant.
```{r}
G9Sci_Diff_Dot<-SG9_Item%>%
select(`State Percent Points`, `RT-State Diff`, `Reporting Category`)%>%
ggplot( aes(x=`State Percent Points`, y=`RT-State Diff`)) +
geom_point(size = 1, color="#69b3a2")+
geom_smooth(method="lm",color="grey", size =.5 )+
labs(title = "RT-State Diff by Difficulty Level", y = "RT-State Diff",
x = "State Percent Points") +
stat_cor(method = "pearson")#+facet(vars(`Reporting Category`)) +#label.x = 450, label.y = 550)
G9Sci_Diff_Dot
```
### Student Performance Key Words
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
<div class = "blue">
How did students perform based on key words?
</div>
When scanning the `item Desc` entries in the `SG9_Item` data frame, there are several questions containing the word "Calculate" in their description.
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
<div class = "blue">
How much is calculation emphasized on this exam and how did Rising Tide students perform relative to their peers in the state on items containing "calculate" in their description?
</div>
```{r}
SG9_Calc<-SG9_Item%>%
select(`sitem`, `item Desc`,`item Possible Points`, `Reporting Category`, `State Percent Points`, `RT-State Diff`)%>%
mutate( key_word = case_when(
!str_detect(`item Desc`, "calculate|Calculate") ~ "Non-Calc",
str_detect(`item Desc`, "calculate|Calculate") ~ "Calc"))
#view(SG9_Calc)
SG9_Calc
```
Now we can see that in the Waves and Energy categories, half of the available points come from questions that ask students to "calculate" and half do not. In the Motion and Forces category, 40% of the points are associated with "calculate" questions.
```{r}
SG9_Calc%>%
group_by(`Reporting Category`, `key_word`)%>%
summarise(avg_RT_State_Diff = mean(`RT-State Diff`, na.rm=TRUE),
med_RT_State_Diff = median(`RT-State Diff`, na.rm =TRUE),
#sum_RT_State_Diff = sum(`RT-State Diff`, na.rm=TRUE),
sum_sitem_Possible_Points = sum(`item Possible Points`, na.rm = TRUE))
```
```{r}
SG9_Calc_PointsAvail<-SG9_Calc%>%
group_by(`Reporting Category`, `key_word`)%>%
summarise(avg_RT_State_Diff = mean(`RT-State Diff`, na.rm=TRUE),
med_RT_State_Diff = median(`RT-State Diff`, na.rm =TRUE),
sum_RT_State_Diff = sum(`RT-State Diff`, na.rm=TRUE),
sum_item_Possible_Points = sum(`item Possible Points`, na.rm = TRUE))%>%
ggplot(aes(fill=`key_word`, y=sum_item_Possible_Points, x=`Reporting Category`)) + geom_bar(position="dodge", stat="identity")+
labs(subtitle ="Calculate" ,
y = "Available Points",
x= "Reporting Category",
title = "Available points by Key Word",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.x=element_text(angle=60,hjust=1))+
geom_text(aes(label = `sum_item_Possible_Points`), vjust = 1.5, colour = "white", position = position_dodge(.9))
SG9_Calc_PointsAvail
```
When we compare the median `RT-State Diff` for items containing the word "calculate" in their description to items that do not, we can see that across all of the `Reporting Categories` Rising Tide students performed noticeably worse relative to their peers in the state on questions that asked them to "calculate".
```{r}
SG9_Calc_MedDiffBar<-SG9_Calc%>%
group_by(`Reporting Category`, `key_word`)%>%
summarise(mean_RT_State_Diff = round(mean(`RT-State Diff`, na.rm=TRUE),2),
med_RT_State_Diff = median(`RT-State Diff`, na.rm =TRUE),
sum_RT_State_Diff = sum(`RT-State Diff`, na.rm=TRUE))%>%
ggplot(aes(fill=`key_word`, y=med_RT_State_Diff, x=`Reporting Category`)) + geom_bar(position="dodge", stat="identity") + coord_flip()+
labs(subtitle ="Calculate" ,
y = "Median RT-State-Diff",
x= "Reporting Category",
title = "Median RT-State-Diff by Key Word",
caption = "2022 HS Introductory Physics MCAS")+
theme(axis.text.y=element_text(angle=40,hjust=.5))+
geom_text(aes(label = `med_RT_State_Diff`), hjust = 1, vjust = .75, colour = "black", position = position_dodge(.8))
SG9_Calc_MedDiffBar
```
Here we can see the distribution of `RT-State Diff` by `sitem` and `Reporting Category` and the disparity in `RT-State Diff` when we consider items asking students to "Calculate" vs. those that do not.
```{r}
SG9_Calc_Box <-SG9_Calc%>%
group_by(`key_word`, `Reporting Category`)%>%
ggplot( aes(x=`key_word`, y=`RT-State Diff`, fill=`Reporting Category`)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.1, alpha=0.9) +
theme_ipsum() +
theme(
#legend.position="none",
plot.title = element_text(size=11)
) + labs(subtitle ="Calculate" ,
y = "RT-State-Diff",
x= "Calculate vs. Non-Calculate",
title = "RT-State-Diff by Key Word",
caption = "2022 HS Introductory Physics MCAS")
# ggtitle("RT-State-Diff by Key Word") +
# xlab("")
SG9_Calc_Box
```
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
<div class = "blue">
Did RT students perform worse relative to their peers in the state on more "challenging" calculation items?
</div>
If we consider the difficulty of items containing the word "calculate", as reflected in the state-wide performance (`State Percent Points`) for a given item, the gap between Rising Tide students' performance and that of their peers in the state (`RT-State Diff`) does not seem to widen significantly with item difficulty.
```{r}
#view(SG9_Calc)
SG9_Calc_Dot<- SG9_Calc%>%
select(`State Percent Points`, `RT-State Diff`, `key_word`)%>%
filter(key_word == "Calc")%>%
ggplot( aes(x=`State Percent Points`, y=`RT-State Diff`)) +
geom_point(size = 1, color="#69b3a2")+
geom_smooth(method="lm",color="grey", size =.5 )+
labs(title = "RT State Diff vs. State Percent Points", y = "RT State Diff",
x = "State Percent Points")+
stat_cor(method = "pearson")
SG9_Calc_Dot
```
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
<div class = "blue">
Is the "calculation gap" consistent across performance levels?
</div>
Here we can see that the higher a student's performance level, the greater the percentage of their points that were lost on items asking them to "calculate". This suggests that, to raise student performance in the general classroom, students should spend a higher proportion of their time on calculation-based activities.
```{r}
# G9 Points Lost
G9Sci_StudentCalcPerflev<-SG9_StudentItem%>%
select(gender, sitem, sitem_score, `item Desc`, `item Possible Points`, `State Percent Points`, IEP, `RT-State Diff`, `Reporting Category`, `sperflev`)%>%
mutate( key_word = case_when(
!str_detect(`item Desc`, "calculate|Calculate") ~ "Non-Calc",
str_detect(`item Desc`, "calculate|Calculate") ~ "Calc"))%>%
group_by(`sperflev`, `key_word`)%>%
summarise(total_points_lost = sum(`item Possible Points`-`sitem_score`, na.rm = TRUE), # points lost = possible - earned
med_RT_State_Diff = median(`RT-State Diff`, na.rm=TRUE))
G9Sci_StudentCalcPerflev
#view(SG9_StudentItem)
G9Sci_StudentCalcPerflev%>%
ggplot(aes(fill=`key_word`, y=total_points_lost, x=`sperflev`)) + geom_bar(position="fill", stat="identity") +
labs(subtitle ="Calculate" ,
y = "Percentage Points Lost",
x= "Performance Level",
title = "Percentage of Points Lost by Key Word and Performance Level",
caption = "2022 HS Introductory Physics MCAS")
#G9Sci_StudentCalcPerflev
```
### Student Performance and Disability
<style>
div.blue { background-color:#e6f0ff; border-radius: 5px; padding: 20px;}
</style>
<div class = "blue">
Are there differences in the performance of non-disabled and disabled students relative to their academic peers in the state?
</div>
We can see from our `CU306` reports that our students with disabilities performed better relative to their peers in the state (`RT-State Diff`) across all `Reporting Categories`, while our non-disabled students performed worse relative to their peers in the state across all `Reporting Categories`. This suggests that more attention needs to be paid to the needs of the non-disabled students in the General Education setting.
```{r}
SG9_CU306Dis%>%
select(`RT-State Diff`, `Reporting Category`)%>%
mutate(`Disability Status` = "Disabled")
SG9_CU306NonDis%>%
select(`RT-State Diff`, `Reporting Category`)%>%
mutate(`Disability Status` = "Non-Disabled")
```
When we examine the points lost by reporting category and disability status, there
does not seem to be a significant difference in performance between disabled and non-disabled students across `Reporting Categories`.
```{r}
G9Sci_StudentCalcDis<-SG9_StudentItem%>%
select(gender, sitem, sitem_score, `item Desc`, `item Possible Points`, `State Percent Points`, IEP, `RT-State Diff`, `Reporting Category`, `sperflev`)%>%
mutate( key_word = case_when(
!str_detect(`item Desc`, "calculate|Calculate") ~ "Non-Calc",
str_detect(`item Desc`, "calculate|Calculate") ~ "Calc"))%>%
group_by(`Reporting Category`, `key_word`, `IEP`)%>%
summarise(total_points_lost = sum(`item Possible Points`-`sitem_score`, na.rm = TRUE))%>% # points lost = possible - earned
ggplot(aes(fill=`key_word`, y=total_points_lost, x=`Reporting Category`)) + geom_bar(position="dodge", stat="identity")+
facet_wrap(vars(IEP))+ coord_flip()+
labs(subtitle ="Calculate" ,
y = "Sum Points Lost",
x= "Reporting Category",
title = "Sum Points Lost by Key Word Non-Disabled vs. Disabled",
caption = "2022 HS Introductory Physics MCAS")+
geom_text(aes(label = `total_points_lost`), vjust = 1.5, colour = "black", position = position_dodge(.95))
#G9Sci_StudentCalcDis
```
```{r}
G9Sci_StudentCalcDis<-SG9_StudentItem%>%
select(gender, sitem, sitem_score, `item Desc`, `item Possible Points`, `State Percent Points`, IEP, `RT-State Diff`, `Reporting Category`, `sperflev`)%>%
mutate( key_word = case_when(
!str_detect(`item Desc`, "calculate|Calculate") ~ "Non-Calc",
str_detect(`item Desc`, "calculate|Calculate") ~ "Calc"))%>%
group_by(`Reporting Category`, `key_word`, `IEP`)%>%
summarise(sum_points_lost = sum(`item Possible Points`-`sitem_score`, na.rm = TRUE))%>% # points lost = possible - earned
ggplot(aes(fill=`key_word`, y=sum_points_lost, x=`Reporting Category`)) + geom_bar(position="fill", stat="identity")+
facet_wrap(vars(IEP))+ coord_flip()+
labs(subtitle ="Calculate" ,
y = "Percent Points Lost",
x= "Reporting Category",
title = "Percent Points Lost by Key Word and Disability Status",
caption = "2022 HS Introductory Physics MCAS")
G9Sci_StudentCalcDis
```
:::
## Conclusion
A student's performance on their 9th Grade Introductory Physics MCAS is strongly associated with their performance on their 8th Grade Math MCAS exam. This suggests that the use of prior Math MCAS and current STAR Math testing data can identify students in need of extra support.
```{r}
SG9_Math<-MCAS_2022%>%
select(sscaleds, mscaleds2021,sscaleds_prior, grade, sattempt)%>%
filter((grade == 9) & sattempt != "N")%>%
ggplot(aes(x=`mscaleds2021`, y =`sscaleds`))+
geom_point(size = 1, color="#69b3a2")+
geom_smooth(method="lm",color="grey", size =.5 )+
labs(title = "2022 HS Introductory Physics vs. 2021 Math MCAS", y = "Physics Scaled Score",
x = "Math Scaled Score") +
stat_cor(method = "pearson", label.x = 450, label.y = 550)
SG9_Math
```
Rising Tide students as a whole performed slightly worse relative to the state in all content reporting areas; however, students classified as disabled performed better relative to their peers in the state. The performance gap between Rising Tide students and students in the state on the HS Introductory Physics exam is accounted for by the performance of the non-disabled students in the general classroom setting.
All Rising Tide students, regardless of disability status, performed markedly worse relative to students in the state on items including the key word "calculate" in their `item Desc`. This suggests that we should dedicate more classroom instructional time to problem solving with calculation. Notably, the higher a student's performance level, the higher the percentage of points the student lost on calculation items. The largest area of growth for students across all performance categories is on calculation-based items; evidence-based math interventions include small-group, differentiated problem sets.
The discrepancy in performance, relative to their respective academic peers in the state, between Rising Tide students with and without disabilities suggests that our non-disabled students would benefit from some of the practices and supports currently provided to our students on IEPs. Differentiated, tiered, small-group problem sets in the general classroom setting could potentially address the "calculation gap".
## Reflection: Limitations/Areas for Improvement
I was inspired to work on this report after years of experience working at a public school. Public education is a sector filled with passion and positive intentions, but also with divisive discussions. There exists a plethora of simplistic "one-trick fixes" that are marketed to students, teachers, and families. The use of data is the best tool we have to avoid pressing forward and investing our precious time and money in initiatives that do not improve student outcomes.
Over the years, I've noticed that teachers and leaders are given annual data reports, yet most lack the time, capacity, or resources to identify evidence-based, actionable measures to enact in the classroom or at the organizational level. When presented with all of the questions from an assessment individually, and the performance of all of one's students on paper, it is difficult to identify trends. Anecdotally, I have noticed every year that the majority of teachers gravitate to the scores and performance of individual students they previously taught, ascribing mistakes or successes to specific experiences with an individual or to one word in a question prompt. While relationship building and teaching to the child are hallmarks of strong student-teacher relationships, such a narrow lens will not allow a teacher to identify classroom-level or curriculum-level changes that could impact all students and future students. In one's compassionate focus on individuals, a great opportunity to promote the learning of all students is lost.
With the use of R and the MCAS reports, I decided to focus on ways to identify trends at the classroom or curricular level. I found it challenging to limit the scope of my work for this project. I also struggled with discerning when to use sums versus averages or medians. To improve a student's performance on a test, we are concerned with the total points lost and the relative weight of a content category; to identify curricular weaknesses, we are also interested in performance relative to the state by content area.
I only completed the analysis of the Introductory Physics exam for high school students. I have `ELA`, `Math`, and `Science` results for grades 5-8 as well as grade 10. I am still working on building a general function library to generate similar graphics and tables for other content areas and grade levels, and I would like to complete a similar report for each grade level and subject area assessment for teachers to use. A sketch of how this might be scaled up with `purrr` follows.
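Here is a rough sketch (commented out; it relies only on sheet names already used in the test calls earlier in this report, and those tabs should be confirmed before running) of how the existing `read_item()` helper might be iterated with `purrr`:
```{r}
# Sketch (not run): iterate read_item() over several grade/subject item reports.
# The column names of grade_specs match read_item()'s argument names, so pmap()
# can call the helper once per row.
# grade_specs <- tribble(
#   ~sheet_name,  ~subject,  ~grade,
#   "SG5",        "science",  5,
#   "SG8",        "science",  8,
#   "MG5",        "math",     5,
#   "MG8",        "math",     8,
#   "SG9Physics", "science",  9
# )
# item_reports <- pmap(grade_specs, read_item)
```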
Given access to historical data, I think it would be beneficial to examine these trends over time to discern the performance gaps attributable to changes in the student population (a factor we cannot control) from those attributable to curriculum and teaching (an area where we can effect change).
I also have access to reports that include the teacher a student had and the grades they earned from that teacher in the year they were assessed on the MCAS. I would like to examine the relationship between a student's performance as measured by their teachers and their performance level as measured by the state. Are there patterns in the groups of students with the largest discrepancy between these two metrics? This would be important data to support the teaching and learning at our school.
On a broader scale, I think I need to develop a stronger sense of which summary statistics are the most meaningful for a given variable to identify potential trends or insights, and subsequently which visualizations best convey those insights to a reader. I would also like to develop a tool-kit of best practices for "checking against my own biases". What set of checks can I perform to best control for my potential mistakes as a human being with a limited perspective?
::: callout-note
I did not cite the source for the MCAS Preliminary Results because it is not a publicly available data set, as it contains students' personal information. I used the raw .csv file, retrievable from the DESE portal, titled "MCAS Full Preliminary Results".
:::
## References
Chang, W. (2022). *R Graphics Cookbook, 2nd Edition*. O'Reilly Media.
Grolemund, G., & Wickham, H. (2016). *R for Data Science: Import, Tidy, Transform, Visualize, and Model Data*. O'Reilly Media.
High School Introductory Physics Item Report \[Data\] [https://profiles.doe.mass.edu/mcas/mcasitems2.aspx?grade=HS&subjectcode=PHY&linkid=23&orgcode=04830000&fycode=2022&orgtypecode=5&](https://profiles.doe.mass.edu/mcas/mcasitems2.aspx?grade=HS&subjectcode=PHY&linkid=23&orgcode=04830000&fycode=2022&orgtypecode=5&)
Wickham, H. (2009). *ggplot2: Elegant Graphics for Data Analysis*. Springer-Verlag New York.
Papay, J. P., Mantil, A., McDonough, A., Donahue, K., An, L., & Murnane, R. J. (n.d.). ___Lifting all boats? Accomplishments and Challenges from 20 Years of Education Reform in Massachusetts___. Retrieved December 2, 2022, from [https://annenberg.brown.edu/sites/default/files/LiftingAllBoats_FINAL.pdf](https://annenberg.brown.edu/sites/default/files/LiftingAllBoats_FINAL.pdf)
R Core Team. (2020). *R: A language and environment for statistical computing*. R Foundation for Statistical Computing, Vienna, Austria.<https://www.r-project.org>.
RStudio Team. (2019). *RStudio: Integrated Development for R*. RStudio, Inc., Boston, MA. <https://www.rstudio.com>.
Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G,
Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K,
Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K,
Yutani H (2019). “Welcome to the tidyverse.” _Journal of Open Source Software_,
*4*(43), 1686. doi:10.21105/joss.01686 <https://doi.org/10.21105/joss.01686>.
## Appendix
::: panel-tabset
### Codebook MCAS_2022 Variables
For more information about the MCAS, see the Department of Elementary and Secondary Education's [(DESE)](https://www.doe.mass.edu/mcas/results.html) page.
| variable | Measurement Level| Values|
| ------------- |------------|-----------------------------|
| `gender` | Nominal | the reported gender identity of the student. Female: F, Male: M, Non-binary: N|
| `item Description` | Nominal | details of assessment question|
| `item Possible Points` | Discrete | The number of points available for a given `sitem`|
| `Reporting Category` | Nominal | content area of `sitem`|
| | | Motion and Forces |
| | | Waves |
| | | Energy |
|`RT Percent Points` | Continuous | Percent of points earned by Rising Tide Students for a given `sitem`|
| `RT-State Diff` | Continuous | Difference between the percent of points earned by Rising Tide students and by students in the state for a given `sitem`|
| `sitem` | Nominal | The question number on the MCAS exam|
| `sitem_score` | Discrete | The number of points a student earned on a given `sitem`|
| `sperflev` | Ordinal | The student's [performance level](https://www.doe.mass.edu/mcas/tdd/pld/) |
| | | Exceeds Expectations |
| | | Meets Expectations |
| | | Partially Meets Expectations|
| | | Does Not Meet Expectations |
| `sscaleds` | Discrete | The [student's scaled score](https://www.doe.mass.edu/mcas/parents/pgreport/ghs-english.pdf ) by subject area (e: English, m: Math, s: Science)|
| `ssgp` | Continuous | The [student's growth percentile](https://www.doe.mass.edu/mcas/growth/default.html) by subject area (e: English, m: Math, s: Science)|
|`State Percent Points` | Continuous | Percent of points earned by Massachusetts students for a given `sitem`|
### MCAS 2022 Data Summary
```{r}
# examine the summary to decide how to best set up our data frame
print(summarytools::dfSummary(MCAS_2022,
varnumbers = FALSE,
plain.ascii = FALSE,
style = "grid",
graph.magnif = 0.70,
valid.col = FALSE),
method = 'render',
table.classes = 'table-condensed')
```
:::