Reading in ACS 2019 disability population estimates.
Here, we’ll look at data about disabled populations in US counties. Specifically, we’re using Subject Table S1810 from the 2019 1-year estimates of the American Community Survey (ACS), an ongoing demographic survey run by the U.S. Census Bureau. The table includes county populations and disabled populations broken down across different demographics.[1]
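We’re going to answer a few questions about which counties have the largest and smallest disabled populations, both in absolute numbers and as a percentage of the county population.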
Table S1810 is incredibly large, so we’ll pull out the following columns:
Variable | Class | Description |
---|---|---|
County | char (text) | county name |
State | char (text) | state name |
cty_ni_pop | dbl (numerical) | total estimated 2019 county population of noninstitutionalized civilians |
cty_ni_dis_pop | dbl (numerical) | estimated 2019 county population of disabled, noninstitutionalized civilians |
cty_pct_disabled | dbl (numerical) | disabled population as a percentage of the total county population |
“Noninstitutionalized civilians” means people who aren’t in the armed forces and don’t live in institutions like prisons, hospitals, or nursing homes.[2] The two excluded groups usually rely on their respective institutions to meet their support and access needs, and they tend to include a higher proportion of disabled people. Surveys like the ACS are mostly used to plan community resources, so their disability estimates leave these groups out on the assumption that they won’t be interacting with the communities around them.[3]
I might have to find more thorough data if I plan to use demographics information in future projects.
Let’s read the data in, free it from an unnecessary row, and put it all in a tibble.
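The code for this step isn’t shown here, so this is only a minimal sketch of how it might look with readr. It assumes the download is laid out as a row of column codes, then a row of descriptive labels, then the data. To keep the snippet runnable on its own, it writes a tiny stand-in file first; `csv_path` and the toy rows are placeholders, not the real S1810 file.

```r
library(readr)

# Toy stand-in for the Census download (assumed layout: column codes,
# then a row of descriptive labels, then one row per county).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "NAME,S1810_C01_001E,S1810_C02_001E",
  "Geographic Area Name,Total population label,Disabled population label",
  '"Baldwin County, Alabama",220911,31901',
  '"Calhoun County, Alabama",111075,22269'
), csv_path)

# Read every column as text; the label row would force that anyway.
raw <- read_csv(csv_path, col_types = cols(.default = col_character()))

# Drop the label row, which readr has read in as the first data row.
data <- raw[-1, ]
```

Reading everything as text is also why the numeric columns need converting by hand afterwards.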
Let’s make sure it’s a tibble of about the expected size.
Done!
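The check itself isn’t shown, but one simple way to do it (a sketch, using a two-row placeholder tibble rather than the real data) is to ask whether the object is a tibble and then look at its dimensions.

```r
library(tibble)

# Placeholder for the tibble read from the Census file.
data <- tibble(
  NAME = c("Baldwin County, Alabama", "Calhoun County, Alabama"),
  S1810_C01_001E = c("220911", "111075")
)

is_tibble(data)  # is `data` a tibble?
dim(data)        # number of rows, then number of columns
```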
Right now, our tibble has a lot of very cool data that we won’t be using, and the column names aren’t human-friendly.
Let’s extract the right columns and give them (marginally) friendlier names. We’ll use dplyr::select for that.
data <- select(data,
NAME,
cty_ni_pop = S1810_C01_001E,
cty_ni_dis_pop = S1810_C02_001E,
cty_pct_disabled = S1810_C03_001E)
For kicks, let’s separate the “NAME” column into “County” and “State.”
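The splitting code isn’t shown, so here’s one way it could be done, assuming every NAME value has the form “county, state” with a single comma. This sketch uses tidyr::separate on a small placeholder tibble.

```r
library(tibble)
library(tidyr)

# Placeholder rows in the assumed "county, state" format.
data <- tibble(
  NAME = c("Baldwin County, Alabama", "San Juan Municipio, Puerto Rico"),
  cty_ni_pop = c("220911", "313915")
)

# Split NAME on the comma-plus-space between the two parts.
data <- separate(data, NAME, into = c("County", "State"), sep = ", ")
```

If a name ever contained a second comma, separate would warn about the extra piece, which is a handy sanity check.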
Let’s turn the appropriate columns into numerical values. This is kind of sloppy, but it’s just three columns in a script we probably won’t use again. Famous last words, I know.
data$cty_ni_pop <- as.numeric(data$cty_ni_pop)
data$cty_ni_dis_pop <- as.numeric(data$cty_ni_dis_pop)
data$cty_pct_disabled <- as.numeric(data$cty_pct_disabled)
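If this script did get reused, a slightly tidier option would be to convert all three columns in one step. This is just a sketch of that alternative with dplyr::mutate and across, run on placeholder rows. It does the same conversion as the three lines above.

```r
library(dplyr)
library(tibble)

# Placeholder rows with the numbers still stored as text.
data <- tibble(
  County = c("Baldwin County", "Calhoun County"),
  cty_ni_pop = c("220911", "111075"),
  cty_ni_dis_pop = c("31901", "22269"),
  cty_pct_disabled = c("14.4", "20.0")
)

# Apply as.numeric to each of the three named columns.
data <- mutate(
  data,
  across(c(cty_ni_pop, cty_ni_dis_pop, cty_pct_disabled), as.numeric)
)
```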
Here’s what the data looks like now:
County | State | cty_ni_pop | cty_ni_dis_pop | cty_pct_disabled |
---|---|---|---|---|
Baldwin County | Alabama | 220911 | 31901 | 14.4 |
Calhoun County | Alabama | 111075 | 22269 | 20.0 |
Cullman County | Alabama | 82841 | 14480 | 17.5 |
DeKalb County | Alabama | 70392 | 7583 | 10.8 |
Elmore County | Alabama | 75409 | 9707 | 12.9 |
Etowah County | Alabama | 101470 | 15944 | 15.7 |

County | State | cty_ni_pop | cty_ni_dis_pop | cty_pct_disabled |
---|---|---|---|---|
Mayagüez Municipio | Puerto Rico | 71018 | 15705 | 22.1 |
Ponce Municipio | Puerto Rico | 129198 | 28785 | 22.3 |
San Juan Municipio | Puerto Rico | 313915 | 60014 | 19.1 |
Toa Alta Municipio | Puerto Rico | 71897 | 6140 | 8.5 |
Toa Baja Municipio | Puerto Rico | 73735 | 16284 | 22.1 |
Trujillo Alto Municipio | Puerto Rico | 63312 | 15870 | 25.1 |
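The two tables above look like the first and last rows of the tibble. The code that printed them isn’t shown, but a preview like that could be produced with head, tail, and knitr::kable. Here’s a sketch on three placeholder rows.

```r
library(tibble)
library(knitr)

# Three placeholder rows taken from the tables above.
data <- tibble(
  County = c("Baldwin County", "Calhoun County", "San Juan Municipio"),
  State = c("Alabama", "Alabama", "Puerto Rico"),
  cty_pct_disabled = c(14.4, 20.0, 19.1)
)

first_rows <- head(data, 2)  # first two rows
last_rows <- tail(data, 1)   # last row

kable(first_rows)
kable(last_rows)
```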
First, let’s find the mean of the county-level disabled percentages.
mean_pct_disabled <- mean(data$cty_pct_disabled)
mean_pct_disabled
[1] 13.70964
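The mean above averages the county percentages directly, so a small county counts as much as a large one. As a rough illustration of how that can differ from a pooled rate (total disabled population over total population), here’s a sketch using three rows copied from the tables in this post. The three-row sample is only for illustration and says nothing about the full data set.

```r
library(tibble)

# Three rows copied from the tables in this post.
sample_rows <- tibble(
  County = c("Los Angeles County", "Baldwin County", "Loudoun County"),
  cty_ni_pop = c(9964081, 220911, 411654),
  cty_ni_dis_pop = c(984931, 31901, 23713),
  cty_pct_disabled = c(9.9, 14.4, 5.8)
)

# Unweighted: each county's percentage counts equally.
unweighted <- mean(sample_rows$cty_pct_disabled)

# Pooled: total disabled population over total population, as a percentage.
pooled <- 100 * sum(sample_rows$cty_ni_dis_pop) / sum(sample_rows$cty_ni_pop)
```

In this sample the pooled figure sits close to the Los Angeles County percentage, because that county dominates the totals.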
We’ll use dplyr::filter to answer the questions from the intro.
highest_disabled_pop <- filter(data, cty_ni_dis_pop == max(cty_ni_dis_pop))
kable(highest_disabled_pop)
County | State | cty_ni_pop | cty_ni_dis_pop | cty_pct_disabled |
---|---|---|---|---|
Los Angeles County | California | 9964081 | 984931 | 9.9 |
lowest_disabled_pop <- filter(data, cty_ni_dis_pop == min(cty_ni_dis_pop))
kable(lowest_disabled_pop)
County | State | cty_ni_pop | cty_ni_dis_pop | cty_pct_disabled |
---|---|---|---|---|
Walker County | Texas | 61093 | 4947 | 8.1 |
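The filter calls above keep every row that ties for the maximum or minimum. As an alternative sketch (not what this post used), dplyr’s slice_max and slice_min do the same job and also keep ties by default. The three placeholder rows below are copied from the tables in this post.

```r
library(dplyr)
library(tibble)

# Placeholder rows copied from the tables in this post.
data <- tibble(
  County = c("Los Angeles County", "Baldwin County", "Walker County"),
  State = c("California", "Alabama", "Texas"),
  cty_ni_dis_pop = c(984931, 31901, 4947)
)

# Rows with the largest and smallest disabled populations.
highest_disabled_pop <- slice_max(data, cty_ni_dis_pop, n = 1)
lowest_disabled_pop <- slice_min(data, cty_ni_dis_pop, n = 1)
```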

County | State | cty_ni_pop | cty_ni_dis_pop | cty_pct_disabled |
---|---|---|---|---|
Talladega County | Alabama | 76722 | 20102 | 26.2 |
Walker County | Alabama | 62896 | 17381 | 27.6 |
Charlotte County | Florida | 186002 | 45368 | 24.4 |
Walker County | Georgia | 68199 | 16593 | 24.3 |
Raleigh County | West Virginia | 70907 | 17130 | 24.2 |
Bayamón Municipio | Puerto Rico | 164521 | 43015 | 26.1 |
Caguas Municipio | Puerto Rico | 124149 | 32099 | 25.9 |
Guaynabo Municipio | Puerto Rico | 83119 | 19948 | 24.0 |
Trujillo Alto Municipio | Puerto Rico | 63312 | 15870 | 25.1 |

County | State | cty_ni_pop | cty_ni_dis_pop | cty_pct_disabled |
---|---|---|---|---|
Gwinnett County | Georgia | 930955 | 63740 | 6.8 |
Carver County | Minnesota | 104708 | 6822 | 6.5 |
Fort Bend County | Texas | 806384 | 53265 | 6.6 |
Arlington County | Virginia | 231652 | 14506 | 6.3 |
Loudoun County | Virginia | 411654 | 23713 | 5.8 |
Alexandria city | Virginia | 155298 | 10181 | 6.6 |
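The last two tables above list counties at the high and low ends of cty_pct_disabled, but the code and cutoffs that produced them aren’t shown. Here’s a sketch of how tables like that could be built with dplyr::filter. The cutoffs `high_cutoff` and `low_cutoff` are hypothetical values picked for illustration, not the ones actually used, and the rows are placeholders copied from this post.

```r
library(dplyr)
library(tibble)

# Placeholder rows copied from the tables in this post.
data <- tibble(
  County = c("Walker County", "Baldwin County", "Loudoun County", "Charlotte County"),
  State = c("Alabama", "Alabama", "Virginia", "Florida"),
  cty_pct_disabled = c(27.6, 14.4, 5.8, 24.4)
)

# Hypothetical cutoffs, chosen only for illustration.
high_cutoff <- 24
low_cutoff <- 7

# Counties at or above the high cutoff, and at or below the low one.
high_pct <- filter(data, cty_pct_disabled >= high_cutoff)
low_pct <- filter(data, cty_pct_disabled <= low_cutoff)
```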
None of this tells us anything particularly interesting without looking at some complementary data. I’d be interested in looking at other data from Subject Table S1810 to see if there are correlations with race, overall population size, age, or type of disability. Other data sets are out there with data on poverty, food access, urbanization, and lots of other information, and it’ll be very cool to check out some of that data.
[1] U.S. Census Bureau, 2019 American Community Survey 1-Year Estimates, https://data.census.gov/cedsci/table?t=Disability&tid=ACSST1Y2019.S1810
[2] U.S. Census Bureau, American Community Survey and Puerto Rico Community Survey 2019 Code List, https://www2.census.gov/programs-surveys/acs/tech_docs/code_lists/2019_ACS_Code_Lists.pdf
[3] Brault, M. (2008). Disability Status and the Characteristics of People in Group Quarters: A Brief Analysis of Disability Prevalence Among the Civilian Noninstitutionalized and Total Populations in the American Community Survey. U.S. Census Bureau “Working Papers”. https://www.census.gov/library/working-papers/2008/demo/brault-01.html
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Hallee (2022, Feb. 13). Data Analytics and Computational Social Science: Shaye Hallee - DACSS 601 HW02. Retrieved from https://shayehallee.github.io/coursework/2022-02-12-dacss-601-hw02/