County             Year           Homeless (Count)   Population
Length:201         Min.   :2018   Min.   :   0.0     Min.   :   8367
Class :character   1st Qu.:2018   1st Qu.:  11.0     1st Qu.:  28089
Mode  :character   Median :2019   Median : 151.0     Median : 130642
                   Mean   :2019   Mean   : 427.8     Mean   : 317746
                   3rd Qu.:2020   3rd Qu.: 563.0     3rd Qu.: 367471
                   Max.   :2020   Max.   :3516.0     Max.   :2864600

Unemployment Rate   Median Inc      Incarceration (Rateper1000)   Poverty (Count)
Min.   : 2.100      Min.   :34583   Min.   : 0.60                 Min.   :   906
1st Qu.: 3.400      1st Qu.:41401   1st Qu.: 2.50                 1st Qu.:  4901
Median : 4.000      Median :50640   Median : 3.40                 Median : 16210
Mean   : 4.697      Mean   :51116   Mean   : 3.84                 Mean   : 42922
3rd Qu.: 5.600      3rd Qu.:58093   3rd Qu.: 4.50                 3rd Qu.: 46034
Max.   :13.500      Max.   :83803   Max.   :18.60                 Max.   :482656

Drug Arrests (Count)   Relocated (Rate)   Sub Abuse Enrollment (Count)
Min.   :   13          Min.   : 4.689     Min.   :   5.0
1st Qu.:  225          1st Qu.:11.244     1st Qu.:  76.0
Median :  729          Median :12.700     Median : 250.0
Mean   : 1558          Mean   :13.288     Mean   : 877.6
3rd Qu.: 1903          3rd Qu.:14.544     3rd Qu.:1030.0
Max.   :13038          Max.   :22.553     Max.   :6272.0

Adult Psych Beds (Count)   Severe Housing Problems (Rate)   Forcible Sex (Count)
Min.   :  0.00             Min.   : 9.6                     Min.   :   0.0
1st Qu.:  0.00             1st Qu.:13.3                     1st Qu.:  14.0
Median :  0.00             Median :15.4                     Median :  45.0
Mean   : 66.26             Mean   :15.8                     Mean   : 170.5
3rd Qu.: 84.00             3rd Qu.:17.3                     3rd Qu.: 225.0
Max.   :778.00             Max.   :29.8                     Max.   :1408.0
                           NA's   :134

Foster Care (Count)
Min.   :   3.0
1st Qu.:  33.0
Median : 153.0
Mean   : 326.1
3rd Qu.: 353.0
Max.   :2289.0
DACSS 601: Florida Homelessness
Quantitative Review by County
Homelessness in Florida
florida_1820.csv contains population figures, homelessness counts, poverty counts, and other demographic indicators³ at the county level from 2018 to 2020. All 67 Florida counties have data for all 3 years, resulting in 201 observations of 15 variables. Each observation provides a count (or rate) for every variable from a single county in a single year. The variables selected closely mirror the stressors mentioned in the reference study (Zugazaga), along with supplementary variables intended to capture the circumstances more completely.
Homelessness is a complex living situation with several qualifying conditions; at its simplest, the U.S. Department of Housing and Urban Development defines it as lacking a fixed, regular nighttime residence (not a shelter) or having a nighttime residence not designed for human accommodation¹.
On a single night in 2020, over 500,000² people experienced homelessness in the United States. Florida, the third largest state by population, had the fourth largest homeless population in 2020, with 27,487².
Florida counties represent a wide range of demographic profiles; the state is a hub for a variety of industries, including tourism, defense, agriculture, and information technology. Investigating homelessness in Florida counties with robust data can show who is being affected, where, and how state policy is failing (or aiding) groups within a diverse population.
Introduction
Carole Zugazaga’s 2004 study of 54 single homeless men, 54 single homeless women, and 54 homeless women with children in the Central Florida area investigated stressful life events common among homeless people. Interview responses revealed that women were more likely to have been sexually or physically assaulted, while men were more likely to have been incarcerated or to have abused drugs or alcohol. Homeless women with children were more likely to have been in foster care as youths.
More than a decade later, county-level data can be used to investigate the relationship between Zugazaga’s reported stressful life events (incarceration, drug arrests, poverty, forcible sex, foster care)³ and homelessness rates.
Homelessness is not a new issue in the United States, yet homeless policy targets elimination via criminalization rather than prevention. A 2019 article from Homeless Voice provides a brief description of homeless policy in major cities across Florida. Although state and federal governments have been aware for decades of the circumstances that increase vulnerability to homelessness, I anticipate that at least one of Zugazaga’s five stressors will remain significant in a model relating stressors to Florida homelessness counts for 2018-2020.
The data were collected from the Florida Department of Health. Variable names³ were used as search terms to produce counts for Florida counties. Unfortunately, we cannot accurately analyze the effect of COVID-19, as the 2021 data are incomplete for the majority of counties.
Reading in and tidying were done in a separate file. The process began as laborious, but was lightened by the discovery of 10-year tables. Before this discovery, I would enter a variable name into the search bar of the Florida Department of Health website, select 2018, and download the .xlsx file to my personal data folder. readxl was used to bring in the file, and mutate created a date column filled with 2018. Once the data was in the appropriate shape, I saved it in the environment under variable2018. This was repeated for years 2019 and 2020. Then all three tibbles were combined with full_join by County to provide a dataset with 201 observations (67 counties by 3 years) and three variables: County, Year, and the variable measurement itself. This was written as a .csv back to the same personal data folder.
After learning that I could draw 10-year tables from the website rather than having to download three individual .xlsx files, the process became smoother. Now, I would only rename excess years as “delete” to get a table of measurements for only 2018, 2019, and 2020. pivot_longer then moved the years to a self-titled column and transferred the values to a column of my naming. This was written as a .csv.
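The reshaping step can be sketched as follows. This is a minimal illustration and not the original tidying script: the table wide_table, its values, and the value column name are made up, and select() stands in for the rename-to-“delete” step described above.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for a 10-year table; values are invented.
wide_table <- tibble(
  County = c("Alachua", "Baker"),
  `2017` = c(10, 1),
  `2018` = c(11, 2),
  `2019` = c(12, 3),
  `2020` = c(13, 4)
)

long_table <- wide_table %>%
  # Keep only the three study years (assumption: select() replaces the
  # manual renaming of excess years).
  select(County, `2018`, `2019`, `2020`) %>%
  # Move the year columns into a Year column and their values into one column.
  pivot_longer(cols = -County,
               names_to = "Year",
               values_to = "Homeless (Count)") %>%
  mutate(Year = as.integer(Year))

print(long_table)
```

Each county contributes one row per year, so the two toy counties yield six rows; the same reshaping on 67 counties would yield 201.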
I then merged all the tables and wrote the result as a .csv, florida_full. I completed a sanity check using distinct county names to ensure I had 201 observations of 15 variables as desired.
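A merge and sanity check of this kind might look like the sketch below. The two toy tables and their values are invented, and joining on both County and Year is an assumption about how the long tables line up; in the real data the expected figures would be 67 counties and 201 rows.

```r
library(dplyr)

# Hypothetical long tables, one per variable; values are invented.
homeless <- tibble(
  County = rep(c("Alachua", "Baker"), each = 3),
  Year = rep(2018:2020, times = 2),
  `Homeless (Count)` = c(11, 12, 13, 2, 3, 4)
)
poverty <- tibble(
  County = rep(c("Alachua", "Baker"), each = 3),
  Year = rep(2018:2020, times = 2),
  `Poverty (Count)` = c(500, 510, 520, 60, 61, 62)
)

# Assumption: each table has exactly one row per county-year pair.
florida_toy <- full_join(homeless, poverty, by = c("County", "Year"))

# Sanity check: every county appears once per year and no pair is duplicated.
n_counties <- n_distinct(florida_toy$County)
stopifnot(
  nrow(florida_toy) == n_counties * 3,
  anyDuplicated(florida_toy[c("County", "Year")]) == 0
)
```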
Expanding Intro to Data exposes summary statistics including mean, range, quantiles, and standard deviation for all 15 variables. The table below the summaries provides arranged figures for basic parameters of interest grouped by county. The data was nearly tidy, with complete observations for every variable except one: Severe Housing Problems (Rate) was not recorded for 2019 and 2020. These missing values are later filled in with 2018’s value for the regression analysis.
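One way to carry the 2018 value forward is sketched below. The toy table and its values are invented, and filling within each county is an assumption about how the imputation was done rather than a description of the original code.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: the rate is recorded in 2018 only.
housing <- tibble(
  County = rep(c("Alachua", "Baker"), each = 3),
  Year = rep(2018:2020, times = 2),
  `Severe Housing Problems (Rate)` = c(17.5, NA, NA, 12.0, NA, NA)
)

housing_filled <- housing %>%
  arrange(County, Year) %>%
  group_by(County) %>%
  # Copy the last observed value (2018) down into 2019 and 2020.
  fill(`Severe Housing Problems (Rate)`, .direction = "down") %>%
  ungroup()

print(housing_filled)
```

Because the filled values repeat within a county, this variable carries no year-to-year variation in the models that use it.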
Visualizations and Analysis
ggplot2 is used to visualize important relationships between homeless counts and Zugazaga’s stressors. The Florida counties have been categorized into 4 Regions and 3 Income Levels:
Code
library(dplyr)  # provides %>%, mutate(), and case_when()

# Categorize counties by region
florida_og_plot <- florida_og_rates %>%
  mutate('Region' = case_when(
    County %in% c('Escambia', 'Santa Rosa', 'Okaloosa', 'Walton', 'Holmes',
                  'Washington', 'Bay', 'Jackson', 'Calhoun', 'Gulf',
                  'Gadsden', 'Liberty', 'Leon', 'Wakulla', 'Franklin',
                  'Jefferson', 'Madison', 'Taylor') ~ 'Northwest',
    County %in% c('Hamilton', 'Suwannee', 'Lafayette', 'Dixie', 'Gilchrist',
                  'Union', 'Baker', 'Columbia', 'Nassau', 'Levy',
                  'Bradford', 'Alachua', 'Duval', 'Putnam', 'Marion',
                  'Volusia', 'Flagler', 'Citrus', 'Clay',
                  'St. Johns') ~ 'North',
    County %in% c('Lake', 'Sumter', 'Seminole', 'Orange', 'Hernando',
                  'Pasco', 'Brevard', 'Indian River', 'Pinellas',
                  'Hillsborough', 'Polk', 'Osceola', 'Hardee', 'Manatee',
                  'Okeechobee', 'Highlands') ~ 'Central',
    County %in% c('St. Lucie', 'Sarasota', 'Martin', 'Palm Beach', 'Collier',
                  'Broward', 'Lee', 'DeSoto', 'Charlotte', 'Hendry',
                  'Monroe', 'Miami-Dade', 'Glades') ~ 'South'))

# Categorize by Median Income Level
florida_og_plot <- florida_og_plot %>%
  mutate('Income Level' = case_when(
    `Median Inc` >= 60000 ~ 'High',
    `Median Inc` < 60000 &
      `Median Inc` >= 40000 ~ 'Medium',
    `Median Inc` < 40000 ~ 'Low'))
on Assumption of Validity
Although more than 10 variables are used to predict Homeless (Rate) across Florida counties, there are still limitations when attempting to comment on the magnitude of an individual stressor. Stressors influence homelessness by driving people in severe situations out of their homes or away from their places of origin. Homeless (Rate) is therefore not an ideal measure of magnitude: homeless people who migrate to escape or avoid certain stressors would leave counties with low stressor values with larger homeless populations, and this effect is left unexplained by the following models.
The variable Relocated (Rate) is included as an attempt to control for new movement; however, it does not completely capture county-to-county migration. FL Charts records Population Who Lived in a Different County One Year Earlier, but those data span 2009-2014, and using values recorded 4 years before our study period isn’t desirable either.
The most appropriate data for accurately capturing county-to-county migration is available from the US Census Bureau. The -In, -Out, -Net... spreadsheet provides totals for each county in the United States and movement to all other US counties; unfortunately, this data is too complex to wrangle into the simple data set florida_1820.csv.
on Assumption of Linearity
Code
# Fit 1: A Linear Regression Model With All Vars
# Checking linearity of variables not supported by our literature
# Scatterplot matrix (labelled "Correlation Matrix" in the original notes)
# Note: pairs() draws the plot and returns NULL, so florida_matrix holds NULL.
florida_matrix <- florida_og_rates %>%
  select(-c(contains("Count"),
            'Year',
            'Poverty (Rate)',
            'Severe Housing Problems (Rate)',
            'Incarceration (Rateper1000)',
            'Sub Abuse Enrollment (Rate)',
            'Drug Arrests (Rate)',
            'Adult Psych Beds (Rate)',
            'Foster Care (Rate)',
            'Forcible Sex (Rate)')) %>%
  pairs()
Code
florida_matrix
NULL
Stressors related to homelessness that are not mentioned in Zugazaga’s study, or that needed further investigation, are plotted here to confirm linearity with the response, Homeless (Rate). In the bottom row, the associations are weak, but a linear approximation is appropriate.
Linear Regression Models
Model Selection
- Comparing Residual Sum of Squares and R^2 for Fit 2, Fit 3, and Fit 4
I would select Fit 3 for inference. Although Fit 4 (with lag) had a lower Residual Sum of Squares, I appreciate the completeness of Fit 3 and believe the extra variables provide a better picture of how stressors impact the homeless population in Florida.
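A comparison of this kind can be sketched in base R. The models below are fitted to the built-in mtcars data purely as a stand-in; fit_small and fit_large are hypothetical names and do not correspond to Fit 2, Fit 3, or Fit 4.

```r
# Stand-in models on built-in data (not the Florida data).
fit_small <- lm(mpg ~ wt, data = mtcars)
fit_large <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# For an lm fit, deviance() is the residual sum of squares.
model_comparison <- data.frame(
  model = c("fit_small", "fit_large"),
  rss = c(deviance(fit_small), deviance(fit_large)),
  r_squared = c(summary(fit_small)$r.squared,
                summary(fit_large)$r.squared)
)

print(model_comparison)
```

For nested models fitted to the same rows, adding predictors never raises the residual sum of squares, so a lower value alone does not show that a model is better suited to inference.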
Transforming the data into panel data produces more accurate coefficients: rather than 201 independent observations, the model considers 67 counties each observed over 3 years. This results in smaller standard errors. The R^2 is only considered in passing, as the goal of the study is inference, not prediction.
Research Question:
All of Zugazaga’s effects had plausible signs demonstrating their influence on homelessness in Fit 3, but only Drug Arrests (Rate) was significant at the 0.05 level as hypothesized. This significance is a comment on the mathematical properties of the model rather than on the real-life effect of the stressors, all of which are influential situations that can contribute to homelessness. The positive slope of Drug Arrests (Rate) indicates that as the rate of arrests for drug abuse or possession in a county increases, so does the homeless rate in that county. This is a comment on the availability of drugs in Florida counties, and on how insufficient addiction treatment can contribute to other socioeconomic issues in a community.
Criminalization does not solve the problem; it relocates it. It is likely that many returning citizens will be caught in a cycle of drug abuse, incarceration, and homelessness.
Reflection
While the data provides a quick illustration of homelessness in Florida by county, there are improvements that could be made to both the data collection and the research question itself to further the study.
References
Chang, W. (2022). R Graphics Cookbook, 2nd Edition. O’Reilly Media.
Grolemund, G., & Wickham, H. (2016). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media.
R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.r-project.org.
Wickham, H. (2019). Advanced R, Second Edition (2nd ed.). Chapman and Hall/CRC. https://doi.org/10.1201/9781351201315
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
Zugazaga, C. (2004). Stressful life event experiences of homeless adults: A comparison of single men, single women, and women with children. Journal of Community Psychology, 32, 643-654. https://doi.org/10.1002/jcop.20025
Codebook
County - Florida county (67 total), divided into Region (Northwest Florida, Northeast Florida, Central Florida, and South Florida) for visualizations
Population - Yearly population count for county, used as denominator of all rate variables unless specified.
Year - Years 2018, 2019, 2020 included in this study
Homeless (Rate) - Yearly homeless count of a county divided by county population
Unemployment Rate - The ratio of unemployed to the civilian labor force, expressed as a percent
Median Inc - Median household income is the amount which divides the income distribution into two equal groups
Incarceration Rate per 1000 - Number of incarcerated people per 1000 (within county)
Poverty Rate - Number of people living below poverty line divided by population
Drug Arrests (Rate) - Arrests attributed to possession or sale of illegal drugs divided by population
Relocated (Rate) - The number of people over age 1 who lived in a different county the previous year
Sub Abuse Enrollment (Rate) - The number of beds indicates the number of adults (age 18 and over) who may receive substance abuse treatment on an in-patient basis
Adult Psych Beds (Rate) - When adults in psychiatric distress are uninsured, charged with crimes, or meet state criteria for civil commitment because they are violent/dangerous to themselves or others, psychiatric beds are where they are admitted for treatment. The number of beds indicates the number of people who may potentially receive adult (age 18 and over) psychiatric care on an in-patient basis. Divided by population
Severe Housing Problems (Rate) - The percentage of households with at least one of the following housing problems: lack of kitchen facilities; lack of plumbing facilities; more than 1.5 persons per room; severe cost burden (monthly housing costs including utilities exceed 50% of monthly income).
Forcible Sex (Rate) - Any sexual act or attempt involving force is classified as a forcible sex offense regardless of the age of the victim or the relationship of the victim to the offender, divided by population
Foster Care (Rate) - Foster care provides a safe and stable environment for children when they cannot be with their parents for some reason, divided by population
Footnotes
2.) US Interagency Council on Homelessness
3.) Explanation of variables and collection method in Codebook tab