Homelessness is a complex living situation with several qualifying conditions; at its simplest, the U.S. Department of Housing and Urban Development defines it as lacking a fixed, regular nighttime residence (not a shelter) or having a nighttime residence not designed for human accommodation.¹
On a single night in 2020, over 500,000 people experienced homelessness in the United States.² Florida, the third-largest state by population, had the fourth-largest homeless population in 2020, with 27,487 people.²
Florida counties represent a wide range of demographic profiles; the state is a hub for a variety of industries including tourism, defense, agriculture, and information technology. Investigating homelessness in Florida counties with robust data can lead to several conclusions about who is being impacted where, and how state policy is failing (or aiding) groups of a diverse population.
Carole Zugazaga’s 2004 study of 54 single homeless men, 54 single homeless women, and 54 homeless women with children in the Central Florida area investigated stressful life events common among homeless people. Interview responses revealed that women were more likely to have been sexually or physically assaulted, while men were more likely to have been incarcerated or to have abused drugs or alcohol. Homeless women with children were more likely to have been in foster care as youths.
More than a decade later, county-level data can be used to investigate the relationship between Zugazaga’s reported stressful life events (incarceration, drug arrests, poverty, forcible sex, foster care)³ and homelessness rates.
Research Question
Do particular life stressors increase a population’s vulnerability to homelessness?
Homelessness is not a new issue in the United States, yet homeless policy targets elimination via criminalization rather than prevention. A 2019 article from Homeless Voice provides a brief description of homeless policy in major cities across Florida. Although state and federal governments have been aware for decades of the circumstances that increase vulnerability to homelessness, I anticipate that at least one of Zugazaga’s five stressors will remain significant in a model relating stressors to Florida homelessness counts for 2018-2020.
Research Hypothesis
H0: All stressors are insignificant in predicting homelessness counts (B_i = 0 for i = 1, 2, …, n)
HA: At least one stressor coefficient B_i is significant in predicting homelessness counts (B_i ≠ 0 for some i)
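One way to write these hypotheses formally is sketched below. It assumes a linear model with n stressor predictors; the symbols (y_c for the homelessness measure in county c, x_{ic} for stressor i, β_i for the document's B_i) are notation introduced here for illustration, and the intercept β_0 is not part of the test.

```latex
\begin{aligned}
y_c &= \beta_0 + \sum_{i=1}^{n} \beta_i x_{ic} + \varepsilon_c \\
H_0 &: \beta_1 = \beta_2 = \dots = \beta_n = 0 \\
H_A &: \beta_i \neq 0 \ \text{for at least one } i \in \{1, \dots, n\}
\end{aligned}
```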
The data florida_1820.csv describes population, homelessness counts, poverty counts, and several other demographic indicators³ at the county level from 2018 to 2020. All 67 Florida counties have observations for the 3 years, giving us 201 observations of 15 variables. Each observation provides a count for all variables from a single county for a year. The variables closely mirror the stressors mentioned in Zugazaga’s study, along with supplementary variables in an attempt to completely capture the circumstances.
The data were collected from the Florida Department of Health. Variable names³ were used as search indicators to produce counts for Florida counties. Unfortunately, we cannot accurately analyze the effect of COVID-19, as data are incomplete for the majority of counties in 2021.
Intro to Data
    County               Year      Homeless (Count)    Population
 Length:201         Min.   :2018   Min.   :   0.0    Min.   :   8367
 Class :character   1st Qu.:2018   1st Qu.:  11.0    1st Qu.:  28089
 Mode  :character   Median :2019   Median : 151.0    Median : 130642
                    Mean   :2019   Mean   : 427.8    Mean   : 317746
                    3rd Qu.:2020   3rd Qu.: 563.0    3rd Qu.: 367471
                    Max.   :2020   Max.   :3516.0    Max.   :2864600

 Unemployment Rate    Median Inc     Incarceration (Rateper1000)  Poverty (Count)
 Min.   : 2.100     Min.   :34583    Min.   : 0.60                Min.   :   906
 1st Qu.: 3.400     1st Qu.:41401    1st Qu.: 2.50                1st Qu.:  4901
 Median : 4.000     Median :50640    Median : 3.40                Median : 16210
 Mean   : 4.697     Mean   :51116    Mean   : 3.84                Mean   : 42922
 3rd Qu.: 5.600     3rd Qu.:58093    3rd Qu.: 4.50                3rd Qu.: 46034
 Max.   :13.500     Max.   :83803    Max.   :18.60                Max.   :482656

 Drug Arrests (Count)  Relocated (Rate)  Sub Abuse Enrollment (Count)
 Min.   :   13         Min.   : 4.689    Min.   :   5.0
 1st Qu.:  225         1st Qu.:11.244    1st Qu.:  76.0
 Median :  729         Median :12.700    Median : 250.0
 Mean   : 1558         Mean   :13.288    Mean   : 877.6
 3rd Qu.: 1903         3rd Qu.:14.544    3rd Qu.:1030.0
 Max.   :13038         Max.   :22.553    Max.   :6272.0

 Adult Psych Beds (Count)  Severe Housing Problems (Rate)  Forcible Sex (Count)
 Min.   :  0.00            Min.   : 9.6                    Min.   :   0.0
 1st Qu.:  0.00            1st Qu.:13.3                    1st Qu.:  14.0
 Median :  0.00            Median :15.4                    Median :  45.0
 Mean   : 66.26            Mean   :15.8                    Mean   : 170.5
 3rd Qu.: 84.00            3rd Qu.:17.3                    3rd Qu.: 225.0
 Max.   :778.00            Max.   :29.8                    Max.   :1408.0
                           NA's   :134

 Foster Care (Count)
 Min.   :   3.0
 1st Qu.:  33.0
 Median : 153.0
 Mean   : 326.1
 3rd Qu.: 353.0
 Max.   :2289.0
Expanding Intro to Data exposes summary statistics (minimum, maximum, quartiles, and mean) for all 15 variables. The table below the summaries provides arranged figures for basic parameters of interest grouped by county.
ggplot2 is used to visualize important relationships between homeless rates and Zugazaga’s stressors. The Florida counties have been categorized into 4 Regions and 3 Income Levels:
Region
Northwest: Escambia County to Madison County; cities include Pensacola, Panama City Beach, and Tallahassee
North: Hamilton County to Marion County; cities include Jacksonville, Gainesville, Ocala, and St. Augustine
Central: Lake County to Okeechobee County; cities include Orlando, Kissimmee, Tampa, St. Petersburg
South: Sarasota County to Miami-Dade County; cities include Ft. Lauderdale, Ft. Myers, Miami, Boca Raton, West Palm Beach
Income Level
High: Median Income >= 60000
Medium: 40000 <= Median Income < 60000
Low: Median Income < 40000
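The income thresholds above can be applied in a single step. The following is a minimal base-R sketch, assuming the boundaries listed here; the vector `median_inc` holds illustrative values (the two boundary cases plus the minimum and maximum from the summary), not rows from florida_1820.csv.

```r
# Illustrative median incomes: boundary cases plus the summary's min and max
median_inc <- c(34583, 39999, 40000, 59999, 60000, 83803)

# right = FALSE makes each interval [lower, upper), so 40000 falls in
# "Medium" and 60000 falls in "High", matching the >= rules above
income_level <- cut(
  median_inc,
  breaks = c(-Inf, 40000, 60000, Inf),
  labels = c("Low", "Medium", "High"),
  right  = FALSE
)

print(data.frame(median_inc, income_level))
```

Using `cut()` keeps the three levels ordered (Low, Medium, High), which is the order the income plot relies on.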
a) Homeless Rate by Region
Code
# Plot 1: homeless rate (%) by region
florida_box <- florida_og_plot %>%
  mutate(`Homeless (Rate)` = `Homeless (Rate)` * 100) %>%
  mutate(`Region` = fct_relevel(`Region`, "Northwest", "North", "Central", "South")) %>%
  ggplot(aes(y = `Homeless (Rate)`, x = `Region`, fill = `Region`)) +
  geom_boxplot(alpha = 0.7) +
  # Scale of y axis
  scale_y_continuous(breaks = seq(0, 1.5, by = .25)) +
  # Dimensions of graph: the limits are set inside coord_flip(), because a
  # second coord_*() call replaces coord_cartesian() and discards its ylim
  coord_flip(ylim = c(0, 1.5)) +
  scale_fill_brewer(palette = 'Set2') +
  theme_grey() +
  theme(legend.position = "none") +
  labs(title = "Florida Homeless Rates", subtitle = "2018-2020", x = " ",
       y = "Homeless Rate (%)", caption = "Visualized by Region")

florida_box
A look at the distributions of homeless rates across Florida counties displays where the highest rates in the state exist.
The state is generally uniform, with the bulk of each region’s distribution sitting below 0.25% of county populations.
The largest difference is between the distributions of Northwest Florida and South Florida. This can be attributed to the stark contrast in where the populations of these regions live.
South Florida is the most urbanized region in the state, with millions living in Miami, Ft. Lauderdale, and West Palm Beach; Northwest Florida is quite the opposite, with small coastal towns and rural inland towns reminiscent of Southern Alabama or Georgia.
b) Homeless Rate by Income Level
Code
# Plot 2: homeless rate (%) by region and income level
florida_income <- florida_og_plot %>%
  mutate(`Homeless (Rate)` = `Homeless (Rate)` * 100) %>%
  mutate(`Income Level` = fct_relevel(`Income Level`, "Low", "Medium", "High")) %>%
  ggplot(aes(y = `Homeless (Rate)`, x = `Region`, fill = `Income Level`)) +
  geom_bar(position = 'dodge', stat = 'identity') +
  scale_fill_brewer(palette = 'Set2') +
  theme_grey() +
  labs(title = "Barplot: Income Level x Homeless Rate", subtitle = "2018-2020",
       x = "Region", y = "Homeless Rate (%)", caption = "Visualized by Income Level")

florida_income
The barplot further details the differences mentioned in Plot 1. In regions where the population is less urbanized, Low Income counties show a higher homeless rate, as one would assume.
In South Florida, wealth disparities in urbanized areas break this trend: counties with high income now report high homeless rates.
c) Homeless Rate x Incarceration Rate per1000
Code
# Plot 3: homeless rate (%) against incarceration rate (per 100)
florida_incarc <- florida_og_plot %>%
  mutate(`Homeless (Rate)` = `Homeless (Rate)` * 100,
         `Incarceration (Rateper1000)` = `Incarceration (Rateper1000)` / 10) %>%
  ggplot(aes(y = `Homeless (Rate)`, x = `Incarceration (Rateper1000)`,
             color = `Region`, label = `County`)) +
  geom_point(alpha = 0.7) +
  # Must include because of label aesthetic
  ggrepel::geom_text_repel(show.legend = FALSE, max.overlaps = 15, alpha = 0.7,
                           size = 2.5, nudge_x = -.05, nudge_y = .05) +
  # Region is mapped to color, so the color scale (not the fill scale) applies
  scale_color_brewer(palette = 'Set2') +
  theme_grey() +
  labs(title = "Scatterplot: Incarceration x Homeless Rate", subtitle = "2018-2020",
       x = "Incarceration Rate (per100)", y = "Homeless Rate (%)",
       caption = "Visualized by Region")

florida_incarc
Incarceration rate, one of the stressors Zugazaga found more common among men, is plotted against homeless rate; a positive trend relates incarceration rates and homelessness rates across Florida counties.
Many state correctional facilities are located in rural counties, which likely explains the large influence of the Baker and Monroe observations on the plot.
d) Homeless Rate x Drug Arrest Rate
Code
# Plot 4: homeless rate (%) against drug arrest rate (%)
florida_drug <- florida_og_plot %>%
  mutate(`Homeless (Rate)` = `Homeless (Rate)` * 100,
         `Drug Arrests (Rate)` = `Drug Arrests (Rate)` * 100) %>%
  ggplot(aes(y = `Homeless (Rate)`, x = `Drug Arrests (Rate)`,
             color = `Region`, label = `County`)) +
  geom_point(alpha = 0.7) +
  # Must include because of label aesthetic
  ggrepel::geom_text_repel(show.legend = FALSE, max.overlaps = 15, alpha = 0.7,
                           size = 2.5, nudge_x = -.05, nudge_y = .05) +
  # Region is mapped to color, so the color scale (not the fill scale) applies
  scale_color_brewer(palette = 'Set2') +
  theme_grey() +
  labs(title = "Scatterplot: Drug Arrest Rate x Homeless Rate", subtitle = "2018-2020",
       x = "Drug Arrests (Rate)", y = "Homeless Rate (%)",
       caption = "Visualized by Region")

florida_drug
A similar positive association is seen comparing the homeless rate of a county with its Drug Arrest (Rate).
The high influence of Northwest Florida is likely due to stricter drug policies held by police in less urbanized areas.
on Assumption of Validity
While over 10 variables are predicting Homeless (Rate) across Florida counties, there are still limitations when attempting to comment on the magnitude of an individual stressor. Stressors influence homelessness by driving those in severe situations out of their home or away from their place of origin. Homeless (Rate) is not an ideal measure of magnitude as the homeless population migrating to escape or avoid certain stressors would result in counties with low stressor values having a higher homeless population; this effect is left unexplained by the following models.
The variable Relocated (Rate) is included as an attempt to control for new movement; however, it does not completely capture county-to-county migration.
The most appropriate data for capturing county-to-county migration is available from the US Census Bureau. The -In, -Out, -Net... spreadsheet provides totals for each county in the United States and movement to all other US counties; unfortunately, this data is too complex to wrangle into the simple data set florida_1820.csv.
on Assumption of Linearity
Code
# Fit 1: A Linear Regression Model With All Vars
# Checking linearity of variables not supported by our literature
# Note: pairs() draws the plot and returns NULL, so florida_matrix is NULL
florida_matrix <- florida_og_rates %>%
  select(-c('County', 'Year', 'Poverty (Rate)', 'Severe Housing Problems (Rate)',
            'Incarceration (Rateper1000)', 'Sub Abuse Enrollment (Rate)',
            'Drug Arrests (Rate)', 'Adult Psych Beds (Rate)',
            'Foster Care (Rate)', 'Forcible Sex (Rate)')) %>%
  pairs()
Code
florida_matrix
NULL
Stressors with a relationship to homelessness that are not mentioned in Zugazaga’s study, or that needed further investigation, are shown here to confirm linearity with the response, Homeless (Rate). Checking the bottom row, the associations are weak, but a linear approximation is appropriate.
Linear Regression Models
Fit 1: All Variables (No Transformations)
Code
# Linear relationship appears appropriate for all, possibly attempt log transformation on UE Rate?
# Creating a linear model with all variables included: no transformations
# County removed as too many levels; improvement: NWFL, NFL, CFL, SWFL, SOFLO categories?
fit1 <- florida_og_rates %>%
  select(-'County') %>%
  lm(formula = `Homeless (Rate)` ~ .)

summary(fit1)
Call:
lm(formula = `Homeless (Rate)` ~ ., data = .)
Residuals:
Min 1Q Median 3Q Max
-0.0021639 -0.0008741 -0.0002945 0.0003981 0.0078061
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -5.014e-03 3.408e-03 -1.471 0.1469
Year NA NA NA NA
`Unemployment Rate` -1.993e-04 4.062e-04 -0.491 0.6257
`Median Inc` 5.022e-08 3.506e-08 1.433 0.1576
`Incarceration (Rateper1000)` 3.479e-05 1.009e-04 0.345 0.7316
`Relocated (Rate)` -6.075e-05 7.624e-05 -0.797 0.4290
`Severe Housing Problems (Rate)` 1.584e-04 7.639e-05 2.074 0.0428 *
`Poverty (Rate)` 4.429e-03 8.195e-03 0.540 0.5911
`Drug Arrests (Rate)` 9.809e-02 7.369e-02 1.331 0.1886
`Sub Abuse Enrollment (Rate)` 4.671e-02 1.374e-01 0.340 0.7351
`Adult Psych Beds (Rate)` 3.177e+00 2.138e+00 1.486 0.1429
`Forcible Sex (Rate)` 3.717e-02 1.093e+00 0.034 0.9730
`Foster Care (Rate)` 6.768e-01 3.664e-01 1.847 0.0701 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.00157 on 55 degrees of freedom
(134 observations deleted due to missingness)
Multiple R-squared: 0.3441, Adjusted R-squared: 0.2129
F-statistic: 2.623 on 11 and 55 DF, p-value: 0.00914
Code
rss1 <- deviance(fit1)
print(c('RSS Fit 1', rss1))
[1] "RSS Fit 1" "0.000135542075603397"
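The figures in the Fit 1 summary are tied together by a few standard formulas, which makes them easy to sanity-check. The sketch below recomputes the residual standard error, adjusted R-squared, and overall F-test p-value from the values printed above; the variable names are introduced here for illustration, and the inputs are the rounded values from the output, so small rounding differences are expected.

```r
# Values copied from the Fit 1 output above
rss      <- 0.000135542075603397  # residual sum of squares
n        <- 67                    # observations kept after dropping NAs
p        <- 11                    # estimated predictors (Year is aliased)
df_resid <- n - p - 1             # 55 residual degrees of freedom
r2       <- 0.3441                # multiple R-squared
f_stat   <- 2.623                 # overall F-statistic

# Residual standard error: sqrt(RSS / residual df)
rse <- sqrt(rss / df_resid)

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid

# Upper-tail probability of the F distribution gives the model p-value
p_value <- pf(f_stat, df1 = p, df2 = df_resid, lower.tail = FALSE)

print(round(c(rse = rse, adj_r2 = adj_r2, p_value = p_value), 5))
```

The recomputed values should agree with the reported 0.00157, 0.2129, and 0.00914 up to rounding.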
The first model predicts Homeless (Rate) using all variables, without any transformations or interactions. This causes 134 observations to be removed, as they are missing values for Severe Housing Problems (Rate).
Only 1 variable - Severe Housing Problems (Rate) - is deemed significant at alpha = 0.05; those without a star (see output) are deemed inconsequential in predicting Homeless (Rate) by this model.
The effect of Relocated (Rate) is negative, suggesting that migration can ‘help reduce’ a county’s homeless rate, as anticipated in ‘on Assumption of Validity’ (above).
Looking at the signs and magnitude of the predicted (insignificant) variables, they seem plausible - Increases in variables like Drug Arrests (Rate) or Sub Abuse Enrollment (Rate) increase response Homeless (Rate) substantially.
Sub Abuse Enrollment (Rate) can be interpreted here as an indication of how many people in the area are suffering from addiction/abuse problems, rather than a suggestion that substance abuse programs increase homelessness.
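The dropped observations and the NA coefficient for Year in the Fit 1 output are consistent with a predictor that is observed in only one year. The toy example below (simulated data with hypothetical column names, not the Florida data) shows the mechanism: `lm()` silently omits incomplete rows under R's default `na.action`, and once only one year remains, `year` is constant and cannot be separated from the intercept.

```r
set.seed(1)

# Simulated panel: 6 units over 3 years; `housing` is recorded only in 2018
toy <- data.frame(
  year    = rep(2018:2020, each = 6),
  x       = rnorm(18),
  housing = c(rnorm(6), rep(NA, 12))
)
toy$y <- 1 + 0.5 * toy$x + rnorm(18)

# Rows with a missing `housing` value are omitted before fitting
fit <- lm(y ~ year + x + housing, data = toy)

print(nobs(fit))               # number of rows actually used
print(length(fit$na.action))   # number of rows dropped for missingness
print(coef(fit))               # `year` is NA: constant among the kept rows
```

In the Fit 1 output, 67 of 201 rows survive, which is exactly one year of counties, so the same explanation plausibly applies there.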
Fit 1: Diagnostics
Fit 1 does a poor job of obeying the assumptions regarding residuals of linear regression.
Residuals vs Fitted shows a negative trend as the fitted value increases, violating the linearity and independence assumptions.
Scale-Location confirms this, as the standardized residuals increase in magnitude with the fitted value.
Q-Q Plot shows a deviation from the diagonal, violating the assumption that residuals follow an approximately Normal distribution.
Several points could be considered outliers because of their residual or leverage values, that is, how strongly they influence the fitted model.
Monroe County (130), Hardee County (73), and Columbia County (34) all have large positive residuals, indicating that the model greatly underestimated the homeless rate in these counties.
Baker County (4) has worryingly high leverage; its explanatory values have great influence on the fit.
All of these outliers represent sparsely populated, rural counties, typically outside of more urbanized areas; hence large values for stressors will command great influence on the model.
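Leverage and influence can also be read off numerically rather than from the diagnostic plots. The following is a minimal sketch on simulated data (not the Florida data), using base R's `hatvalues()` and `cooks.distance()`; the cutoff of twice the mean leverage is a common rule of thumb assumed here, not a threshold taken from the original analysis.

```r
set.seed(42)

# 20 ordinary points plus one with an extreme predictor value
x <- c(rnorm(20), 8)
y <- 2 + 0.5 * x + rnorm(21)
fit <- lm(y ~ x)

lev  <- hatvalues(fit)        # leverage: diagonal of the hat matrix
cook <- cooks.distance(fit)   # influence of each point on the fitted values

# Rule of thumb (an assumption): flag leverage above twice the average
high_lev <- which(lev > 2 * mean(lev))

print(round(lev[high_lev], 3))
print(round(cook[high_lev], 3))
```

A point with unusual explanatory values, like the last one here, gets high leverage regardless of its response value, which is the situation described for small rural counties above.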
Fit 2: All Variables, All Observations (Fill Severe Housing Rate)
In Fit 2, values from Severe Housing Problems (Rate) were filled down to restore all observations for use in the model.
Example: Alachua County has the same Severe Housing Problems (Rate) for 2018-2020
Several key stressors were deemed significant, with Adult Psych Beds (Rate) having the largest magnitude
Stressors from Zugazaga’s study that were found significant by this model include Drug Arrests (Rate) and Foster Care (Rate)
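The fill-down step used for Fit 2 is not shown in the text, so the following is only a sketch of one way it could be done in base R, on made-up values. The helper `fill_down()` and the column names are hypothetical; with the tidyverse, `tidyr::fill()` on data grouped by county is a common equivalent.

```r
# Made-up example: the rate is recorded in 2018 only
toy <- data.frame(
  county  = rep(c("A", "B"), each = 3),
  year    = rep(2018:2020, times = 2),
  housing = c(15.0, NA, NA, 12.5, NA, NA)
)

# Carry the last non-missing value forward; leading NAs stay NA
fill_down <- function(v) {
  idx <- cumsum(!is.na(v))       # how many non-missing values seen so far
  c(NA, v[!is.na(v)])[idx + 1]   # idx 0 maps to NA, idx k to the k-th value
}

# Apply within each county; rows must already be sorted by year
toy$housing_filled <- ave(toy$housing, toy$county, FUN = fill_down)

print(toy)
```

Filling within county matters: without the grouping, a county whose first year is missing would inherit the previous county's value.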
Fit 2: Diagnostics
Using all observations in Fit 2 has not improved the diagnostic plots, but including 2019 and 2020 values has revealed new outliers.
2020 produced Liberty County (117) as an outlier, a small, rural county in the Panhandle of Florida.
In late 2018, Hurricane Michael devastated the area; the influence of this observation is likely a direct result of measurements being altered greatly or even going unrecorded during 2019 because the population was “in a transitional state”.
Liberty’s records in 2020 will show a vast difference to the incomplete measures of 2019
Fit 3: Random Effects Model - Controlling for County over Time
Oneway (individual) effect Random Effect Model
(Swamy-Arora's transformation)
Call:
plm(formula = Homeless..Rate. ~ Drug.Arrests..Rate. + lag(Sub.Abuse.Enrollment..Rate.,
1) + Forcible.Sex..Rate. + Foster.Care..Rate., data = florida_panel,
model = "random")
Balanced Panel: n = 67, T = 2, N = 134
Effects:
var std.dev share
idiosyncratic 1.848e-07 4.298e-04 0.12
individual 1.350e-06 1.162e-03 0.88
theta: 0.7469
Residuals:
Min. 1st Qu. Median 3rd Qu. Max.
-0.00193151 -0.00022543 -0.00002260 0.00019457 0.00192708
Coefficients:
                                       Estimate  Std. Error z-value Pr(>|z|)
(Intercept)                          0.00089264  0.00037360  2.3893  0.01688 *
Drug.Arrests..Rate.                  0.09055443  0.02068601  4.3776  1.2e-05 ***
lag(Sub.Abuse.Enrollment..Rate., 1) -0.08876300  0.06165290 -1.4397  0.14995
Forcible.Sex..Rate.                 -0.32816661  0.33634980 -0.9757  0.32923
Foster.Care..Rate.                   0.25476909  0.15898288  1.6025  0.10905
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Total Sum of Squares: 2.8828e-05
Residual Sum of Squares: 2.4526e-05
R-Squared: 0.14921
Adj. R-Squared: 0.12283
Chisq: 22.6247 on 4 DF, p-value: 0.00015047
Code
# bic4 <- BIC(fit4)
# print(c('BIC Fit 4:', bic4))
rss4 <- deviance(fit4)
print(c('RSS Fit 4', rss4))
[1] "RSS Fit 4" "2.45264002390625e-05"
Again using a random effects model with only stressors mentioned in Zugazaga’s study, we see Drug Arrests (Rate) as a significant predictor of homelessness at the alpha = 0.05 level.
Forcible Sex may not capture the circumstances suggested in Zugazaga’s study, as an arrest rate for forcible sex crimes captures the perpetrators within a county, rather than the victims.
Many sex crimes go unreported or unpunished, due to familial or power relationships between the victim and offender
Foster Care (Rate) and Sub Abuse Enrollment (Rate) have signs and values that correspond well with Zugazaga’s results.
Lagging Sub Abuse Enrollment (Rate) suggests that an increase in citizens enrolled in substance abuse programs can lead to a decrease in homelessness in the following year, although the coefficient is not significant at the 0.05 level (p = 0.150).
Comparing the residual sum of squares and R² of Fit 2, Fit 3, and Fit 4, I would select Fit 3 for inference. Although Fit 4 (with lag) had a lower residual sum of squares, I appreciate the completeness of Fit 3 and believe the extra variables provide a better picture of how stressors impact the homeless population in Florida.
Transforming the data into panel data produces more accurate coefficients, as rather than 201 individual observations, the model considers 3 years of 67 individual observations. This results in smaller standard error. The R^2 is only considered in passing, as the goal of the study is inference not prediction.
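The variance components in the random effects output determine how much of each county's mean is removed before estimation. The sketch below recomputes the reported `share` and `theta` from the printed variances using the usual balanced-panel formula, and checks the z-value and two-sided p-value for Drug Arrests (Rate); the variable names are introduced here for illustration, and T = 2 is the value reported in the output.

```r
# Values copied from the random effects output above
sigma2_idios <- 1.848e-07   # idiosyncratic variance
sigma2_indiv <- 1.350e-06   # individual (county) variance
T_periods    <- 2           # time periods per county, as reported

# Share of total variance attributable to county-level effects
share_indiv <- sigma2_indiv / (sigma2_indiv + sigma2_idios)

# Quasi-demeaning parameter for a balanced panel:
# theta = 1 - sqrt(sigma_e^2 / (sigma_e^2 + T * sigma_u^2))
theta <- 1 - sqrt(sigma2_idios / (sigma2_idios + T_periods * sigma2_indiv))

# z-value and two-sided p-value for Drug.Arrests..Rate.
z_drug <- 0.09055443 / 0.02068601
p_drug <- 2 * pnorm(abs(z_drug), lower.tail = FALSE)

print(round(c(share = share_indiv, theta = theta, z = z_drug), 4))
print(signif(p_drug, 2))
```

A theta near 0.75 means the estimator sits closer to a fixed effects (fully demeaned) model than to pooled OLS, which fits the output's finding that most of the variance is between counties.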
Research Question:
All of Zugazaga’s effects had plausible signs demonstrating their influence on homelessness in Fit 3, but only Drug Arrests (Rate) was significant at the 0.05 level as hypothesized. This significance is a comment on the mathematical properties of the model rather than on the real-life effect of the stressors, which all are influential situations that can contribute to homelessness.
The positive slope of Drug Arrests (Rate) indicates that as the rate of arrests for drug abuse or possession in a county increases, so does the county’s homeless rate.
This is a comment on the availability of drugs in Florida counties, and how insufficient addiction treatment can contribute to other socioeconomic issues in a community.
Criminalization does not solve the problem, it relocates it; it is likely many returning citizens will be caught in a cycle of drug abuse, incarceration, and homelessness.
Was the Research Question Answered?
As hypothesized, the model proved several stressors to be significant in predicting Homeless Rates across Florida
The significance of Drug Arrests (Rate) in the 3 models using all of the observations allows us to reject H0: No stressors are significant in predicting Homeless (Rate).
Unfortunately, the study is unable to make a substantial comment on which stressors most increased vulnerability to Homelessness, evaluating magnitude. To do this, deeper demographic variables would need to be included, as well as improvements in controlling for stressors as a push factor in homeless migration.
Because of limitations in the data and the broad scope of the research question, the study is not able to make any new comments on the status of homelessness in Florida; it instead confirms the relevance of the stressors to homeless life more than a decade later.
Prediction vs Inference
The goal of this brief study was to make inferences regarding stressors’ impact on Homelessness (Rate) in Florida.
If prediction were our focus, I would use new 2021 data from FL Charts without the Homeless (Rate) column to test the efficacy of Fit 2 as a predictive tool.
While the data is a quick illustration of homelessness in Florida by county, there are improvements that could be made to both data collection and the research question itself to further the study.
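A hold-out check of this kind could be sketched as follows. The data here are simulated and the names (`train`, `new_data`, `new_truth`) are hypothetical; the point is only the workflow of fitting on existing years, predicting for rows without the response, and comparing against the later-observed rates.

```r
set.seed(7)

# Simulated "training" years: one stressor rate and a homeless rate
train <- data.frame(x = runif(60))
train$y <- 0.001 + 0.09 * train$x + rnorm(60, sd = 0.001)

# Simulated "new year": predictors only, with the truth held back
new_data  <- data.frame(x = runif(20))
new_truth <- 0.001 + 0.09 * new_data$x + rnorm(20, sd = 0.001)

fit  <- lm(y ~ x, data = train)
pred <- predict(fit, newdata = new_data)

# Root mean squared error on the held-out rows
rmse <- sqrt(mean((new_truth - pred)^2))
print(rmse)
```

Out-of-sample error is a fairer measure of predictive efficacy than the in-sample RSS reported for the fits above.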
Data
Unfortunately, FL Health Charts did not provide demographic breakdown for the homeless population (Age, Sex, Race), which would drastically widen the scope of the analysis, leading to far more interesting conclusions.
There is only data for a three-year period; this is too small a range to make a strong statement about the impact of homeless policy on Florida counties or how the relevance of certain stressors has changed over time. For a more in-depth study I would begin with a 10-year range.
Research Question
Demographic breakdown of stressors’ impact (Age, Sex, Race)
Focus on specific stressors and their accompanying variables to view their impact on a population, instead of a broad range of stressors
Include life stressors not associated with homelessness for comparison
Extend the question to the entire country, providing a breakdown by state
Compare to foreign countries to contrast governments’ approaches to homelessness and leading causes of homelessness around the world.
County - Florida county (67 total), divided into Northwest Florida, North Florida, Central Florida, and South Florida for visualizations
Population - Yearly population count for county, used as denominator of all rate variables unless specified.
Year - Years 2018, 2019, 2020 included in this study
Homeless (Rate) - Yearly homeless count of a county divided by county population
Unemployment Rate - The ratio of unemployed to the civilian labor force, expressed as a percent
Median Inc - Median household income is the amount which divides the income distribution into two equal groups
Incarceration Rate per 1000 - Number of incarcerated people per 1000 (within county)
Poverty Rate - Number of people living below poverty line divided by population
Drug Arrests (Rate) - Arrests attributed to possession or sale of illegal drugs divided by population
Relocated (Rate) - The number of people over age 1 who lived in a different county the previous year
Sub Abuse Enrollment (Rate) - The number of beds indicates the number of adults (age 18 and over) who may receive substance abuse treatment on an in-patient basis
Adult Psych Beds (Rate) - When adults in psychiatric distress are uninsured, charged with crimes, or meet state criteria for civil commitment because they are violent/dangerous to themselves or others, psychiatric beds are where they are admitted for treatment. The number of beds indicates the number of people who may potentially receive adult (age 18 and over) psychiatric care on an in-patient basis. Divided by population
Severe Housing Problems (Rate) - The percentage of households with at least one of the following housing problems: lack of kitchen facilities; lack of plumbing facilities; more than 1.5 persons per room; severe cost burden (monthly housing costs including utilities exceed 50% of monthly income).
Forcible Sex (Rate) - Any sexual act or attempt involving force is classified as a forcible sex offense regardless of the age of the victim or the relationship of the victim to the offender, divided by population
Foster Care (Rate) - Foster care provides a safe and stable environment for children when they cannot be with their parents for some reason, divided by population
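Most rate variables in this codebook are a count divided by the county population. A minimal sketch of that conversion, using made-up counts and hypothetical column names rather than values from florida_1820.csv:

```r
# Made-up counties: population and homeless count
toy <- data.frame(
  county         = c("A", "B"),
  population     = c(100000, 8000),
  homeless_count = c(250, 4)
)

# Rate as a proportion of the population, and as a percentage for plotting
toy$homeless_rate <- toy$homeless_count / toy$population
toy$homeless_pct  <- 100 * toy$homeless_rate

print(toy)
```

Working in rates rather than counts keeps large and small counties comparable, which is why the models and plots use Homeless (Rate) as the response.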
Zugazaga, Carole, “Pathways to homelessness and social support among homeless single men, single women, and women with children” (2002). Retrospective Theses and Dissertations. 1744.
3.) Explanation of variables and collection method in Codebook tab
Source Code
---title: "DACSS 603: Final Project"subtitle: "Florida Homelessness by County 2018-2020"author: "Dane Shelton"desription: "Data Exploration, Visualizations, Analysis"date: "12/15/2022"format: html: callout-appearance: "simple" callout-icon: FALSE df-print: paged toc: true code-fold: true code-copy: true code-tools: truecategories: - finalpart3 - shelton - homelessness---```{r}#| label: setup#| include: false#| warning: falselibrary(tidyverse)library(stargazer)library(ggrepel)library(plotly)library(ggplot2)library(GGally)library(ggfortify)library(flexmix)library(plm)knitr::opts_chunk$set(echo =FALSE, warning=FALSE, message =FALSE)```## Homelessness in FloridaHomelessness is a complex living situation with several qualifying conditions; at its simplest state, the U.S Dept. of Housing and Urban Development defines it as **lacking a fixed, regular nighttime residence (not a shelter) or having a nighttime residence not designed for human accommodation**^1^.On a single night in 2020, over 500,000^2^ people experienced homelessness in the United States. Florida, the third largest state by population , had the fourth largest homeless population of 2020 with 27,487^2^.Florida counties represent a wide range of demographic profiles; the state is a hub to a variety of industries including tourism, defense, agriculture, and information technology. Investigating homelessness in Florida counties with robust data can lead to several conclusions about *who* is being impacted *where*, and how state policy is failing (or aiding) groups of a diverse population.::: panel-tabset## Research QuestionCarole Zugazaga's 2004 study of 54 single homeless men, 54 single homeless women, and 54 homeless women with children in the Central Florida area investigated stressful life events common among homeless people. Interview responses revealed that women were more likely to have been sexually or physically assaulted, while men were more likely to have been incarcerated or abuse drugs/alcohol. 
Homeless women with children were more likely to be in foster care as a youth.Nearly a decade later, county-level data can be used to investigate the relationship between Zugazaga's reported stressful life events (incarceration, drug arrests, poverty, forcible sex, foster care)^3^ and homelessness rates.::: callout-note## Research QuestionDo particular life stressors increase a population's vulnerability to homelessness?:::## HypothesisHomelessness is not a new issue in the United States, yet homeless policy targets elimination via criminalization rather than prevention. A 2019 article from [Homeless Voice](https://homelessvoice.org/the-policies-and-laws-of-florida-cities/) provides a brief description of homeless policy in major citites across Florida. Despite state and federal governments being aware of the circumstances that increase vulnerability to homelessness for decades, I anticipate at least one Zugazaga's five stressors to remain significant in a model relating stressors to Florida homelessness counts 2018-2020.::: callout-note## Research Hypothesis**H~0~:** All stressors are insignificant in predicting homelessness counts **(** B~i~ = 0 for i=0,1,2,...n **)****H~A~:** At least one stressor **B~i~** is significant in predicting homelessness counts:::## Introduction to Data```{r}#| label: loading florida_1820#| include: FALSE# This data was cleaned and formatted to a tidy .csv in another .qmd file, the manipulations were messy and probably inefficient (brute force); can upload if neededflorida_og <- readr::read_csv('_data/florida_1820.csv', show_col_types =FALSE)%>%rename('Adult Psych Beds (Count)'='Adult Pysch Beds (Count)')```The data `florida_1820.csv` describes population, homelessness counts, poverty counts and several other demographic indicators^3^ at the county level from 2018 to 2020. All 67 Florida counties have observations for the 3 years giving us 201 observations of 15 variables. 
Each observation provides a count for all variables from a single county for a year. The variables closely mirror the stressors mentioned in Zugazaga's study, along with supplementary variables in an attempt to completely capture the circumstances.The data were collected from the [Florida Department of Health](https://www.flhealthcharts.gov/charts/default.aspx). Variable names^3^ were used as search indicators to produce counts for Florida counties. Unfortunately, we cannot accurately analyze the effect of COVID-19 as data is incomplete for the majority of counties in 2021.:::{.callout-note collapse="true"}## Intro to Data```{r}#| label: EDA#| output: TRUEhead(florida_og)summary(florida_og)# Changing Counts to Rates and Excluding Population #Surely there's a better way to do this!florida_og_rates <- florida_og %>%mutate('Homeless (Rate)'=`Homeless (Count)`/`Population`,'Poverty (Rate)'=`Poverty (Count)`/`Population`,'Drug Arrests (Rate)'=`Drug Arrests (Count)`/`Population`,'Sub Abuse Enrollment (Rate)'=`Sub Abuse Enrollment (Count)`/`Population`,'Adult Psych Beds (Rate)'=`Adult Psych Beds (Count)`/`Population`,'Forcible Sex (Rate)'=`Forcible Sex (Count)`/`Population`,'Foster Care (Rate)'=`Foster Care (Count)`/`Population`)%>%select(!contains(c('(Count)','Population')))florida_county <- florida_og %>%group_by(County)florida_county %>%summarize('Mean Population'=mean(Population), 'Mean Homeless'=mean(`Homeless (Count)`),'Avg Homeless Rate'=mean(`Homeless (Count)`)/mean(Population),'Avg Median Income'=mean(`Median Inc`), 'Mean Poverty'=mean(`Poverty (Count)`), 'Avg Poverty Rate'=mean(`Poverty (Count)`)/mean(Population),'Avg Incarceration Rate (per 1000)'=mean(`Incarceration (Rateper1000)`))%>%arrange( desc(`Mean Population`), desc(`Mean Homeless`), desc(`Avg Median Income`))%>%mutate(across(c(2:3, 5:6), round, 0))```:::Expanding **Intro to Data** exposes summary statistics including mean, range, quantiles, and standard deviation for all 15 variables. 
The table below the summaries provides arranged figures for basic parameters of interest grouped by county.

## Visualizations

`ggplot2` is used to visualize important relationships between homeless rates and Zugazaga's stressors. The Florida counties have been categorized into 4 `Regions` and 3 `Income Levels`:

:::callout-note
## `Region`

- **Northwest**: Escambia County to Madison County; cities include Pensacola, Panama City Beach, and Tallahassee
- **North**: Hamilton County to Marion County; cities include Jacksonville, Gainesville, Ocala, and St. Augustine
- **Central**: Lake County to Okeechobee County; cities include Orlando, Kissimmee, Tampa, and St. Petersburg
- **South**: Sarasota County to Miami-Dade County; cities include Ft. Lauderdale, Ft. Myers, Miami, Boca Raton, and West Palm Beach
:::

:::callout-note
## `Income Level`

- **High**: Median Income >= 60000
- **Medium**: 40000 <= Median Income < 60000
- **Low**: Median Income < 40000
:::

```{r}
#| label: categorize into regions
#| echo: false
# Categorize by region using membership in a county list per region
northwest <- c('Escambia', 'Santa Rosa', 'Okaloosa', 'Walton', 'Holmes', 'Washington',
               'Bay', 'Jackson', 'Calhoun', 'Gulf', 'Gadsden', 'Liberty', 'Leon',
               'Wakulla', 'Franklin', 'Jefferson', 'Madison', 'Taylor')
north <- c('Hamilton', 'Suwannee', 'Lafayette', 'Dixie', 'Gilchrist', 'Union', 'Baker',
           'Columbia', 'Nassau', 'Levy', 'Bradford', 'Alachua', 'Duval', 'Putnam',
           'Marion', 'Volusia', 'Flagler', 'Citrus', 'Clay', 'St. Johns')
central <- c('Lake', 'Sumter', 'Seminole', 'Orange', 'Hernando', 'Pasco', 'Brevard',
             'Indian River', 'Pinellas', 'Hillsborough', 'Polk', 'Osceola', 'Hardee',
             'Manatee', 'Okeechobee', 'Highlands')
south <- c('St. Lucie', 'Sarasota', 'Martin', 'Palm Beach', 'Collier', 'Broward', 'Lee',
           'DeSoto', 'Charlotte', 'Hendry', 'Monroe', 'Miami-Dade', 'Glades')

florida_og_plot <- florida_og_rates %>%
  mutate('Region' = case_when(
    County %in% northwest ~ 'Northwest',
    County %in% north ~ 'North',
    County %in% central ~ 'Central',
    County %in% south ~ 'South'
  ))

# Categorize by Median Income Level
florida_og_plot <- florida_og_plot %>%
  mutate('Income Level' = case_when(
    `Median Inc` >= 60000 ~ 'High',
    `Median Inc` < 60000 & `Median Inc` >= 40000 ~ 'Medium',
    `Median Inc` < 40000 ~ 'Low'
  ))
```

:::{.callout-note collapse="true"}
## a) Homeless Rate by Region

```{r}
#| label: plot1 - boxplot by region
#| echo: true
#| collapse: true
#| output: true
# Plot 1
florida_box <- florida_og_plot %>%
  mutate(`Homeless (Rate)` = `Homeless (Rate)` * 100) %>%
  mutate(`Region` = fct_relevel(`Region`, "Northwest", "North", "Central", "South")) %>%
  ggplot(aes(y = `Homeless (Rate)`, x = `Region`, fill = `Region`)) +
  geom_boxplot(alpha = 0.7) +
  # Scale of y axis
  scale_y_continuous(breaks = seq(0, 1.5, by = .25)) +
  # Dimensions of graph: flip the axes and set the plotted range in one call,
  # since a separate coord_cartesian() would be replaced by coord_flip()
  coord_flip(ylim = c(0, 1.5)) +
  scale_fill_brewer(palette = 'Set2') +
  theme_grey() +
  theme(legend.position = "none") +
  labs(title = "Florida Homeless Rates", subtitle = "2018-2020", x = " ",
       y = "Homeless Rate (%)", caption = "Visualized by Region")
florida_box
```

- A look at the distributions of homeless rates across Florida counties shows *where* the highest rates in the state occur.
- The state is generally uniform, with the bulk of each region's distribution sitting below 0.25% of county populations.
  - The largest difference is between the distributions of Northwest Florida and South Florida. 
This can be attributed to the stark contrast in how the populations of these regions are distributed.
  - South Florida is the most urbanized region in the state, with millions living in Miami, Ft. Lauderdale, and West Palm Beach; Northwest Florida is quite the opposite, with small coastal towns and rural inland towns reminiscent of southern Alabama or Georgia.
:::

:::{.callout-note collapse="true"}
## b) Homeless Rate by Income Level

```{r}
#| label: plot2 - homeless x income
#| echo: true
#| collapse: true
#| output: true
# Plot 2
florida_income <- florida_og_plot %>%
  mutate(`Homeless (Rate)` = `Homeless (Rate)` * 100) %>%
  mutate(`Income Level` = fct_relevel(`Income Level`, "Low", "Medium", "High")) %>%
  ggplot(aes(y = `Homeless (Rate)`, x = `Region`, fill = `Income Level`)) +
  geom_bar(position = 'dodge', stat = 'identity') +
  scale_fill_brewer(palette = 'Set2') +
  theme_grey() +
  labs(title = "Barplot: Income Level x Homeless Rate", subtitle = "2018-2020",
       x = "Region", y = "Homeless Rate (%)", caption = "Visualized by Income Level")
florida_income
```

- The barplot further details the differences mentioned in `Plot 1`. In regions where the population is less urbanized, Low Income counties show a higher homeless rate, as one would assume. 
  - In South Florida, wealth disparities in urbanized areas break this trend: counties with high income now report high homeless rates.
:::

:::{.callout-note collapse="true"}
## c) Homeless Rate x Incarceration Rate per 1000

```{r}
#| label: plot3 - scatter incarceration x homeless rate
#| echo: true
#| collapse: true
#| output: true
# Plot 3
florida_incarc <- florida_og_plot %>%
  # Express both rates per 100 residents (percent)
  mutate(`Homeless (Rate)` = `Homeless (Rate)` * 100,
         `Incarceration (Rateper1000)` = `Incarceration (Rateper1000)` / 10) %>%
  ggplot(aes(y = `Homeless (Rate)`, x = `Incarceration (Rateper1000)`,
             color = `Region`, label = `County`)) +
  geom_point(alpha = 0.7) +
  # Must include because of label aesthetic
  ggrepel::geom_text_repel(show.legend = FALSE, max.overlaps = 15, alpha = 0.7,
                           size = 2.5, nudge_x = -.05, nudge_y = .05) +
  # The points are mapped to color, so the palette is set on the color scale
  scale_color_brewer(palette = 'Set2') +
  theme_grey() +
  labs(title = "Scatterplot: Incarceration x Homeless Rate", subtitle = "2018-2020",
       x = "Incarceration Rate (per 100)", y = "Homeless Rate (%)",
       caption = "Visualized by Region")
florida_incarc
```

- One of Zugazaga's male stressors, `Incarceration Rate`, is plotted against Homeless Rate; a positive trend exists relating incarceration rates and homelessness rates across Florida counties. 
  - Many state correctional facilities are located in rural counties, explaining the large influence of the **Baker** and **Monroe** observations on the plot.
:::

:::{.callout-note collapse="true"}
## d) Homeless Rate x Drug Arrest Rate

```{r}
#| label: plot4 - drug arrest rate x homeless rate
#| echo: true
#| collapse: true
#| output: true
# Plot 4
florida_drug <- florida_og_plot %>%
  # Express both rates as percentages
  mutate(`Homeless (Rate)` = `Homeless (Rate)` * 100,
         `Drug Arrests (Rate)` = `Drug Arrests (Rate)` * 100) %>%
  ggplot(aes(y = `Homeless (Rate)`, x = `Drug Arrests (Rate)`,
             color = `Region`, label = `County`)) +
  geom_point(alpha = 0.7) +
  # Must include because of label aesthetic
  ggrepel::geom_text_repel(show.legend = FALSE, max.overlaps = 15, alpha = 0.7,
                           size = 2.5, nudge_x = -.05, nudge_y = .05) +
  # The points are mapped to color, so the palette is set on the color scale
  scale_color_brewer(palette = 'Set2') +
  theme_grey() +
  labs(title = "Scatterplot: Drug Arrest Rate x Homeless Rate", subtitle = "2018-2020",
       x = "Drug Arrest Rate (%)", y = "Homeless Rate (%)",
       caption = "Visualized by Region")
florida_drug
```

- A similar positive association is seen when comparing the homeless rate of a county with its `Drug Arrests (Rate)`.
  - The high influence of Northwest Florida is likely due to stricter drug policies held by police in less urbanized areas.
:::

## Regression, Diagnostics, and Model Selection

### on Assumption of Validity

Although more than 10 variables are used to predict `Homeless (Rate)` across Florida counties, there are still limitations when attempting to comment on the magnitude of an individual stressor. Stressors influence homelessness by driving those in severe situations *out* of their home or *away* from their place of origin. 
`Homeless (Rate)` is not an ideal measure of magnitude, because homeless people migrating to escape or avoid certain stressors would give counties with low stressor values a higher homeless population; this effect is left unexplained by the following models.

- The variable `Relocated (Rate)` is included as an attempt to control for new movement; however, it does not completely capture county-to-county migration.
- FL Charts has data that records [Population Who Lived in a Different County One Year Earlier](https://www.flhealthcharts.gov/ChartsDashboards/rdPage.aspx?rdReport=NonVitalIndRateOnly.TenYrsRpt&cid=9759); however, the data span 2009-2014, and using values recorded 4 years prior to our data is not desirable either.
- The most appropriate data to accurately capture county-to-county migration is [here](https://www.census.gov/data/tables/2019/demo/geographic-mobility/county-to-county-migration-2015-2019.html) via the US Census Bureau. The `-In, -Out, -Net...` spreadsheet provides totals for each county in the United States and movement to all other US counties; unfortunately, this data is too complex to wrangle into the simple data set `florida_1820.csv`.

### on Assumption of Linearity

```{r}
#| label: fit1 scatter
#| output: true
#| echo: true
# Fit 1: A Linear Regression Model With All Vars
# Checking linearity of variables not supported by our literature
# pairs() draws the scatterplot matrix directly and returns NULL, so it is not assigned
florida_og_rates %>%
  select(-c('County', 'Year', 'Poverty (Rate)', 'Severe Housing Problems (Rate)',
            'Incarceration (Rateper1000)', 'Sub Abuse Enrollment (Rate)',
            'Drug Arrests (Rate)', 'Adult Psych Beds (Rate)', 'Foster Care (Rate)',
            'Forcible Sex (Rate)')) %>%
  pairs()
```

The scatterplot matrix shows the variables that are related to homelessness but not mentioned in Zugazaga's study, or that needed further investigation, to check linearity with the response, `Homeless (Rate)`. Checking the bottom row, the associations are weak, but a linear approximation is appropriate. 
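
The visual check above can be complemented with a numeric one. The following is a minimal sketch on simulated data, not the report's data: the data frame `toy` and its columns `homeless_rate`, `unemployment_rate`, and `relocated_rate` are hypothetical stand-ins. It reports each predictor's correlation with the response and compares a straight-line fit with a quadratic fit; a large p-value for the comparison suggests the curvature term adds little.

```r
# Toy data: all names and values here are hypothetical, not the report's data
set.seed(1)
n <- 201
toy <- data.frame(
  unemployment_rate = runif(n, 2, 8),
  relocated_rate    = runif(n, 0.02, 0.12)
)
toy$homeless_rate <- 0.001 + 0.0002 * toy$unemployment_rate -
  0.005 * toy$relocated_rate + rnorm(n, sd = 0.0005)

# Pearson correlation of each predictor with the response
cors <- sapply(toy[c("unemployment_rate", "relocated_rate")],
               function(x) cor(x, toy$homeless_rate))

# Nested-model F test: straight line versus quadratic in one predictor
linear      <- lm(homeless_rate ~ unemployment_rate, data = toy)
quadratic   <- lm(homeless_rate ~ poly(unemployment_rate, 2), data = toy)
curvature_p <- anova(linear, quadratic)[["Pr(>F)"]][2]

print(round(cors, 2))
print(curvature_p)
```

Applied to the real data, the same two steps could be repeated for each variable in the scatterplot matrix; this is only one of several reasonable checks and does not replace the residual diagnostics.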
### Linear Regression Models

:::{.callout-note collapse="true"}
## `Fit 1`: All Variables (No Transformations)

```{r}
#| label: fit1 - all variables
#| echo: true
#| output: true
# Linear relationship appears appropriate for all, possibly attempt log transformation on UE Rate?
# Creating A Linear Model with all variables included: No Transformations
# County Removed as too many levels; improvement: NWFL, NFL, CFL, SWFL, SOFLO categories?
fit1 <- florida_og_rates %>%
  select(-'County') %>%
  lm(formula = `Homeless (Rate)` ~ .)
summary(fit1)

# Residual sum of squares, kept for model comparison
rss1 <- deviance(fit1)
print(c('RSS Fit 1', rss1))
```

- The first model predicts `Homeless (Rate)` using all variables, without any transformations or interactions. This causes 134 observations to be removed, as they are missing values for `Severe Housing Problems (Rate)`.
- Only 1 variable, `Severe Housing Problems (Rate)`, is deemed significant at `alpha = 0.05`; those without a star (see output) are not statistically significant predictors of `Homeless (Rate)` in this model.
- The effect of `Relocated (Rate)` is negative, indicating that migration can 'help reduce' homelessness by county, as anticipated in 'on Assumption of Validity' (above).
- The signs and magnitudes of the estimated (insignificant) coefficients seem plausible.
  - Increases in variables like `Drug Arrests (Rate)` or `Sub Abuse Enrollment (Rate)` increase the response `Homeless (Rate)` substantially. 
  - `Sub Abuse Enrollment (Rate)` can be interpreted here as an indication of how many people in the area are suffering from addiction/abuse problems, rather than a suggestion that substance abuse programs increase homelessness.
:::

:::{.callout-note collapse="true"}
## `Fit 1`: Diagnostics

```{r}
#| label: diagnostics fit 1
#| output: true
diag1 <- autoplot(fit1, 1:6, ncol = 3)
diag1
# Check 34 - Columbia, 130 - Monroe, 73 - Hardee, 4 - Baker
```

- `Fit 1` does a poor job of obeying the assumptions regarding residuals of linear regression.
- `Residuals vs Fitted` shows a negative trend as the fitted value grows, violating the linearity and independence assumption.
  - `Scale-Location` confirms this, as the standardized residuals increase in magnitude as the fitted value grows.
- `Q-Q Plot` shows a deviation from the diagonal, violating the assumption that residuals follow an approximately Normal distribution.
- Several points could be considered outliers because of their residual or their leverage, that is, how strongly they influence the fitted model.
  - **Monroe County** (130), **Hardee County** (73), and **Columbia County** (34) all have large positive residuals, indicating our model greatly under-estimated the number of homeless people in these counties.
  - **Baker County** (4) has worryingly high leverage; its explanatory values have great influence on the fit.
  - All of these outliers represent sparsely populated, rural counties, typically outside of more urbanized areas; hence large values for stressors will command great influence on the model.
:::

:::{.callout-note collapse="true"}
## `Fit 2`: All Variables, All Observations (Fill Severe Housing Rate)

```{r}
#| label: fit2 - all observations
#| echo: true
#| output: true
fit2 <- florida_og_rates %>%
  select(-c('County', 'Year')) %>%
  # mutate(`Unemployment Rate` = log(`Unemployment Rate`)) %>%
  # Carry the last recorded housing value down into the missing rows
  fill('Severe Housing Problems (Rate)', .direction = "down") %>%
  lm(formula = `Homeless (Rate)` ~ . 
     - `Median Inc`)
summary(fit2)

bic2 <- BIC(fit2)
print(c('BIC Fit 2:', bic2))
rss2 <- deviance(fit2)
print(c('RSS Fit 2', rss2))
```

- In `Fit 2`, values from `Severe Housing Problems (Rate)` were filled down to restore all observations for use in the model.
  - Example: Alachua County has the same `Severe Housing Problems (Rate)` for 2018-2020.
- Several key stressors were deemed significant, with `Adult Psych Beds (Rate)` having the largest magnitude.
  - Stressors from Zugazaga's study that were found significant by this model include `Drug Arrests (Rate)` and `Foster Care (Rate)`.
:::

:::{.callout-note collapse="true"}
## `Fit 2`: Diagnostics

```{r}
#| label: diagnostics2
#| output: true
diag2 <- autoplot(fit2, 1:6, ncol = 3)
diag2
```

- Using all observations in `Fit 2` has not improved the diagnostic plots, but including 2019 and 2020 values has revealed new outliers.
- 2020 produced **Liberty County** (117) as an outlier, a small, rural county in the Panhandle of Florida.
  - In late 2018, Hurricane Michael devastated the area; the influence of this observation is likely a direct result of measurements being altered greatly or even unaccounted for during 2019 because the population was "in a transitional state".
  - Liberty's records in 2020 will show a vast difference from the incomplete measures of 2019.
:::

:::{.callout-note collapse="true"}
## `Fit 3`: Random Effects Model - Controlling for County over Time

```{r}
#| label: fit3 - panel data
#| echo: true
#| output: true
# transform to panel data
florida_panel <- pdata.frame(florida_og_rates, index = c('County', 'Year'))

fit3 <- florida_panel %>%
  plm(formula = Homeless..Rate. ~ Unemployment.Rate + Incarceration..Rateper1000. +
        Relocated..Rate. + Poverty..Rate. + Drug.Arrests..Rate. +
        Sub.Abuse.Enrollment..Rate. + Adult.Psych.Beds..Rate. + Forcible.Sex..Rate. 
        + Foster.Care..Rate., model = 'random')
summary(fit3)

#bic3 <- BIC(fit3)
#print(c('BIC Fit 3:', bic3))
rss3 <- deviance(fit3)
print(c('RSS Fit 3', rss3))
```

- Fitting a random effects model allows us to control for unmeasurable differences between counties.
  - Each county receives its own intercept, drawn from a common distribution of possible intercepts.
- Only the variables `Drug Arrests (Rate)` and `Adult Psych Beds (Rate)` retained their significance.
  - `Drug Arrests (Rate)` saw a slight decrease in magnitude, whereas `Adult Psych Beds (Rate)` increased.
  - Both are positive, indicating that increases in either of these rates are associated with an increase in `Homeless (Rate)`.
:::

:::{.callout-note collapse="true"}
## `Fit 4`: Random Effects Model - Zugazaga's variables

```{r}
#| label: fit4 - panel data - Zugazaga's variables
#| echo: true
#| output: true
# Substance abuse enrollment enters with a one-year lag
fit4 <- plm(formula = Homeless..Rate. ~ Drug.Arrests..Rate. +
              lag(Sub.Abuse.Enrollment..Rate., 1) + Forcible.Sex..Rate. +
              Foster.Care..Rate.,
            data = florida_panel, model = 'random')
summary(fit4)

#bic4 <- BIC(fit4)
#print(c('BIC Fit 4:', bic4))
rss4 <- deviance(fit4)
print(c('RSS Fit 4', rss4))
```

- Again using a random effects model, now with only the stressors mentioned in Zugazaga's study, we see `Drug Arrests (Rate)` as a significant predictor of homelessness at the `alpha = 0.05` level.
- `Forcible Sex` may not capture the circumstances suggested in Zugazaga's study, as an arrest rate for forcible sex crimes captures the perpetrators within a county, rather than the victims. 
  - Many sex crimes go unreported or unpunished, due to familial or power relationships between the victim and offender.
- `Foster Care` and `Sub Abuse Enrollment (Rate)` have signs and values that correspond well with Zugazaga's results.
  - Lagging `Sub Abuse Enrollment (Rate)` suggests that an increase in citizens involved in substance abuse programs is associated with a decrease in homelessness in the following year.
:::

### Model Selection

:::{.callout-note collapse="true"}
## Stargazer Plot

```{r}
#| label: stargazer plot
#| collapse: true
#| output: true
#| echo: false
models <- list(fit1, fit2, fit3, fit4)
stargazer(models, type = "text", title = "Homelessness in Florida",
          dep.var.labels = "Homeless Rate")
```
:::

- Comparing the residual sum of squares and R^2^ of `Fit 2`, `Fit 3`, and `Fit 4`, I would select **Fit 3** for inference. Although `Fit 4` (with lag) had a lower residual sum of squares, I appreciate the completeness of `Fit 3` and believe the extra variables provide a better picture of how stressors impact the homeless population in Florida.

Treating the data as a panel produces more precise coefficient estimates: rather than 201 independent observations, the model considers 67 counties observed over 3 years. This results in smaller standard errors. The R^2^ is only considered in passing, as the goal of the study is inference, not prediction.

**Research Question:**

- All of Zugazaga's effects had plausible signs demonstrating their influence on homelessness in `Fit 3`, but only `Drug Arrests (Rate)` was significant at the `0.05` level as hypothesized. This significance is a comment on the mathematical properties of the model rather than on the real-life effect of the stressors, all of which are influential situations that can contribute to homelessness.
- The positive slope of `Drug Arrests (Rate)` indicates that as the rate of arrests for drug abuse/possession in a county increases, so does the homeless rate in the county. 
  - This is a comment on the availability of drugs in Florida counties, and how insufficient addiction treatment can contribute to other socioeconomic issues in a community.
  - Criminalization does not solve the problem; it relocates it. It is likely that many returning citizens will be caught in a cycle of drug abuse, incarceration, and homelessness.

## Conclusions

:::callout-note
## Was the Research Question Answered?

- As hypothesized, the models found several stressors to be significant in predicting homeless rates across Florida.
  - The significance of `Drug Arrests (Rate)` in the 3 models using all of the observations allows us to reject **H~0~:** No stressors are significant in predicting `Homeless (Rate)`.
- Unfortunately, the study is unable to make a substantial comment on *which* stressors most increased vulnerability to homelessness, that is, on their relative magnitude. To do this, deeper demographic variables would need to be included, as well as improvements in controlling for stressors as a *push* factor in homeless migration.
- Because of limitations in the data and the broad scope of the research question, the study isn't able to make any new comments on the status of homelessness in Florida; it instead confirms the relevance of the stressors to homeless life a decade later.
:::

::: callout-note
## Prediction vs Inference

- The goal of this brief study was to make inferences regarding stressors' impact on `Homeless (Rate)` in Florida. 
  - If prediction were our focus, I would use new 2021 data from FL Charts without the `Homeless (Rate)` column to test the efficacy of `Fit 2` as a predictive tool.
:::

## Improvements

While the data is a quick illustration of homelessness in Florida by county, there are improvements that could be made to both the data collection and the research question itself to further the study.

:::callout-note
## Data

- Unfortunately, [FL Health Charts](https://www.flhealthcharts.gov/charts/default.aspx) did not provide a demographic breakdown of the homeless population (Age, Sex, Race), which would drastically widen the scope of the analysis, leading to far more interesting conclusions.
- There is only data for a three-year period; this is too small a range to make a strong statement about the impact of homeless policy on Florida counties or how the relevance of certain stressors has *changed* over time. For a more in-depth study I would begin with a 10-year range.
:::

:::callout-note
## Research Question

- Break down the stressors' impact demographically (Age, Sex, Race)
- Focus on certain stressors and their accompanying variables to view their impact on a population, instead of a broad range of stressors
- Include life stressors not associated with homelessness for comparison
- Extend the question to the entire country, providing a breakdown by state
- Compare to foreign countries to contrast governments' approaches to homelessness and the leading causes of homelessness around the world
:::

## Codebook

- **County** - Florida county (67 total), divided into Northwest Florida, North Florida, Central Florida, and South Florida for visualizations
- **Population** - Yearly population count for county, used as the denominator of all `rate` variables unless specified
- **Year** - Years 2018, 2019, 2020 included in this study
- **Homeless (Rate)** - Yearly homeless count of a county divided by county population
- **Unemployment Rate** - The ratio of unemployed to the civilian labor force, 
expressed as a percent
- **Median Inc** - Median household income is the amount which divides the income distribution into two equal groups
- **Incarceration Rate per 1000** - Number of incarcerated people per 1000 (within county)
- **Poverty (Rate)** - Number of people living below the poverty line divided by population
- **Drug Arrests (Rate)** - Arrests attributed to possession or sale of illegal drugs divided by population
- **Relocated (Rate)** - The number of people over age 1 who lived in a different county the previous year
- **Sub Abuse Enrollment (Rate)** - The number of beds indicates the number of adults (age 18 and over) who may receive substance abuse treatment on an in-patient basis
- **Adult Psych Beds (Rate)** - When adults in psychiatric distress are uninsured, charged with crimes, or meet state criteria for civil commitment because they are violent/dangerous to themselves or others, psychiatric beds are where they are admitted for treatment. The number of beds indicates the number of people who may potentially receive adult (age 18 and over) psychiatric care on an in-patient basis. Divided by population
- **Severe Housing Problems (Rate)** - The percentage of households with at least one of the following housing problems: lack of kitchen facilities; lack of plumbing facilities; more than 1.5 persons per room; severe cost burden (monthly housing costs including utilities exceed 50% of monthly income)
- **Forcible Sex (Rate)** - Any sexual act or attempt involving force is classified as a forcible sex offense regardless of the age of the victim or the relationship of the victim to the offender, divided by population
- **Foster Care (Rate)** - Foster care provides a safe and stable environment for children when they cannot be with their parents for some reason, divided by population

## References

Zugazaga, Carole, "Pathways to homelessness and social support among homeless single men, single women, and women with children" (2002). 
Retrospective Theses and Dissertations. 1744.

:::

##### Footnotes

~1.) [Homeless Definition](https://www.law.cornell.edu/uscode/text/42/11302)~

~2.) [US Interagency Council on Homelessness](https://www.usich.gov/tools-for-action/2020-point-in-time-count/)~

~3.) Explanation of variables and collection method in Codebook tab~