Fort Worth Climate Change

Fort Worth climate change based on precipitation, humidity, and temperature from 1981-2022

Ethan Campbell
2022-05-09

Introduction

Shifts in extreme weather patterns have become more prevalent in recent years, igniting discussion of climate change. Recent events such as the polar vortex in Texas have forced people to reconsider the effects of climate change from a new perspective. Having experienced such an event, I began to question the environmental change in my own area: Dallas-Fort Worth. Climate change is defined as “a long-term change in the average weather patterns that have come to define Earth’s local, regional and global climates” (NASA). This study is based on environmental variables and their variation rather than on carbon emissions or human activity. It analyzes temperature, humidity, and precipitation to determine whether any significant shifts have occurred. The question it aims to answer is: has the Texas climate been altered from its original course? I hypothesize that the climate has shifted in at least some respects, such as precipitation. To answer the question, each variable was analyzed separately and its values were compared over time.

Data

Data were gathered from the NASA Prediction of Worldwide Energy Resources (POWER) website for a location centered on Fort Worth, Texas. The downloaded file contained wind speed at two meters, temperature at two meters in Celsius, specific and relative humidity at two meters, precipitation in millimeters per day, and surface pressure. Because of the short time frame of the project, some variables were excluded, leaving temperature, humidity, and precipitation. The selection was driven by familiarity with the variables, accessibility of the information, and degree of influence on climate. The data range from 1981 to 2022. This span is appropriate for a small-scale study and shows the climate during my lifetime; however, a longer record would allow a larger-scale analysis.

library(leaflet)

# Interactive map with a marker on Fort Worth
leaflet() %>%
  addTiles() %>%
  addMarkers(lng=-97.3225, lat=32.756, popup="Fort Worth")

The first data set, Fort_Worth_2022, had 14,976 observations of 8 variables, and the second, Fort_Worth, had 400 observations of 15 variables. Rows represent points in time and columns represent variables. Two data sets were needed because of a high level of error in the precipitation values of the first: the amount of precipitation was off by roughly 35 inches. To ensure a fair study, more accurate information was imported and brought into the analysis. Cleaning the data had three parts. The first was pivoting the data to expose the variables of interest, which made the analysis smoother and improved the visualizations. The second was altering the variables, mainly by renaming and mutating. Renaming replaced the acronyms used in the data sheet with names that can be understood at a glance. Mutating began the work of deriving the observations of interest. Lubridate was used to build a single date column from the year, month, and day columns to simplify the workflow.

Creating annual columns was vital because the FW_Updated table lacked annual values, and because a year-by-year analysis was needed to gauge annual changes. The third part was merging. After all changes were applied, FW_Updated was merged with the annual columns, while the Fort_Worth data set was left as it was after pivoting. Beyond these shared steps, each variable of interest required its own tidying approach.

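# Packages inferred from the functions used in this post
library(plyr)        # ddply(); loaded before dplyr so dplyr verbs are not masked
library(tidyverse)   # dplyr, tidyr, ggplot2
library(lubridate)   # ymd()
library(knitr)       # kable()
library(kableExtra)  # kable_styling(), row_spec()
library(ggridges)    # geom_density_ridges_gradient()
library(viridis)     # scale_fill_viridis()
library(ggthemes)    # theme_fivethirtyeight()

Fort_Worth_2022 <- read.csv("Fort_Worth_climate_with_day.csv", skip = 14)
Fort_Worth <- read.csv("Fort_Worth_climate.csv", skip = 18)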

dim(Fort_Worth_2022)
[1] 14976     8
dim(Fort_Worth)
[1] 400  15
colnames(Fort_Worth_2022)
[1] "YEAR"        "MO"          "Day"         "T2M"        
[5] "RH2M"        "PRECTOTCORR" "WS2M"        "PS"         
colnames(Fort_Worth)
 [1] "PARAMETER" "YEAR"      "JAN"       "FEB"       "MAR"      
 [6] "APR"       "MAY"       "JUN"       "JUL"       "AUG"      
[11] "SEP"       "OCT"       "NOV"       "DEC"       "ANN"      
Month_combined <- Fort_Worth %>%
pivot_longer(
  cols = c(NOV, JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, DEC),
  names_to = "MONTH",
  values_to = "Month_AVG",
)

Para_split <- Month_combined %>%
  pivot_wider(names_from = PARAMETER,
              values_from = Month_AVG,
              )


FW_Updated <- Fort_Worth_2022 %>%
  dplyr::rename(Temperature = T2M) %>%
  dplyr::rename(Humidity = RH2M) %>%
  dplyr::rename(Wind_Speed = WS2M) %>%
  dplyr::rename(Surface_Pressure = PS)


FW_Updated$Date <- with(FW_Updated, ymd(sprintf('%04d%02d%02d', YEAR, MO, Day)))


YEAR <- format(as.Date(FW_Updated$Date), format = "%Y")
 
Means_variables <- ddply(FW_Updated, .(YEAR), summarise,
      Annual_Temperature = mean(Temperature),
      Annual_Humidity = mean(Humidity),
      Annual_Precipitation = mean(PRECTOTCORR))

Final_FW <- merge(FW_Updated, Means_variables, by = "YEAR")


Final_FW <- transform(Final_FW, MonthAbb = month.abb[MO])

rmarkdown::paged_table(Final_FW)
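
The annual-mean step above relies on plyr::ddply. A dplyr-only sketch of the same idea is shown here on invented daily values; the data frame `daily` and all of its numbers are hypothetical and exist only for illustration. It also illustrates one thing worth checking, on the assumption that PRECTOTCORR is a daily amount in mm/day as the Data section states: averaging daily values gives a mean daily rate, whereas an annual total requires a sum.

```r
library(dplyr)

# Hypothetical daily records (values invented for illustration only)
daily <- data.frame(
  YEAR        = c(1981, 1981, 1981, 1982, 1982, 1982),
  Temperature = c(8.0, 6.0, 7.0, 10.0, 12.0, 11.0),  # degrees Celsius
  PRECTOTCORR = c(0.0, 5.0, 1.0, 2.0, 0.0, 4.0)      # assumed mm/day
)

# One row per year: a mean for temperature, and both a mean and a sum for precipitation
annual <- daily %>%
  group_by(YEAR) %>%
  summarise(
    Annual_Temperature   = mean(Temperature),
    Mean_Daily_Precip_mm = mean(PRECTOTCORR),        # a daily rate, not a yearly total
    Total_Precip_in      = sum(PRECTOTCORR) / 25.4,  # yearly total converted to inches
    .groups = "drop"
  )

# Attach the annual values to every daily row, as merge() does above
daily_with_annual <- left_join(daily, annual, by = "YEAR")
```

Whether a mean or a sum is appropriate depends on the variable: a mean suits temperature and humidity, while precipitation is usually reported as a total.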

Temperature cleaning

Each variable was analyzed on its own; because the information was spread across many columns, the select function was used to keep only the relevant ones. For temperature, values were converted from Celsius to Fahrenheit, the scale more familiar to local readers, which also makes the visualizations easier to take in. After this mutation, the arrange function was used to show when temperature was highest and lowest. Note that these are mean values over a 24-hour period, not the single highest or lowest temperature reached at any moment. The highest daily mean was 100.202 degrees in August of 2011, and the lowest was 7.016 degrees in December of 1989. Descriptive statistics of central tendency gave a mean of 64.8 degrees, a standard deviation of 17.4, and a median of 66.3. The standard deviation stands out as large; however, given the range of temperatures within a year, it is within expectation.

Temperature_Final <- Final_FW %>%
  mutate(Temperature = Temperature * 9/5 + 32) %>%
  mutate(Annual_Temp = Annual_Temperature * 9/5 + 32) %>%
  select(Date, YEAR, MO, Day, MonthAbb, Temperature, Annual_Temp) %>%
  na.omit(Temperature)


# When was Temperature the highest (Aug of 2011)

TF_highest <- Temperature_Final %>%
  select(Date,Temperature, Annual_Temp) %>%
  arrange(desc(Temperature)) %>%
  slice(1:12)

kable(TF_highest, digits = 4, align = "ccccccc", col.names = c("Date", "Temperature", "Annual Temperature"), caption = "Highest Temperature Since 1981") %>%
  kable_styling(font_size = 16) %>%
  row_spec(c(1,1,1), background = "red")
Table 1: Highest Temperature Since 1981
Date Temperature Annual Temperature
2011-08-03 100.202 67.5257
2011-08-02 99.842 67.5257
2011-08-04 99.536 67.5257
2018-07-22 98.384 65.0461
1996-07-07 97.826 64.4697
2000-09-04 97.772 65.7012
2011-08-01 97.700 67.5257
2018-07-21 97.592 65.0461
2018-07-19 97.520 65.0461
2011-08-05 97.376 67.5257
1995-07-28 97.286 64.1117
1999-08-10 97.268 66.8394
# When was Temperature the lowest (Dec of 1989)

TF_lowest <- Temperature_Final %>%
  select(Date, Temperature, Annual_Temp) %>%
  arrange(Temperature) %>%
  na.omit(Temperature) %>%
  slice(1:12)

kable(TF_lowest, digits = 4, align = "ccccccc", col.names = c("Date", "Temperature", "Annual Temperature"), caption = "Lowest Temperature Since 1981") %>%
  kable_styling(font_size = 16) %>%
  row_spec(c(1,1,1), background = "cadetblue")
Table 1: Lowest Temperature Since 1981
Date Temperature Annual Temperature
1989-12-22 7.016 62.2277
1983-12-24 9.914 62.7198
1989-12-23 11.174 62.2277
2021-02-15 11.714 65.4711
1983-12-25 11.822 62.7198
1983-12-22 13.406 62.7198
2021-02-16 13.766 65.4711
1982-01-11 15.224 63.4894
1985-01-20 16.304 64.0079
1983-12-23 16.412 62.7198
1990-12-23 16.538 65.0338
2021-02-14 16.970 65.4711
TF_stats <- Temperature_Final %>%
  select(Date, Temperature, Annual_Temp) %>%
  mutate(Mean = mean(Temperature)) %>%
  mutate(Standard_Deviation = sd(Temperature)) %>%
  mutate(Median = median(Temperature)) %>%
  slice(1:12)

kable(TF_stats, digits = 4, align = "ccccccc", col.names = c("Date", "Temperature", "Annual Temperature", "Mean", "Standard Deviation", "Median"), caption = "Statistical Temperature Data") %>%
  kable_styling(font_size = 16) %>%
  row_spec(c(1,1,1,1,1,1))
Table 1: Statistical Temperature Data
Date Temperature Annual Temperature Mean Standard Deviation Median
1981-01-01 46.868 64.4563 64.8343 17.3772 66.281
1981-01-02 41.594 64.4563 64.8343 17.3772 66.281
1981-01-03 44.114 64.4563 64.8343 17.3772 66.281
1981-01-04 34.610 64.4563 64.8343 17.3772 66.281
1981-01-05 36.806 64.4563 64.8343 17.3772 66.281
1981-01-06 44.924 64.4563 64.8343 17.3772 66.281
1981-01-07 39.542 64.4563 64.8343 17.3772 66.281
1981-01-08 43.718 64.4563 64.8343 17.3772 66.281
1981-01-09 46.346 64.4563 64.8343 17.3772 66.281
1981-01-10 44.294 64.4563 64.8343 17.3772 66.281
1981-01-11 39.866 64.4563 64.8343 17.3772 66.281
1981-01-12 31.838 64.4563 64.8343 17.3772 66.281
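
The statistics table repeats the same mean, standard deviation, and median on every row because they are added with mutate. A minimal base-R sketch of a one-row summary is given below. The Celsius values are back-converted from the twelve Fahrenheit readings in the table above, so the resulting statistics describe only those twelve days, not the full record; the object names are illustrative.

```r
# Daily mean temperatures for 1981-01-01 to 1981-01-12 in Celsius,
# back-converted from the Fahrenheit values shown in the table above
celsius <- c(8.26, 5.33, 6.73, 1.45, 2.67, 7.18,
             4.19, 6.51, 7.97, 6.83, 4.37, -0.09)

# Celsius to Fahrenheit: F = C * 9/5 + 32
fahrenheit <- celsius * 9 / 5 + 32

# One row of summary statistics instead of repeating them on every row
temperature_summary <- data.frame(
  Mean               = mean(fahrenheit),
  Standard_Deviation = sd(fahrenheit),
  Median             = median(fahrenheit)
)

temperature_summary
```

The same pattern applied to the full Temperature column would give a single row holding the overall mean, standard deviation, and median.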

Temperature percent change

After computing the three measures of central tendency, the percent change per year was analyzed. A sequence of functions computes the change from the previous value and converts it into a percentage. The same approach was then used to compute the difference between 1981 and every other year, again expressed as a percentage. This lets the reader see how much each year's temperature varied from the first year (1981). The lag and lead functions make this possible by giving access to the previous or next value in a column. For the comparison between consecutive values, the calculation ends at the Percent_temp column; for the comparison against 1981, the TChange_from_year_one column is also needed. A slice function keeps every third year to reduce the number of entries and keep the plots readable, so each step in the comparison spans three years.

TF <- Temperature_Final %>%
  distinct(YEAR, Annual_Temp)

TF <- TF %>%
  slice(which(row_number() %% 3 == 1))

YearOneprepTemp <- TF[1,c("Annual_Temp")]

Temp_change <- TF %>%
  dplyr::mutate(Previous = lag(Annual_Temp),
                Next_temp = lead(Annual_Temp),
                change_temp = Annual_Temp - Previous,
                Percent_temp = (change_temp/Previous)* 100,
                Percent_change_temp = (change_temp/lag(Annual_Temp) -1) * 100,
                TChange_from_year_one = (Annual_Temp/YearOneprepTemp - 1) * 100)


Temp_change <- Temp_change %>%
  select(YEAR, Annual_Temp, Percent_temp, TChange_from_year_one)
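
The two percent-change calculations used above can be sketched as small base-R helpers. The function names `pct_change_prev` and `pct_change_first` are hypothetical; the annual temperatures are a handful of values copied from the tables earlier in this section and are not evenly spaced, so the output is only an illustration of the arithmetic.

```r
# Percent change relative to the previous element (first element has no predecessor)
pct_change_prev <- function(x) {
  previous <- c(NA, head(x, -1))
  (x - previous) / previous * 100
}

# Percent change relative to the first element
pct_change_first <- function(x) {
  (x / x[1] - 1) * 100
}

# Annual mean temperatures (Fahrenheit) for a few years, taken from the tables above
annual_temp <- c(`1981` = 64.4563, `1990` = 65.0338, `1996` = 64.4697,
                 `1999` = 66.8394, `2011` = 67.5257)

round(pct_change_prev(annual_temp), 2)
round(pct_change_first(annual_temp), 2)
```

Writing the calculation as a function of a plain numeric vector keeps it independent of any particular data frame, so the same helper can be reused for temperature, precipitation, and humidity.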

Precipitation

The second subsection covers precipitation, which uses the Fort_Worth dataset. The tidying was similar; however, instead of converting from Celsius, the values were converted from millimeters to inches, which are more familiar to the intended reader. Next, the highest and lowest monthly precipitation amounts were identified. The highest precipitation in one month was 15.6315 inches in October of 1981, and the lowest was 0.0098 inches in January of 1986. The 1980s therefore showed a wide range of precipitation, as both the maximum and the minimum occurred during that decade. Monthly precipitation had a mean of 2.92 inches, a standard deviation of 2.18, and a median of 2.49 inches. The standard deviation is much smaller than that of temperature, which can be attributed in part to the data being aggregated by month and in part to the different units.

Precipitation_Final <- Para_split %>%
  mutate(Precipitation_annual = ANN / 25.4) %>%
  mutate(Precipitation_Monthly = PRECTOTCORR_SUM / 25.4) %>%
  select(YEAR, MONTH, Precipitation_Monthly, Precipitation_annual) %>%
  na.omit(Precipitation_Monthly)


# When was precipitation the highest (OCT of 1981)

PF_highest <- Precipitation_Final %>%
  select(YEAR, MONTH, Precipitation_Monthly, Precipitation_annual) %>%
  arrange(desc(Precipitation_Monthly)) %>%
  na.omit(Precipitation_Monthly) %>%
  slice(1:12)

kable(PF_highest, digits = 4, align = "ccccccc", col.names = c("Year", "Month","Precipitation", "Annual Precipitation"), caption = "Highest Monthly Precipitation Since 1981") %>%
  kable_styling(font_size = 16) %>%
  row_spec(c(1,1,1))
Table 2: Highest Monthly Precipitation Since 1981
Year Month Precipitation Annual Precipitation
1981 OCT 15.6315 42.6059
2015 MAY 15.2409 57.9571
1982 MAY 11.2780 38.4768
1989 MAY 10.6122 43.0118
2018 OCT 10.4291 40.0685
2004 JUN 10.2287 45.3587
2007 JUN 10.1677 44.7728
1989 JUN 9.6437 43.0118
1990 APR 9.3563 46.7020
1991 DEC 8.7520 47.5465
1991 OCT 8.7441 47.5465
2009 OCT 8.6858 38.4370
# When was it the lowest (Jan of 1986)

PF_lowest <- Precipitation_Final %>%
  select(YEAR, MONTH, Precipitation_Monthly, Precipitation_annual) %>%
  arrange(Precipitation_Monthly) %>%
  na.omit(Precipitation_Monthly) %>%
  slice(1:12)

kable(PF_lowest, digits = 4, align = "ccccccc", col.names = c("Year", "Month","Precipitation", "Annual Precipitation"), caption = "Lowest Monthly Precipitation Since 1981") %>%
  kable_styling(font_size = 16) %>%
  row_spec(c(1,1,1))
Table 2: Lowest Monthly Precipitation Since 1981
Year Month Precipitation Annual Precipitation
1986 JAN 0.0098 39.0244
2011 JUL 0.0130 22.5280
2000 AUG 0.0248 32.0547
2011 MAR 0.0787 22.5280
1993 JUL 0.0823 36.9169
2012 NOV 0.1244 28.7287
2018 JAN 0.1705 40.0685
2014 JAN 0.1846 23.9413
2005 NOV 0.1925 18.2807
1981 DEC 0.2059 42.6059
2005 DEC 0.2071 18.2807
1996 FEB 0.2087 33.8114
PF_stats <- Precipitation_Final %>%
  select(YEAR, MONTH, Precipitation_Monthly, Precipitation_annual) %>%
  mutate(Mean = mean(Precipitation_Monthly)) %>%
  mutate(Standard_Deviation = sd(Precipitation_Monthly)) %>%
  mutate(Median = median(Precipitation_Monthly)) %>%
  slice(1:12)

kable(PF_stats, digits = 4, align = "ccccccc", col.names = c("Year", "Month", "Monthly Precipitation", "Annual Precipitation", "Mean", "Standard Deviation", "Median"), caption = "Statistical Precipitation Data") %>%
  kable_styling(font_size = 16) %>%
  row_spec(c(1,1,1,1,1,1,1))
Table 2: Statistical Precipitation Data
Year Month Monthly Precipitation Annual Precipitation Mean Standard Deviation Median
1981 NOV 1.5307 42.6059 2.9198 2.182 2.4884
1981 JAN 0.3917 42.6059 2.9198 2.182 2.4884
1981 FEB 1.8244 42.6059 2.9198 2.182 2.4884
1981 MAR 3.3346 42.6059 2.9198 2.182 2.4884
1981 APR 3.0370 42.6059 2.9198 2.182 2.4884
1981 MAY 6.0799 42.6059 2.9198 2.182 2.4884
1981 JUN 4.0516 42.6059 2.9198 2.182 2.4884
1981 JUL 1.4240 42.6059 2.9198 2.182 2.4884
1981 AUG 2.1161 42.6059 2.9198 2.182 2.4884
1981 SEP 2.9783 42.6059 2.9198 2.182 2.4884
1981 OCT 15.6315 42.6059 2.9198 2.182 2.4884
1981 DEC 0.2059 42.6059 2.9198 2.182 2.4884
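
The 1981 rows in the table above can be used to check how the monthly and annual columns relate. The short sketch below copies the twelve monthly values and the annual value from the table (the object names are illustrative) and compares the sum of the months with the annual figure; it also reverses the millimeter-to-inch conversion.

```r
# 1981 monthly precipitation in inches, copied from the table above
monthly_1981_in <- c(NOV = 1.5307, JAN = 0.3917, FEB = 1.8244, MAR = 3.3346,
                     APR = 3.0370, MAY = 6.0799, JUN = 4.0516, JUL = 1.4240,
                     AUG = 2.1161, SEP = 2.9783, OCT = 15.6315, DEC = 0.2059)

# 1981 annual precipitation in inches, from the same table
annual_1981_in <- 42.6059

# If the monthly values are totals, they should add up to the annual value
sum(monthly_1981_in)
abs(sum(monthly_1981_in) - annual_1981_in) < 0.001

# Inches back to millimeters (1 inch = 25.4 mm)
annual_1981_in * 25.4
```

The twelve months add up to the annual value to within rounding, which suggests that the monthly column holds monthly totals rather than averages of daily values.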

Precipitation percent change

Using the percent change function, the annual variation in precipitation was analyzed over time. A problem occurred at the end of the function, where the change from year one is computed, and the reason for the error is still unknown. A workaround was used so that the analysis could continue. The problem and the workaround are discussed further in the Reflection section.

PF <- Precipitation_Final %>%
  select(YEAR, Precipitation_annual)

PF <- PF %>%
  distinct(YEAR, Precipitation_annual)

PF <- PF %>%
  slice(which(row_number() %% 3 == 1))

Prep_Year_One_prep <- PF[1,c("Precipitation_annual")]

Pct_change <- PF %>%
  dplyr::mutate(Previous = lag(Precipitation_annual),
                Next = lead(Precipitation_annual),
                change = Precipitation_annual - Previous,
                Percent = (change/Previous)* 100,
                Percent_change = (change/lag(Precipitation_annual) -1) * 100)

options(scipen = 999)

Pct_change$Change_from_year_one <- (PF$Precipitation_annual/42.60591 - 1) * 100

Pct_change <- Pct_change %>%
  select(YEAR, Precipitation_annual, Percent, Change_from_year_one)
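
One possible explanation for the data-type problem in this step, offered as an assumption rather than a confirmed diagnosis: if PF is a tibble (pivot_longer() and pivot_wider() normally return tibbles), then single-bracket indexing such as `PF[1, c("Precipitation_annual")]` returns a one-cell tibble instead of a number, and dividing a column by it no longer yields a plain numeric vector. The sketch below reproduces that behavior on a small tibble named `PF_demo` (a hypothetical name) holding three annual values copied from the tables above.

```r
library(tibble)

# Annual precipitation (inches) for three years, copied from the tables above
PF_demo <- tibble(
  YEAR = c(1981, 1982, 1990),
  Precipitation_annual = c(42.6059, 38.4768, 46.7020)
)

# Single-bracket indexing on a tibble keeps the tibble structure, even for one cell
year_one_tbl <- PF_demo[1, "Precipitation_annual"]
class(year_one_tbl)

# Extracting the column first and then indexing gives a plain number
year_one_num <- PF_demo$Precipitation_annual[1]
class(year_one_num)

# With a numeric baseline the percent change is an ordinary numeric vector
Change_from_year_one <- (PF_demo$Precipitation_annual / year_one_num - 1) * 100
round(Change_from_year_one, 2)
```

A base data.frame drops to a vector in the same single-bracket call, which would be consistent with the temperature and humidity versions working, since those tables come from merge().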

Humidity

Humidity was the simplest of the three variables because it was already in an easily digestible format: it arrived as a percentage, and each value had its own cell. Once more, the highest and lowest values were computed, this time returning to the daily (24-hour) scale. The highest humidity was 97.44% in December of 2017. This was unexpected, as I had assumed humidity would be lower during winter months due to the lack of moisture in the air. The lowest humidity recorded for one day was 20% in October of 2015. The mean was 66.8%, with a standard deviation of 15 and a median of 67.88%. Once more, the standard deviation is large. This is notable because the correlation between annual humidity and annual precipitation was roughly 0.77, yet precipitation showed a much smaller standard deviation, potentially because precipitation was aggregated by month.

Humidity_Final <- Final_FW %>%
  select(Date, YEAR, MO, Day, MonthAbb, Humidity, Annual_Humidity) %>%
  na.omit(Humidity)

# When was humidity the highest (Dec of 2017)

HF_highest <- Humidity_Final %>%
  select(Date, Humidity, Annual_Humidity) %>%
  arrange(desc(Humidity)) %>%
  slice(1:12)

kable(HF_highest, digits = 4, align = "ccccccc", col.names = c("Date", "Humidity", "Annual Humidity"), caption = "Highest Humidity Percent In One Day Since 1981") %>%
  kable_styling(font_size = 16) %>%
  row_spec(c(1,1,1))
Table 3: Highest Humidity Percent In One Day Since 1981
Date Humidity Annual Humidity
2017-12-19 97.44 66.6697
1991-02-04 97.00 71.1455
2010-12-28 96.94 66.8041
2014-01-09 96.94 63.9407
2018-02-22 96.88 68.6156
1990-12-28 96.81 67.8432
1993-03-01 96.56 69.5547
1991-01-09 96.50 71.1455
1992-02-11 96.50 72.7452
1985-11-26 96.44 68.1446
2018-02-23 96.44 68.6156
1982-01-21 96.38 67.9098
# When was humidity the lowest (Oct of 2015)

HF_lowest <- Humidity_Final %>%
  select(Date, Humidity, Annual_Humidity) %>%
  arrange(Humidity) %>%
  na.omit(Humidity) %>%
  slice(1:12)

kable(HF_lowest, digits = 4, align = "ccccccc", col.names = c("Date", "Humidity", "Annual Humidity"), caption = "Lowest Humidity Percent In One Day Since 1981") %>%
  kable_styling(font_size = 16) %>%
  row_spec(c(1,1,1))
Table 3: Lowest Humidity Percent In One Day Since 1981
Date Humidity Annual Humidity
2015-10-14 20.00 71.3814
2011-09-07 21.25 56.4738
2000-09-04 21.81 63.0193
2000-09-18 22.25 63.0193
2011-09-09 22.38 56.4738
2011-09-08 22.56 56.4738
2015-10-13 23.00 71.3814
1999-08-17 23.06 61.3826
1984-05-10 23.31 62.4519
2014-05-04 23.38 63.9407
2011-09-13 23.56 56.4738
1999-08-20 23.62 61.3826
HF_stats <- Humidity_Final %>%
  select(Date, Humidity, Annual_Humidity) %>%
  mutate(Mean = mean(Humidity)) %>%
  mutate(Standard_Deviation = sd(Humidity)) %>%
  mutate(Median = median(Humidity)) %>%
  slice(1:12)

kable(HF_stats, digits = 4, align = "ccccccc", col.names = c("Date", "Humidity", "Annual Humidity", "Mean", "Standard Deviation", "Median"), caption = "Statistical Humidity Data") %>%
  kable_styling(font_size = 16) %>%
  row_spec(c(1,1,1,1,1,1))
Table 3: Statistical Humidity Data
Date Humidity Annual Humidity Mean Standard Deviation Median
1981-01-01 66.00 68.8339 66.8091 15.0092 67.88
1981-01-02 64.69 68.8339 66.8091 15.0092 67.88
1981-01-03 72.94 68.8339 66.8091 15.0092 67.88
1981-01-04 63.31 68.8339 66.8091 15.0092 67.88
1981-01-05 65.88 68.8339 66.8091 15.0092 67.88
1981-01-06 77.06 68.8339 66.8091 15.0092 67.88
1981-01-07 69.62 68.8339 66.8091 15.0092 67.88
1981-01-08 75.69 68.8339 66.8091 15.0092 67.88
1981-01-09 83.50 68.8339 66.8091 15.0092 67.88
1981-01-10 74.88 68.8339 66.8091 15.0092 67.88
1981-01-11 67.56 68.8339 66.8091 15.0092 67.88
1981-01-12 62.75 68.8339 66.8091 15.0092 67.88

Percent change

Both versions of the percent change function, the continuous one and the one relative to the first year, were applied to humidity. This completed the data analysis and allowed the visualizations to begin.

cor(Final_FW$Annual_Humidity,Final_FW$Annual_Precipitation)
[1] 0.7680025
HF <- Humidity_Final %>%
  select(YEAR, MO, Annual_Humidity, Humidity)

HF <- HF %>%
  distinct(YEAR, Annual_Humidity)
HF <- HF %>%
  slice(which(row_number() %% 3 == 1))
HYearOneprep <- HF[1,c("Annual_Humidity")]

HPct_change <- HF %>%
  dplyr::mutate(hPrevious = lag(Annual_Humidity),
                HNext = lead(Annual_Humidity),
                Hchange = Annual_Humidity - hPrevious,
                HPercent = (Hchange/hPrevious)* 100,
                HPercent_change = (Hchange/lag(Annual_Humidity) -1) * 100,
                Hum_Change_from_year_one = (Annual_Humidity/HYearOneprep - 1) * 100)


HPct_change <- HPct_change %>%
  select(YEAR, Annual_Humidity, HPercent, Hum_Change_from_year_one)

Visualizing Temperature

Visualizing temperature offered a lot of flexibility in terms of design. The design I favored was the density ridge plot. It shows the range of temperature for each month and so breaks down a full year in Texas. The colors are coordinated to match the colder or warmer temperatures throughout the year.

Temperature_Final %>%
ggplot(mapping = aes(x = Temperature, y = MO, group = MO, fill = after_stat(x))) +
# no alpha argument: gradient ridgelines do not support transparency, and alpha must lie in [0, 1]
geom_density_ridges_gradient(scale = 3, rel_min_height = 0.01) +
scale_fill_viridis(name = "Temp. [F]", option = "C") +
theme_fivethirtyeight(base_size = 10, base_family = 'serif') +
  theme(axis.title = element_text(family = 'serif', size = 15)) + ylab('Months') + xlab('Temperature [F]') +
  labs(title = "Mean Temperature Range for Each Month", caption = "")

The plots below show the percent change between consecutive sampled years and the percent change from 1981 to each later sampled year. The main takeaway is the difference between 1981 and 2022: an increase of 1.78%. This indicates some change in the climate; however, it is not a large change in the broader context of climate.

ggplot(data = Temp_change) +
  geom_col(aes(x = YEAR, y = Percent_temp, fill = Percent_temp > 0), alpha = .2) +
  geom_text(aes(x = YEAR, y = Percent_temp, label = paste0(round(Percent_temp,2), "%")),size = 3, vjust = -.5) +
   theme_fivethirtyeight(base_size = 10, base_family = 'serif') +
  theme(axis.title = element_text(family = 'serif', size = 15)) + ylab('Percent Changed') + xlab('Years') +
  labs(title = "Temperature Percent Change Over 40 Years [%]", caption = "3 Year time gaps")
ggplot(data = Temp_change) +
  geom_col(aes(x = YEAR, y = TChange_from_year_one, fill = TChange_from_year_one > 0), alpha = .2) +
  geom_text(aes(x = YEAR, y = TChange_from_year_one, label =  paste0(round(TChange_from_year_one,2), "%")),size = 3, vjust = -.5) +
  theme_fivethirtyeight(base_size = 10, base_family = 'serif') +
  theme(axis.title = element_text(family = 'serif', size = 15)) + ylab('Percent Changed') + xlab('Years') +
  labs(title = "Temperature Percent Change From 1981 [%]", caption = "3 Year time gaps")

Visualizing Precipitation

The percent change in precipitation from 1981 to each later sampled year is visualized below. Precipitation has been well below its 1981 level in every sampled year except 1990. As of the present day, a 13.48% decrease has occurred. Based on the mean monthly precipitation, a typical year totals roughly 35 inches, so a 13.48% decrease corresponds to about 4.7 inches less precipitation annually. A shift of this size could reduce vegetation or wildlife in the local environment. This finding was unexpected, as recent events within the state had suggested the opposite. The most recent decade appears to be moving back toward positive values; future analysis will be able to confirm whether precipitation is increasing.
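
The arithmetic behind the annual estimate above can be reproduced directly. The sketch below uses the unrounded monthly mean from the precipitation statistics table (2.9198 inches) and the 13.48% decrease quoted in the text; the variable names are illustrative. Small differences from figures quoted in the text come from rounding the monthly mean.

```r
# Values taken from the precipitation statistics table and the text above
mean_monthly_in <- 2.9198   # mean monthly precipitation, inches
pct_change      <- -13.48   # percent change relative to 1981

# Twelve average months approximate a typical annual total
typical_annual_in <- mean_monthly_in * 12

# Apply the percent change to express it in inches per year
change_in <- typical_annual_in * pct_change / 100

round(typical_annual_in, 2)  # about 35 inches per year
round(change_in, 2)          # about 4.7 inches less per year
```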

ggplot(data = Pct_change) +
  geom_col(aes(x = YEAR, y = Percent, fill = Percent > 0), alpha = .2) +
  theme_classic() +
  geom_text(aes(x = YEAR, y = Percent, label = paste0(round(Percent,2), "%")),size = 3, vjust = -.5) +
      theme_fivethirtyeight(base_size = 10, base_family = 'serif') +
  theme(axis.title = element_text(family = 'serif', size = 15)) + ylab('Percent Changed') + xlab('Years') +
  labs(title = "Precipitation Percent Change Over 40 Years [%]", caption = "3 Year time gaps")
ggplot(data = Pct_change) +
 geom_col(aes(x = YEAR, y = Change_from_year_one, fill = Change_from_year_one > 0), alpha = .2) +
  theme_classic() +
  geom_text(aes(x = YEAR, y = Change_from_year_one, label = paste0(round(Change_from_year_one,2), "%")),size = 3, vjust = -.5) +
      theme_fivethirtyeight(base_size = 10, base_family = 'serif') +
  theme(axis.title = element_text(family = 'serif', size = 15)) + ylab('Percent Changed') + xlab('Years') +
  labs(title = "Precipitation Percent Change From 1981 [%]", caption = "3 Year time gaps")

Visualizing Humidity

Humidity percent change relative to the first year has followed a pattern similar to precipitation, which is within expectation given the high correlation between the two variables. However, the present-day value shows only a slight decrease compared to 1981. Humidity appears to have increased over the last decade, suggesting that it may rise further within the next few years.

ggplot(data = Humidity_Final, mapping = aes(x = Humidity)) +
  # width = .5 keeps neighbouring bars from overlapping
  geom_bar(mapping = aes(fill = MonthAbb), width = .5) +
  theme_fivethirtyeight(base_size = 10, base_family = 'serif') +
  theme(axis.title = element_text(family = 'serif', size = 15)) + ylab('Occurrences') + xlab('Humidity [%]') +
  labs(title = "Humidity Percent Occurrence", caption = "")

The visualization of humidity by month shows that April has the most days at humidity levels up to about 90%. Beyond 90%, however, April appears less often than the winter months (December, January, and February). This matches the earlier observation that the highest humidity value occurred in December. It went against my initial expectations and shows that the month with the most frequent observations is not always the one that reaches the highest values. The general expectation is that the spring months bring the most humidity because of high levels of precipitation; this graph demonstrates that consistent results can sway an individual's perception of range. To draw these observations, the visualization was generated from daily humidity values, color-coded by month to make the information easier to take in. In early versions of the plot, the bars were squeezed together, which caused extreme overlap of colors and greatly reduced what could be read from the plot. This was overcome by applying “width = .5”, which leaves enough spacing to distinguish the bars while keeping the values in their original compact form.

ggplot(data = HPct_change) +
  geom_col(aes(x = YEAR, y = HPercent, fill = HPercent > 0), alpha = .2) +
  geom_text(aes(x = YEAR, y = HPercent, label = paste0(round(HPercent,2), "%")),size = 3, vjust = -.5) +
    theme_fivethirtyeight(base_size = 10, base_family = 'serif') +
  theme(axis.title = element_text(family = 'serif', size = 15)) + ylab('Percent Changed') + xlab('Years') +
  labs(title = "Humidity Percent Change Over 40 Years [%]", caption = "3 Year time gaps")
ggplot(data = HPct_change) +
  geom_col(aes(x = YEAR, y = Hum_Change_from_year_one, fill = Hum_Change_from_year_one > 0), alpha = .2) +
  geom_text(aes(x = YEAR, y = Hum_Change_from_year_one, label = paste0(round(Hum_Change_from_year_one,2), "%")),size = 3, vjust = -.5) +
  theme_fivethirtyeight(base_size = 10, base_family = 'serif') +
  theme(axis.title = element_text(family = 'serif', size = 15)) + ylab('Percent Changed') + xlab('Years') +
  labs(title = "Humidity Percent Change Since 1981 [%]", caption = "3 Year time gaps")

Reflection

Reflecting on this project, I absorbed a great deal of information during the analytical process. Being new to R and analytics, I spent a significant amount of time learning both the analytical process and the functions involved. The learning curve was steep, with unique problems that required out-of-the-box thinking. Learning through trial and error was extremely useful, as it forced me to be hands-on with the information and think through each situation. The general workflow was as follows: run the initial code, encounter an error, review the error and the code, apply a fix if one was apparent, and otherwise search for answers, apply those changes, and test. This taught me to think through a problem and identify the major changes required, a skill I expect to apply in my future career. For this project, I wanted to approach the question of climate change on a micro level within the Fort Worth area of Texas. The urge stemmed from recent abnormal local weather. To note a few examples, in 2015 a major drought plagued the local area, resulting in an arid environment. It was followed by a season of high precipitation that caused flooding and property damage. Additionally, the winter storm of 2021, which shut the majority of Texas down for 5 days, inspired me to investigate my own community more closely. This project was extremely helpful to my growth as an analyst, as it forced me to handle the entire analytical process while also allowing me to explore my interests.

My approach was simple: analyze the difference in three variables between two points in time. After importing the original data set, I noticed a problem with precipitation, since no more than 3 inches was showing annually. I imported more data to overcome this first problem and was swiftly met with another. The data were tucked away and spread out, which required a double pivot to reshape them. In retrospect, I would like to investigate other methods that may prove more efficient at eliminating such complications. The most difficult problem I encountered was that the precipitation percent function kept converting my percent change from year one to a data.table data type. I attempted to convert the data back into a num or int, but it would not accept that and only presented the data as a data.table data type. The function worked normally with the other variables despite them having the same data types, yet it would not accept the precipitation data frame. To overcome this issue, I ran the equation in a more manual way by dividing each year's value by the 1981 value (42.60591), subtracting 1, and multiplying by 100 to convert the result to a percentage. Though this corrected the issue, I have yet to determine what caused the problem with my data type.

Data analysis is ongoing for this data set. Future analysis would include the remaining variables, such as wind speed and surface pressure, to enable a deeper statistical evaluation. Additionally, I would like to import a dataset that dates back further to increase the accuracy of my information and conclusions. Together, these additions would permit me to create a continuously updating system that provides more accurate information with each passing day for Fort Worth, Texas.

Conclusion: Has the climate changed?

Based on the results of this analysis, there has not been a major shift in climate within Fort Worth, Texas. The reason for this claim is the lack of a sustained increase or decrease over time compared to 1981. There were years in which a variable shifted sharply; however, the shift was reversed by the following sampled year and can be attributed to expected variation within the Fort Worth subtropical climate. One interesting takeaway is that both the precipitation and humidity graphs indicate a trend of growth. Each of these variables was negative compared to 1981; however, both increased rapidly over the last decade, indicating a possible increase in precipitation and humidity. Temperature proved to be inconsistent and happened to be on the higher side during the current period; it is expected to decrease once more. Some questions remain unanswered. The first is the absence of wind speed and surface pressure, and what role those variables would play in determining the overall shift in climate. The second is more of a limiting factor: the short time span of the study. Roughly 40 years were covered; a longer historical record might have highlighted major shifts in the variables, and with them more trends or patterns. The third and final limitation is the narrow range of statistics used. Most statistics in this study were descriptive measures of central tendency; incorporating Bayesian statistics would have enabled a more precise analysis. These factors highlight the need for further research and analysis. However, with the current information and statistics based on this data set, we can say that there has been no major shift in climate within the last 40 years.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Campbell (2022, May 11). Data Analytics and Computational Social Science: Fort Worth Climate Change. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomethancampbell900466/

BibTeX citation

@misc{campbell2022fort,
  author = {Campbell, Ethan},
  title = {Data Analytics and Computational Social Science: Fort Worth Climate Change},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomethancampbell900466/},
  year = {2022}
}