Fort Worth climate change based on precipitation, humidity, and temperature from 1981-2020
Importing and viewing the data to determine the major cleaning changes that need to be made. Here we notice that each month has its own column and that the PARAMETER column holds all of the variable names. I want to combine the months into one column and then spread the PARAMETER values into their own columns. Once that is complete, I can select the columns I want and remove the NA values. With this setup I will be able to run an analysis on the specific segment that I want.
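# Packages used throughout: tidyverse (dplyr, tidyr, ggplot2), ggridges, and viridis
library(tidyverse)
library(ggridges)
library(viridis)
Fort_Worth <- read.csv("Fort_Worth_climate.csv", skip = 18)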
Fort_Worth %>%
slice(1:12) %>%
knitr::kable(caption = "Original Table", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
PARAMETER | YEAR | JAN | FEB | MAR | APR | MAY | JUN | JUL | AUG | SEP | OCT | NOV | DEC | ANN |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
PS | 1981 | 99.17 | 99.00 | 98.53 | 98.64 | 98.18 | 98.29 | 98.49 | 98.45 | 98.65 | 98.69 | 98.76 | 98.81 | 98.64 |
PS | 1982 | 98.81 | 99.11 | 98.50 | 98.52 | 98.27 | 98.28 | 98.51 | 98.53 | 98.58 | 98.76 | 98.83 | 98.78 | 98.62 |
PS | 1983 | 98.91 | 98.51 | 98.06 | 98.08 | 98.23 | 98.26 | 98.63 | 98.59 | 98.65 | 98.80 | 98.37 | 99.20 | 98.53 |
PS | 1984 | 99.40 | 98.61 | 98.42 | 98.03 | 98.38 | 98.36 | 98.49 | 98.47 | 98.72 | 98.58 | 98.90 | 98.89 | 98.61 |
PS | 1985 | 99.24 | 99.08 | 98.57 | 98.47 | 98.26 | 98.42 | 98.49 | 98.42 | 98.60 | 98.66 | 98.61 | 99.26 | 98.67 |
PS | 1986 | 99.22 | 98.56 | 98.70 | 98.41 | 98.27 | 98.40 | 98.58 | 98.55 | 98.56 | 98.87 | 98.91 | 99.18 | 98.69 |
PS | 1987 | 98.84 | 98.61 | 98.53 | 98.61 | 98.38 | 98.49 | 98.54 | 98.45 | 98.56 | 99.00 | 98.90 | 98.79 | 98.64 |
PS | 1988 | 99.33 | 99.13 | 98.70 | 98.34 | 98.42 | 98.51 | 98.57 | 98.36 | 98.49 | 98.90 | 98.49 | 99.21 | 98.70 |
PS | 1989 | 99.06 | 99.39 | 98.64 | 98.53 | 98.30 | 98.40 | 98.60 | 98.46 | 98.69 | 98.81 | 98.75 | 99.24 | 98.73 |
PS | 1990 | 98.84 | 98.85 | 98.83 | 98.59 | 98.21 | 98.37 | 98.63 | 98.63 | 98.63 | 98.79 | 98.89 | 98.99 | 98.69 |
PS | 1991 | 99.10 | 99.03 | 98.23 | 98.19 | 98.32 | 98.42 | 98.56 | 98.64 | 98.85 | 98.61 | 99.07 | 99.16 | 98.68 |
PS | 1992 | 99.02 | 98.70 | 98.55 | 98.50 | 98.61 | 98.08 | 98.48 | 98.74 | 98.61 | 98.70 | 98.69 | 98.92 | 98.63 |
Month_combined <- Fort_Worth %>%
pivot_longer(
cols = c(NOV, JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, DEC),
names_to = "MONTH",
values_to = "Month_AVG",
)
Month_combined %>%
select("PARAMETER", "YEAR", "MONTH", "Month_AVG", "ANN") %>%
slice(1:12) %>%
knitr::kable(caption = "Month Combined", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
PARAMETER | YEAR | MONTH | Month_AVG | ANN |
---|---|---|---|---|
PS | 1981 | NOV | 98.76 | 98.64 |
PS | 1981 | JAN | 99.17 | 98.64 |
PS | 1981 | FEB | 99.00 | 98.64 |
PS | 1981 | MAR | 98.53 | 98.64 |
PS | 1981 | APR | 98.64 | 98.64 |
PS | 1981 | MAY | 98.18 | 98.64 |
PS | 1981 | JUN | 98.29 | 98.64 |
PS | 1981 | JUL | 98.49 | 98.64 |
PS | 1981 | AUG | 98.45 | 98.64 |
PS | 1981 | SEP | 98.65 | 98.64 |
PS | 1981 | OCT | 98.69 | 98.64 |
PS | 1981 | DEC | 98.81 | 98.64 |
Para_split <- Month_combined %>%
pivot_wider(names_from = PARAMETER,
values_from = Month_AVG,
)
Para_split %>%
slice(1:12) %>%
knitr::kable(caption = "PARAMETER split", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | ANN | MONTH | PS | TS | T2M | QV2M | RH2M | WD50M | WS10M | WS50M | PRECTOTCORR | PRECTOTCORR_SUM |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1981 | 98.64 | NOV | 98.76 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1981 | 98.64 | JAN | 99.17 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1981 | 98.64 | FEB | 99.00 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1981 | 98.64 | MAR | 98.53 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1981 | 98.64 | APR | 98.64 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1981 | 98.64 | MAY | 98.18 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1981 | 98.64 | JUN | 98.29 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1981 | 98.64 | JUL | 98.49 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1981 | 98.64 | AUG | 98.45 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1981 | 98.64 | SEP | 98.65 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1981 | 98.64 | OCT | 98.69 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
1981 | 98.64 | DEC | 98.81 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
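The NA-filled columns in the table above appear because ANN is not named in `values_from`, so `pivot_wider()` treats it as an identifying column and each parameter ends up on its own set of rows. The following is a minimal sketch on toy data, not the author's pipeline: `toy_long`, `staggered`, and `compact` are hypothetical names, and the T2M values are the 1981 temperatures from this post converted back to Celsius for illustration.

```r
library(dplyr)
library(tidyr)

# Toy long-format data: two parameters, one year, two months
toy_long <- tibble(
  PARAMETER = c("PS", "PS", "T2M", "T2M"),
  YEAR      = 1981,
  MONTH     = c("JAN", "FEB", "JAN", "FEB"),
  Month_AVG = c(99.17, 99.00, 5.66, 8.85),
  ANN       = c(98.64, 98.64, 18.03, 18.03)
)

# Keeping ANN: it acts as an id column, so PS and T2M land on separate rows
staggered <- toy_long %>%
  pivot_wider(names_from = PARAMETER, values_from = Month_AVG)

# Dropping ANN first: one row per YEAR/MONTH and no structural NAs
compact <- toy_long %>%
  select(-ANN) %>%
  pivot_wider(names_from = PARAMETER, values_from = Month_AVG)

nrow(staggered)
nrow(compact)
```

The staggered layout is still workable for this analysis, because each later step selects a single parameter and drops the NA rows, which leaves that parameter's own ANN value attached to it.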
# Rename the temperature column, convert from C to F, select the columns in the order I want, and remove the NA values.
Final_Temperature <- Para_split %>%
rename(Temperature = T2M) %>%
mutate(Temperature_F = Temperature * 9/5 + 32) %>%
mutate(Annual_Temperature = ANN * 9/5 + 32) %>%
select(YEAR, MONTH, Temperature_F, Annual_Temperature) %>%
drop_na(Temperature_F)
Final_Temperature %>%
slice(1:12) %>%
knitr::kable(caption = "Temperature", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Temperature_F | Annual_Temperature |
---|---|---|---|
1981 | NOV | 53.24 | 64.45 |
1981 | JAN | 42.19 | 64.45 |
1981 | FEB | 47.93 | 64.45 |
1981 | MAR | 55.02 | 64.45 |
1981 | APR | 69.01 | 64.45 |
1981 | MAY | 70.27 | 64.45 |
1981 | JUN | 79.36 | 64.45 |
1981 | JUL | 86.67 | 64.45 |
1981 | AUG | 85.33 | 64.45 |
1981 | SEP | 76.17 | 64.45 |
1981 | OCT | 64.42 | 64.45 |
1981 | DEC | 42.94 | 64.45 |
# When was Temperature the highest (Jul of 2011)
Final_Temperature %>%
select(YEAR, MONTH, Temperature_F, Annual_Temperature) %>%
arrange(desc(Temperature_F)) %>%
slice(1:12) %>%
knitr::kable(caption = "Highest Temperature", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Temperature_F | Annual_Temperature |
---|---|---|---|
2011 | JUL | 94.23 | 67.51 |
2011 | AUG | 94.17 | 67.51 |
1998 | JUL | 93.24 | 67.17 |
1999 | AUG | 92.17 | 66.83 |
2000 | AUG | 91.87 | 65.71 |
1985 | AUG | 90.95 | 64.00 |
2006 | AUG | 90.34 | 67.37 |
2018 | JUL | 90.21 | 65.03 |
2001 | JUL | 90.05 | 64.78 |
1988 | AUG | 89.56 | 64.27 |
1993 | AUG | 89.56 | 63.05 |
2010 | AUG | 89.51 | 64.40 |
# When was Temperature the lowest (Dec of 1983)
Final_Temperature %>%
select(YEAR, MONTH, Temperature_F, Annual_Temperature) %>%
arrange(Temperature_F) %>%
na.omit(Temperature_F) %>%
slice(1:12) %>%
knitr::kable(caption = "Lowest Temperature", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Temperature_F | Annual_Temperature |
---|---|---|---|
1983 | DEC | 33.49 | 62.73 |
1985 | JAN | 34.29 | 64.00 |
1989 | DEC | 35.56 | 62.24 |
2000 | DEC | 35.69 | 65.71 |
1984 | JAN | 36.12 | 64.80 |
1988 | JAN | 37.26 | 64.27 |
2007 | JAN | 38.26 | 63.75 |
2009 | DEC | 38.68 | 64.51 |
1991 | JAN | 38.75 | 63.90 |
2010 | FEB | 39.04 | 64.40 |
2011 | JAN | 39.15 | 67.51 |
2001 | JAN | 39.22 | 64.78 |
Final_Temperature %>%
select(YEAR, MONTH, Temperature_F, Annual_Temperature) %>%
filter(YEAR < 1982) %>%
mutate(Mean = mean(Temperature_F)) %>%
mutate(Standard_Deviation = sd(Temperature_F)) %>%
mutate(Median = median(Temperature_F)) %>%
slice(1:12) %>%
knitr::kable(caption = "Temperature stats data", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Temperature_F | Annual_Temperature | Mean | Standard_Deviation | Median |
---|---|---|---|---|---|---|
1981 | NOV | 53.24 | 64.45 | 64.38 | 15.92 | 66.71 |
1981 | JAN | 42.19 | 64.45 | 64.38 | 15.92 | 66.71 |
1981 | FEB | 47.93 | 64.45 | 64.38 | 15.92 | 66.71 |
1981 | MAR | 55.02 | 64.45 | 64.38 | 15.92 | 66.71 |
1981 | APR | 69.01 | 64.45 | 64.38 | 15.92 | 66.71 |
1981 | MAY | 70.27 | 64.45 | 64.38 | 15.92 | 66.71 |
1981 | JUN | 79.36 | 64.45 | 64.38 | 15.92 | 66.71 |
1981 | JUL | 86.67 | 64.45 | 64.38 | 15.92 | 66.71 |
1981 | AUG | 85.33 | 64.45 | 64.38 | 15.92 | 66.71 |
1981 | SEP | 76.17 | 64.45 | 64.38 | 15.92 | 66.71 |
1981 | OCT | 64.42 | 64.45 | 64.38 | 15.92 | 66.71 |
1981 | DEC | 42.94 | 64.45 | 64.38 | 15.92 | 66.71 |
Temperature_DEC <- Final_Temperature %>%
select(YEAR, MONTH, Temperature_F, Annual_Temperature) %>%
filter(MONTH == "DEC")
Temperature_July <- Final_Temperature %>%
select(YEAR, MONTH, Temperature_F, Annual_Temperature) %>%
filter(MONTH == "JUL")
Final_Precipitation <- Para_split %>%
rename(Precipitation = PRECTOTCORR_SUM) %>%
mutate(Precipitation_annual = ANN / 25.4) %>%
mutate(Precipitation_Monthly = Precipitation / 25.4) %>%
select(YEAR, MONTH, Precipitation_Monthly, Precipitation_annual) %>%
drop_na(Precipitation_Monthly)
# Join on both keys so each temperature row is matched to the same month's precipitation
Combined_data <- inner_join(Final_Temperature, Final_Precipitation, by = c("YEAR", "MONTH"))
Combined_data %>%
select(YEAR, MONTH, Temperature_F,Annual_Temperature, Precipitation_Monthly, Precipitation_annual) %>%
filter(Temperature_F < 39, Precipitation_Monthly > .1) %>%
slice(1:12) %>%
knitr::kable(caption = "Potential Months with Snow days", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Temperature_F | Annual_Temperature | Precipitation_Monthly | Precipitation_annual |
---|---|---|---|---|---|
1983 | DEC | 33.49 | 62.73 | 0.66 | 26.99 |
1984 | JAN | 36.12 | 64.80 | 0.90 | 32.34 |
1985 | JAN | 34.29 | 64.00 | 0.71 | 32.93 |
1988 | JAN | 37.26 | 64.27 | 0.51 | 25.78 |
1989 | DEC | 35.56 | 62.24 | 0.31 | 43.01 |
1991 | JAN | 38.75 | 63.90 | 2.88 | 47.55 |
2000 | DEC | 35.69 | 65.71 | 2.58 | 32.05 |
2007 | JAN | 38.26 | 63.75 | 2.81 | 44.77 |
2009 | DEC | 38.68 | 64.51 | 1.75 | 38.44 |
Final_Temperature %>%
ggplot(aes(x = Annual_Temperature)) +
geom_density(fill = "blue", # a constant fill is set outside aes()
show.legend = FALSE,
alpha = .5) +
labs(title = "Density plot",
x = "Annual temperature [F]",
y = "Probability")
Final_Temperature %>%
filter(Annual_Temperature > 50) %>%
ggplot(aes(x= YEAR,
y = Annual_Temperature,
size = Annual_Temperature,
color = YEAR)) +
geom_point() +
geom_smooth() +
labs(title = "Temperature change over 40 years",
x = "Year",
y = "Annual Temperature")
Temperature_DEC %>%
ggplot(aes(x= YEAR,
y = Temperature_F,
size = Temperature_F,
color = YEAR)) +
geom_point() +
geom_smooth() +
labs(title = "December Temperature change over 40 years",
x = "Year",
y = "December Temperature")
Temperature_July %>%
ggplot(aes(x= YEAR,
y = Temperature_F,
size = Temperature_F,
color = YEAR)) +
geom_point() +
geom_smooth() +
labs(title = "July Temperature change over 40 years",
x = "Year",
y = "July Temperature")
Final_Temperature %>%
drop_na(Temperature_F) %>%
filter(Temperature_F > 50) %>%
ggplot(aes(Temperature_F, fill = MONTH)) +
geom_density(alpha = 0.5) +
facet_wrap(~MONTH) +
labs(title = "Density plot of a temperature greater than 50",
subtitle = "December does not show due to not reaching 50",
x = "Temperature",
y = "Probability") +
theme(legend.position = "none")
Final_Temperature %>%
drop_na(Temperature_F) %>%
filter(Temperature_F < 49) %>%
ggplot(aes(Temperature_F, fill = MONTH)) +
geom_density(alpha = 0.5) +
facet_wrap(~MONTH) +
labs(title = "Density plot of a temperature less than 49",
subtitle = "Only these months are below 49",
x = "Temperature",
y = "Probability") +
theme(legend.position = "none")
Final_Temperature %>%
ggplot(mapping = aes(x = Temperature_F, y = MONTH, fill = after_stat(x))) +
geom_density_ridges_gradient(scale = 3, rel_min_height = 0.01,
alpha = 5) +
scale_fill_viridis(name = "Temp. [F]", option = "C") +
labs(title = 'Temperatures in Fort Worth') +
theme_bw() +
theme(legend.position="none",
panel.spacing = unit(0.1, "lines"),
strip.text.x = element_text(size = 8))
I will start with temperature, for which I need the columns YEAR, MONTH, ANN, and T2M. YEAR and MONTH show when each value occurred. T2M is the temperature at 2 meters above the surface, reported per month. ANN is the average for the year, which is useful for comparing one year with another. The temperature column has NA values and is in Celsius, so we need to rename the column, convert it to Fahrenheit, and remove the NA values.
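The Celsius-to-Fahrenheit step described here can be sketched with a small helper applied to both temperature columns at once. This is only an illustration under stated assumptions: `c_to_f`, `toy_temp`, and `toy_temp_f` are hypothetical names and the input values are made up.

```r
library(dplyr)

# Hypothetical helper: convert degrees Celsius to degrees Fahrenheit
c_to_f <- function(celsius) celsius * 9 / 5 + 32

# Made-up values, only to show the conversion
toy_temp <- tibble(
  YEAR  = 1981,
  MONTH = c("JAN", "JUL"),
  T2M   = c(0, 30),
  ANN   = 20
)

# Apply the same conversion to the monthly and the annual column
toy_temp_f <- toy_temp %>%
  mutate(across(c(T2M, ANN), c_to_f))

toy_temp_f
```

Keeping the formula in one function means the monthly and annual columns cannot drift apart if the conversion is ever edited.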
Next we check the annual column to confirm that it is the mean, using 1981. There is a slight difference between the annual value (64.45) and the mean of the monthly values (64.38); the annual value falls between the mean and the median. The SD is 15.92, which shows high variance in the values. This is a large spread; however, I believe it arises because we are comparing separate months whose temperatures rise and fall with the seasons. Naturally, the SD is wide to match the wide range of temperatures.
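As a cross-check on this comparison, the same statistics can be collapsed to one row per year with `summarise()` instead of being repeated on every row with `mutate()`. The sketch below retypes the twelve 1981 values from the "Temperature" table; `temps_1981` and `stats_1981` are hypothetical names.

```r
library(dplyr)

# 1981 monthly temperatures in F, copied from the "Temperature" table
temps_1981 <- tibble(
  YEAR = 1981,
  Temperature_F = c(53.24, 42.19, 47.93, 55.02, 69.01, 70.27,
                    79.36, 86.67, 85.33, 76.17, 64.42, 42.94)
)

# One summary row per year
stats_1981 <- temps_1981 %>%
  group_by(YEAR) %>%
  summarise(
    Mean = mean(Temperature_F),
    Median = median(Temperature_F),
    Standard_Deviation = sd(Temperature_F)
  )

stats_1981
```

Because the inputs are already rounded to two decimals, the results should land close to, but not necessarily exactly on, the Mean, Median, and Standard_Deviation columns shown in the stats table.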
# Rename the columns and remove the NA values
Final_Humidity <- Para_split %>%
rename(Humidity = RH2M) %>%
rename(Annual_Humidity_percent = ANN) %>%
select(YEAR, MONTH, Humidity, Annual_Humidity_percent) %>%
drop_na(Humidity)
Final_Humidity %>%
slice(1:12) %>%
knitr::kable(caption = "Humidity", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Humidity | Annual_Humidity_percent |
---|---|---|---|
1981 | NOV | 80.19 | 68.81 |
1981 | JAN | 69.38 | 68.81 |
1981 | FEB | 67.19 | 68.81 |
1981 | MAR | 67.38 | 68.81 |
1981 | APR | 66.19 | 68.81 |
1981 | MAY | 71.12 | 68.81 |
1981 | JUN | 75.50 | 68.81 |
1981 | JUL | 57.81 | 68.81 |
1981 | AUG | 50.06 | 68.81 |
1981 | SEP | 63.50 | 68.81 |
1981 | OCT | 79.62 | 68.81 |
1981 | DEC | 78.31 | 68.81 |
# When was humidity the highest (Jan of 1998)
Final_Humidity %>%
select(YEAR, MONTH, Humidity, Annual_Humidity_percent) %>%
arrange(desc(Humidity)) %>%
slice(1:12) %>%
knitr::kable(caption = "Highest Humidity", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Humidity | Annual_Humidity_percent |
---|---|---|---|
1998 | JAN | 86.56 | 66.31 |
1991 | DEC | 86.50 | 71.12 |
1994 | DEC | 85.69 | 70.50 |
1984 | DEC | 85.62 | 62.44 |
2015 | MAY | 84.88 | 71.38 |
1992 | JAN | 84.75 | 72.75 |
2018 | OCT | 84.69 | 68.62 |
1986 | DEC | 83.62 | 69.62 |
1993 | JAN | 83.56 | 69.56 |
2001 | JAN | 83.38 | 69.50 |
2001 | FEB | 83.38 | 69.50 |
1992 | FEB | 83.25 | 72.75 |
# When was humidity the lowest (Aug of 2000)
Final_Humidity %>%
select(YEAR, MONTH, Humidity, Annual_Humidity_percent) %>%
arrange(Humidity) %>%
na.omit(Humidity) %>%
slice(1:12) %>%
knitr::kable(caption = "Lowest Humidity", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Humidity | Annual_Humidity_percent |
---|---|---|---|
2000 | AUG | 34.88 | 63.00 |
1999 | AUG | 35.81 | 61.38 |
2011 | JUL | 36.19 | 56.44 |
2011 | AUG | 37.31 | 56.44 |
1985 | AUG | 37.62 | 68.12 |
2000 | SEP | 40.44 | 63.00 |
2011 | SEP | 41.06 | 56.44 |
1998 | JUL | 41.25 | 66.31 |
1984 | JUL | 41.31 | 62.44 |
1988 | AUG | 44.06 | 61.31 |
2011 | JUN | 44.81 | 56.44 |
1993 | AUG | 45.00 | 69.56 |
Final_Humidity %>%
select(YEAR, MONTH, Humidity, Annual_Humidity_percent) %>%
filter(YEAR == 1982) %>%
mutate(Mean = mean(Humidity)) %>%
mutate(Standard_Deviation = sd(Humidity)) %>%
mutate(Median = median(Humidity)) %>%
slice(1:12) %>%
knitr::kable(caption = "Humidity stats data", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Humidity | Annual_Humidity_percent | Mean | Standard_Deviation | Median |
---|---|---|---|---|---|---|
1982 | NOV | 70.38 | 67.88 | 67.97 | 10.97 | 70.28 |
1982 | JAN | 70.19 | 67.88 | 67.97 | 10.97 | 70.28 |
1982 | FEB | 78.69 | 67.88 | 67.97 | 10.97 | 70.28 |
1982 | MAR | 73.94 | 67.88 | 67.97 | 10.97 | 70.28 |
1982 | APR | 66.81 | 67.88 | 67.97 | 10.97 | 70.28 |
1982 | MAY | 79.69 | 67.88 | 67.97 | 10.97 | 70.28 |
1982 | JUN | 77.81 | 67.88 | 67.97 | 10.97 | 70.28 |
1982 | JUL | 65.75 | 67.88 | 67.97 | 10.97 | 70.28 |
1982 | AUG | 49.00 | 67.88 | 67.97 | 10.97 | 70.28 |
1982 | SEP | 48.31 | 67.88 | 67.97 | 10.97 | 70.28 |
1982 | OCT | 58.19 | 67.88 | 67.97 | 10.97 | 70.28 |
1982 | DEC | 76.88 | 67.88 | 67.97 | 10.97 | 70.28 |
# change in humidity over years by each month
ggplot(data = Final_Humidity, mapping = aes(x = YEAR, y = Humidity)) +
geom_point() +
geom_smooth(mapping = aes(color = MONTH), se = FALSE)
# Facet wrap of the previous graph to separate them
ggplot(data = Final_Humidity, mapping = aes(x = YEAR, y = Humidity)) +
geom_point() +
geom_smooth(mapping = aes(color = MONTH), se = FALSE) +
facet_wrap(~ MONTH, nrow = 5)
Final_Humidity %>%
filter(Annual_Humidity_percent > 60) %>%
ggplot(aes(x= YEAR,
y = Annual_Humidity_percent,
size = Annual_Humidity_percent,
color = YEAR)) +
geom_point() +
geom_smooth() +
labs(title = "Humidity change over 40 years",
x = "Year",
y = "Annual Humidity")
Here we do the same thing that we did in the temperature section. We renamed RH2M to Humidity and ANN to Annual_Humidity_percent. Next we removed the NA values to focus on the information we needed, and then graphed it to start drawing conclusions.
Once more we test the mean, median, and SD of one year (1982). Again there is a slight difference between the mean and the annual value; however, the difference is about 0.09, which is not major. The standard deviation is almost 11, which could again be explained by the rainy seasons and the increase in humidity that follows them. Here, however, the spread is much smaller than it was for temperature.
# Rename the precipitation column and convert it from mm to inches.
Final_Precipitation <- Para_split %>%
rename(Precipitation = PRECTOTCORR_SUM) %>%
mutate(Precipitation_annual = ANN / 25.4) %>%
mutate(Precipitation_Monthly = Precipitation / 25.4) %>%
select(YEAR, MONTH, Precipitation_Monthly, Precipitation_annual) %>%
drop_na(Precipitation_Monthly)
Final_Precipitation %>%
slice(1:12) %>%
knitr::kable(caption = "Precipitation (Inches)", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Precipitation_Monthly | Precipitation_annual |
---|---|---|---|
1981 | NOV | 1.53 | 42.61 |
1981 | JAN | 0.39 | 42.61 |
1981 | FEB | 1.82 | 42.61 |
1981 | MAR | 3.33 | 42.61 |
1981 | APR | 3.04 | 42.61 |
1981 | MAY | 6.08 | 42.61 |
1981 | JUN | 4.05 | 42.61 |
1981 | JUL | 1.42 | 42.61 |
1981 | AUG | 2.12 | 42.61 |
1981 | SEP | 2.98 | 42.61 |
1981 | OCT | 15.63 | 42.61 |
1981 | DEC | 0.21 | 42.61 |
# When was precipitation the highest (OCT of 1981)
Final_Precipitation %>%
select(YEAR, MONTH, Precipitation_Monthly, Precipitation_annual) %>%
arrange(desc(Precipitation_Monthly)) %>%
na.omit(Precipitation_Monthly) %>%
slice(1:12) %>%
knitr::kable(caption = "Highest Precipitation (Inches)", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Precipitation_Monthly | Precipitation_annual |
---|---|---|---|
1981 | OCT | 15.63 | 42.61 |
2015 | MAY | 15.24 | 57.96 |
1982 | MAY | 11.28 | 38.48 |
1989 | MAY | 10.61 | 43.01 |
2018 | OCT | 10.43 | 40.07 |
2004 | JUN | 10.23 | 45.36 |
2007 | JUN | 10.17 | 44.77 |
1989 | JUN | 9.64 | 43.01 |
1990 | APR | 9.36 | 46.70 |
1991 | DEC | 8.75 | 47.55 |
1991 | OCT | 8.74 | 47.55 |
2009 | OCT | 8.69 | 38.44 |
# When was it the lowest (Jan of 1986)
Final_Precipitation %>%
select(YEAR, MONTH, Precipitation_Monthly, Precipitation_annual) %>%
arrange(Precipitation_Monthly) %>%
na.omit(Precipitation_Monthly) %>%
slice(1:12) %>%
knitr::kable(caption = "Lowest Precipitation (Inches)", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Precipitation_Monthly | Precipitation_annual |
---|---|---|---|
1986 | JAN | 0.01 | 39.02 |
2011 | JUL | 0.01 | 22.53 |
2000 | AUG | 0.02 | 32.05 |
2011 | MAR | 0.08 | 22.53 |
1993 | JUL | 0.08 | 36.92 |
2012 | NOV | 0.12 | 28.73 |
2018 | JAN | 0.17 | 40.07 |
2014 | JAN | 0.18 | 23.94 |
2005 | NOV | 0.19 | 18.28 |
1981 | DEC | 0.21 | 42.61 |
2005 | DEC | 0.21 | 18.28 |
1996 | FEB | 0.21 | 33.81 |
Stats_precipitation <- Final_Precipitation %>%
select(YEAR, MONTH, Precipitation_Monthly, Precipitation_annual) %>%
filter(YEAR < 1982) %>%
mutate(Mean = mean(Precipitation_Monthly)) %>%
mutate(Standard_Deviation = sd(Precipitation_Monthly)) %>%
mutate(Median = median(Precipitation_Monthly))
Stats_precipitation %>%
slice(1:12) %>%
knitr::kable(caption = "Precipitation stats data (Inches)", digits = 2) %>%
kableExtra::kable_styling(bootstrap_options = "striped", full_width = TRUE)
YEAR | MONTH | Precipitation_Monthly | Precipitation_annual | Mean | Standard_Deviation | Median |
---|---|---|---|---|---|---|
1981 | NOV | 1.53 | 42.61 | 3.55 | 4.13 | 2.55 |
1981 | JAN | 0.39 | 42.61 | 3.55 | 4.13 | 2.55 |
1981 | FEB | 1.82 | 42.61 | 3.55 | 4.13 | 2.55 |
1981 | MAR | 3.33 | 42.61 | 3.55 | 4.13 | 2.55 |
1981 | APR | 3.04 | 42.61 | 3.55 | 4.13 | 2.55 |
1981 | MAY | 6.08 | 42.61 | 3.55 | 4.13 | 2.55 |
1981 | JUN | 4.05 | 42.61 | 3.55 | 4.13 | 2.55 |
1981 | JUL | 1.42 | 42.61 | 3.55 | 4.13 | 2.55 |
1981 | AUG | 2.12 | 42.61 | 3.55 | 4.13 | 2.55 |
1981 | SEP | 2.98 | 42.61 | 3.55 | 4.13 | 2.55 |
1981 | OCT | 15.63 | 42.61 | 3.55 | 4.13 | 2.55 |
1981 | DEC | 0.21 | 42.61 | 3.55 | 4.13 | 2.55 |
Final_Precipitation %>%
filter(Precipitation_annual > 20) %>%
ggplot(aes(x= YEAR,
y = Precipitation_annual,
size = Precipitation_annual,
color = YEAR)) +
geom_point() +
geom_smooth() +
labs(title = "Precipitation change over 40 years",
x = "Year",
y = "Annual Precipitation")
ggplot(data = Final_Precipitation, mapping = aes(x = YEAR, y = Precipitation_Monthly)) +
geom_smooth(mapping = aes(color = MONTH), se = FALSE)
# Facet wrap by month of the precipitation changes over the years
ggplot(data = Final_Precipitation, mapping = aes(x = YEAR, y = Precipitation_Monthly)) +
geom_point() +
geom_smooth(mapping = aes(color = MONTH), se = FALSE) +
facet_wrap(~ MONTH, nrow = 5) +
labs(title = "Each month's change in precipitation over 40 years", x = "Year", y = "Precipitation per month in Inches")
Once more we followed a similar process of isolating the information by renaming and then selecting the columns that are important. However, this time we needed to change the precipitation from mm to inches, which required dividing both the ANN and the monthly columns by 25.4. During this analysis we aimed to find when precipitation was highest and lowest, and learned that the two were within 5 years of each other.
Here we do not have a mean to compare against, since the annual value is the sum of the monthly values and shows the yearly rainfall. However, dividing that annual total by 12 gives the same value as the monthly mean, which means the number is accurate for annual rainfall. We also notice a much lower SD, as rain seems to be a bit more consistent than the other two variables and it is much harder for it to have a wide range. Texas already does not receive a large amount of rain, and the range from 0 to the amount of rain we do get is not very wide, resulting in this lower SD.
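The arithmetic behind this check, and a way to compare spread across variables measured in different units, can be sketched as follows. The numbers are copied from the three stats tables (1981 for temperature and precipitation, 1982 for humidity). The coefficient of variation (SD divided by mean) is a technique swapped in here for illustration and is not part of the original analysis; `spread` and `annual_over_12` are hypothetical names.

```r
# Annual total divided by 12 should match the monthly mean for 1981
annual_over_12 <- 42.61 / 12
annual_over_12

# Mean and SD copied from the stats tables
spread <- data.frame(
  variable = c("Temperature_F", "Humidity", "Precipitation_Monthly"),
  mean = c(64.38, 67.97, 3.55),
  sd = c(15.92, 10.97, 4.13)
)

# Coefficient of variation: SD expressed relative to the mean
spread$cv <- spread$sd / spread$mean
spread
```

In absolute terms the precipitation SD is the smallest of the three, but relative to its own mean it is the largest, so the two views give different impressions of consistency. The temperature ratio is only a rough guide, since the zero point of the Fahrenheit scale is arbitrary.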
Does precipitation increase or decrease over the years, and when was it the highest and lowest?
Is there a change in humidity, and does it correlate with the change in precipitation, wind, and temperature?
Does the temperature increase over the years or is it decreasing, and does it correlate with the other variables?
When considering all variables, is there a noticeable change in the climate?
What is missing (if anything) in your analysis process so far?
What conclusions can you make about your research questions at this point?
What do you think a naive reader would need to fully understand your graphs?
Is there anything you want to answer with your dataset, but can’t?
Precipitation has changed over the last 40 years; however, it dropped and then returned to the original amount. The highest was in OCT of 1981 and the lowest was in JAN of 1986.
Humidity has changed in the last 40 years and has followed a curve very similar to that of precipitation. It started at roughly 67%, dipped toward the lower 60s, and has now climbed to 70%. There is a correlation between precipitation and humidity; however, the temperature has increased since the turn of the 21st century.
There is an increase in temperature over the past 40 years. We notice an increase of about 1.5 degrees and a major spike in temperature between 1990 and 2000.
When considering all variables, we notice a very slight change in the climate. The temperature and humidity in the area have increased; however, the precipitation decreased for 10 years and then returned to its starting value, which lined up with the changes in humidity. Humidity, however, has surpassed its starting point, reaching a new high for the past 40 years.
What I find slightly missing is more statistical analysis, although I have already incorporated the mean, median, and standard deviation. I will need to research how statistical analysis is performed on weather data, and possibly look into weather prediction to see whether the trend will continue or change.
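One possible starting point for that extra statistical work is a simple linear trend fitted with `lm()`. The sketch below runs on synthetic data only (a built-in trend of 0.04 F per year plus a small wiggle), not on the NASA POWER values, so its output says nothing about Fort Worth. `toy_annual`, `trend_fit`, and `slope_per_year` are hypothetical names.

```r
# Synthetic annual temperatures, NOT the real data:
# a 0.04 F/year trend plus a small deterministic wiggle
toy_annual <- data.frame(YEAR = 1981:2020)
toy_annual$Annual_Temperature <-
  64 + 0.04 * (toy_annual$YEAR - 1981) + 0.5 * sin(toy_annual$YEAR)

# Ordinary least squares fit of temperature on year
trend_fit <- lm(Annual_Temperature ~ YEAR, data = toy_annual)

# Estimated change per year, and over the full 1981-2020 span
slope_per_year <- unname(coef(trend_fit)["YEAR"])
slope_per_year
slope_per_year * 39

# Standard errors and p-values for the fitted coefficients
summary(trend_fit)$coefficients
```

With the real data, it would make sense to reduce the table to one row per year first, for example with `distinct(YEAR, Annual_Temperature)`, so that each year is not counted twelve times.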
By this point in my research I am able to see the changes in temperature, humidity, and precipitation over 40 years. There are definitely some changes, especially in the temperature data: over the 40 years we see an increase of 1.5 degrees [F]. Humidity and precipitation have followed similar curves, as they correlate with each other, although the humidity is slightly different. Both started at a high point, dropped during the mid-90s, and continued downward in the early 21st century. From 2010 on, however, we notice a major uptick in both humidity and precipitation, and the two upticks occur at the same time.
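The relationship between humidity and precipitation described here could be quantified with `cor()`. The sketch below uses only the twelve years whose annual values happen to appear in both the humidity and the precipitation tables of this post. That subset comes from the highest and lowest tables and is not representative, so the resulting coefficient is illustrative only. `annual_subset` and `r_pearson` are hypothetical names.

```r
# Annual values copied from the tables in this post (non-representative subset)
annual_subset <- data.frame(
  YEAR = c(1981, 1982, 1984, 1985, 1986, 1988,
           1991, 1993, 2000, 2011, 2015, 2018),
  Annual_Humidity_percent = c(68.81, 67.88, 62.44, 68.12, 69.62, 61.31,
                              71.12, 69.56, 63.00, 56.44, 71.38, 68.62),
  Precipitation_annual = c(42.61, 38.48, 32.34, 32.93, 39.02, 25.78,
                           47.55, 36.92, 32.05, 22.53, 57.96, 40.07)
)

# Pearson correlation between annual humidity and annual precipitation
r_pearson <- cor(annual_subset$Annual_Humidity_percent,
                 annual_subset$Precipitation_annual)
r_pearson
```

Running the same call on all 40 years would give a fairer picture than this hand-picked subset.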
For a naive reader to understand my graphs, they would need to know which unit of measurement is being used, whether that is inches for precipitation, percent for humidity, or [F] for temperature. The graphs could use more color contrast so they can be understood more quickly. I could also try to break down certain weather conditions based on the monthly data to sharpen the picture. For example, when there is rain but the temperature is below 32 degrees, it could be labeled as snow or sleet.
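The labeling idea in the last example could be sketched with `case_when()`. The thresholds below reuse the ones from the "Potential Months with Snow days" filter (below 39 F and more than 0.1 inches) rather than 32 F, because monthly averages rarely fall below freezing in this data. The category names, `toy_months`, and `labelled` are hypothetical, and a monthly average cannot confirm what happened on any single day.

```r
library(dplyr)

# Three months copied from the tables in this post
toy_months <- tibble(
  YEAR = c(1983, 1981, 2011),
  MONTH = c("DEC", "JUL", "JUL"),
  Temperature_F = c(33.49, 86.67, 94.23),
  Precipitation_Monthly = c(0.66, 1.42, 0.01)
)

# Conditions are checked in order; the first match wins
labelled <- toy_months %>%
  mutate(Condition = case_when(
    Precipitation_Monthly <= 0.1 ~ "dry",
    Temperature_F < 39 ~ "possible snow or sleet",
    TRUE ~ "rain"
  ))

labelled
```

The new Condition column could then be mapped to `color` or `fill` in the existing plots.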
Currently, I would love to be able to show what kind of weather it is by combining all the information together. I would love to break things down into subsections like fog, dew, rain, etc. However, to do so I need to combine all the variables and then find the correct way to calculate those categories, and I am currently unsure how to approach this process.
(“These data were obtained from the NASA Langley Research Center (LaRC) POWER Project funded through the NASA Earth Science/Applied Science Program.”)
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Campbell (2022, March 23). Data Analytics and Computational Social Science: Ethan Campbell HW5. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomethancampbell878504/
BibTeX citation
@misc{campbell2022ethan, author = {Campbell, Ethan}, title = {Data Analytics and Computational Social Science: Ethan Campbell HW5}, url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomethancampbell878504/}, year = {2022} }