601 Final Project
Introduction
Erratic weather may be the new normal for the children who grow up in the world today. In the wake of a worsening climate crisis, likely the most alarming and crucial issue of our generation, we can expect to experience different conditions than we are used to. Climate change has already contributed to an array of cascading effects, including more frequent and extreme storms, heat waves, dry spells, higher sea levels, and warming oceans (WWF). As time goes on, unless we collectively take serious systemic measures to slow the effects, it will continue to wreak havoc on animal habitats and our communities. In this project, I focus on extreme weather patterns, specifically erratic precipitation in the United States.
Climate change tends to produce temperature extremes, creating longer, more intense heat spells and more frigid winters. However, changes in precipitation are also common, including blizzards, destructive hurricanes, and droughts.
My first research goal is to determine which states in the US have experienced the greatest extremes in precipitation. I want to know which states have the highest precipitation, lowest precipitation, and greatest range between their highest and lowest measurements in 2021. My second research goal is to find out if the weather patterns for any of these erratic states have changed significantly in the last century. Overall, I find that Alaska, Washington, Oregon, and Louisiana had the greatest difference between their highest and lowest recorded precipitation measurements in 2021. I also uncover quantitative evidence to suggest that Louisiana in particular may have developed more extreme weather patterns during the years since 1895, but more research would be necessary to determine if this is a statistically significant trend.
Data
The dataset I will be using contains detailed climate data from the National Centers for Environmental Information. More specifically, it contains information on precipitation in the United States. It can be found at: https://www.ncei.noaa.gov/data/climdiv/ (data set ‘pcpncy’). This particular data set contains records of the total precipitation which fell in each state region over the course of every month from 1895 to 2022. It is updated monthly, and published online for public use.
Tidying the Data
Structuring the data set into a more useful, easily comprehensible form requires significant tidying. In the original data set, the row names are coded. Certain digits of the row names represent the state, metric (temperature or precipitation), and year which are associated with that particular row.
X01001011895 X7.03 X2.96 X8.36 X3.53 X3.96 X5.40 X3.92 X3.36 X0.73
1 1001011896 5.86 5.42 5.54 3.98 3.77 6.24 4.38 2.57 0.82
2 1001011897 3.27 6.63 10.94 4.35 0.81 1.57 3.96 5.02 0.87
3 1001011898 2.33 2.07 2.60 4.56 0.54 3.13 5.80 6.02 1.51
4 1001011899 5.80 6.94 3.35 2.22 2.93 2.31 6.80 2.90 0.63
5 1001011900 3.18 9.07 5.77 7.14 1.63 7.36 3.35 3.85 4.74
6 1001011901 5.20 4.39 6.35 4.61 5.44 2.24 2.79 5.58 3.75
X2.03 X1.44 X3.66
1 1.66 2.89 1.94
2 0.75 1.84 4.38
3 3.21 6.66 3.91
4 3.02 1.98 5.25
5 5.92 4.09 4.89
6 1.01 2.07 7.55
As depicted above, the data set is not immediately interpretable. According to the code book, the first one or two digits of each row name are the state ID. The next three digits represent the state division. The next two digits represent the metric (the code for temperature or precipitation), and the last four represent the year. In turn, each of the 12 columns represents a month and contains the relevant weather measurement for that row. The elements in the data frame are all of type double, because they refer to precipitation measurements (in inches).
Ultimately, I want the final form of the data set to contain 16 columns. The first 12 should remain the 12 months of the year, but the next columns should be state, division, metric, and year. This format would make the information about each row much more clear. Further analysis of the data set will not require extraction of subsets of the digits of the row names. For instance, if I want to analyze Maryland precipitation in April, I can refer to the categorical column ‘state’ and subset based on the Maryland state ID, as opposed to referring to the cumbersome coded row names.
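Because the state ID has either one or two digits, the coded names come in two lengths. One way to avoid treating the two lengths separately is to left-pad every code with zeros to 11 characters before slicing it. The following is only a sketch, assuming the layout described above (2-digit state, 3-digit division, 2-digit metric, 4-digit year). The helper name parse_code and the two sample codes are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: split a coded row name into its four parts.
# Assumed layout: state (2) + division (3) + metric (2) + year (4).
parse_code <- function(code) {
  code <- as.character(code)
  # Left-pad with zeros so every code is exactly 11 characters long.
  padded <- paste0(strrep("0", 11 - nchar(code)), code)
  data.frame(
    state    = as.numeric(substr(padded, 1, 2)),
    division = as.numeric(substr(padded, 3, 5)),
    metric   = as.numeric(substr(padded, 6, 7)),
    year     = as.numeric(substr(padded, 8, 11))
  )
}

# Illustrative codes only: one 10-digit and one 11-digit row name.
parsed <- parse_code(c("1001011895", "16001012021"))
print(parsed)
```

With the padding in place, the same substr() positions apply to every row, so no ifelse() on the name length is needed.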
Below I make some basic tidying adjustments, and then add four new recoded columns to the data set.
#Change the column names to represent the months of the year:
climate2 <- climate
colnames(climate2) = c("a", "January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December")
#The first record of the file was read in as the column header, so add it back
#as the first row of the data frame.
library(tidyverse)
climate3 <- climate2 %>% add_row("a" = 01001011895, "January" = 7.03, "February" = 2.96, "March" = 8.36, "April" = 3.53, "May" = 3.96, "June" = 5.40, "July" = 3.92, "August" = 3.36, "September" = 0.73, "October" = 2.03, "November" = 1.44, "December" = 3.66, .before = 1)
#Change the row names:
#(1) Establish the row names.
climate4 <- climate3
rownames(climate4) = climate4$a
#(2) Eliminate the now unnecessary column a.
climate5 <- climate4[,-(1),drop=FALSE]
#Create function that counts the number of digits in a number.
nDigits <- function(x) nchar(trunc(abs(x)))
#In the code below, I discovered that row names have either 10 or 11 digits.
#This is because the state ID's at the beginning have either 1 or 2 digits,
#depending on whether it is single- or double-digit.
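sum_10 = 0
sum_11 = 0
for(i in 1:nrow(climate5)){
n_digits = nDigits(as.numeric(rownames(climate5)[i]))
#Count the row names with exactly 10 digits (single-digit state ID).
if (n_digits == 10){
sum_10 = sum_10 + 1
}
#Count the row names with exactly 11 digits (double-digit state ID).
if (n_digits == 11){
sum_11 = sum_11 + 1
}
}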
#Confirm that all of the state ID's are either 1 or 2 digits. sum_10 represents
#number of single-digit state IDs; sum_11 is number of double-digit state ID's.
(sum_11 + sum_10) == nrow(climate5)
[1] TRUE
#Now we have confirmed that the state ID's either have 1 or 2 digits,
#and hence that the row names have either 10 or 11 digits total.
#We need to deal with both of these cases.
#Create 4 columns: state, division, metric, and year based on the codings in the row
#names, taking into account the cases when the row names are either 10 or 11
#digits.
#Contains the number of digits for each row name.
name_num = nDigits(as.numeric(rownames(climate5)))
climate_try <- climate5 %>%
mutate(state = ifelse(name_num == 11, as.numeric(substr(rownames(climate5), 1,2)), as.numeric(substr(rownames(climate5), 1,1))),
division = ifelse(name_num == 11, as.numeric(substr(rownames(climate5), 3,5)), as.numeric(substr(rownames(climate5), 2,4))),
metric = ifelse(name_num == 11, as.numeric(substr(rownames(climate5), 6,7)), as.numeric(substr(rownames(climate5), 5,6))),
year = ifelse(name_num == 11, as.numeric(substr(rownames(climate5), 8,11)), as.numeric(substr(rownames(climate5), 7,10))))
climate6 <- climate_try
#Add another column to explicitly describe what metric codes 1, 2, 27, and 28 refer to.
climate7 <- climate6 %>%
mutate(metric_name = case_when(
metric == 1 ~ "Precipitation",
metric == 2 ~ "Average Temperature",
metric == 27 ~ "Maximum Temperature",
metric == 28 ~ "Minimum Temperature"
))
#Rename row names to simply be ID numbers from 1 to 400,636.
climate8 <- climate7
rownames(climate8) = 1:nrow(climate8)
Now the data frame is tidied. The row names are simply ID numbers from 1 to 400,636, and I have added columns for state, division, metric, year, and metric name. The first 12 columns of each row contain the precipitation or temperature values from January to December for a specific state, division, and year. The metric column gives the numeric code for the metric reported in that row. Because that code is not self-explanatory, I include an additional column, metric_name, which tells us whether the metric represented in that row is precipitation, average temperature, maximum temperature, or minimum temperature. I choose to keep the columns state and division coded, which means that it will be useful to have the legend available during further analysis of the data set.
I make some final edits upon further exploration of the data. First, I observe that all the measurements in this particular data set were precipitation measurements, by finding that every categorical value in column 15 (‘metric’) was ‘precipitation’. For this reason, I eliminate the metric and metric_name columns, as they provide no extra information.
I also remove the rows for 2022. This is because upon exploration of the dataset, I noticed that these rows contained the value -9.99 for a lot of their entries. 2022 was the only year for which this happened. There is no explicit mention of this on the website where the data set originated, but my guess is that the data was collected mid-way through 2022, meaning that they left a lot of entries with negative default values.
I perform both of these below.
#Check to see how many non-precipitation measurements there are.
sum6 <- sum(((climate8[,15] != 1)))
#Remove the metric and metric_name columns.
climate9 <- subset(climate8, select = -c(metric, metric_name))
#Remove all rows which are associated with 2022.
climate_new <-subset(climate9, year !="2022")
climate <- climate_new
#Check to see if any negative values remain in the 12 month columns
#after the 2022 rows have been removed.
sum(climate[,1:12] < 0)
[1] 0
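An alternative to dropping a whole year is to recode the placeholder values as missing, so that any months that were recorded are kept. The sketch below assumes that -9.99 is a missing-value code, which is a guess rather than something confirmed by the data source, and it runs on a small made-up table (toy) rather than the real data.

```r
# Toy data standing in for the precipitation table (values are made up).
toy <- data.frame(
  January  = c(3.10, 2.45, 4.02),
  February = c(1.20, -9.99, -9.99),
  year     = c(2021, 2022, 2022)
)

# Assumption: -9.99 marks a month that has not been recorded yet.
month_cols <- c("January", "February")
toy[month_cols][toy[month_cols] == -9.99] <- NA

# Count how many missing monthly values each year has.
missing_by_year <- tapply(rowSums(is.na(toy[month_cols])), toy$year, sum)
print(missing_by_year)
```

Functions such as max() and mean() would then need na.rm = TRUE, which makes the handling of incomplete years explicit.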
Understanding the Data
The first research goal is to determine which state had the largest range of precipitation in 2021, meaning that the difference between the measurement for its highest-precipitation month and the measurement for its lowest-precipitation month was the greatest out of all the states.
To find this value, I took three main steps. First, I found the highest precipitation measurement in all of 2021 for each state. Then, I found the lowest measurement for each state. Finally, I was able to find the range of precipitation by calculating the difference between the highest and lowest precipitation measurements for each state. The state with the largest of these values had the largest range.
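The three steps can be sketched compactly in base R. This is only an illustration on a made-up table (toy) with two states, two divisions each, and three months; the column names mirror the tidied data frame, but the numbers are invented.

```r
# Toy 2021 data: two states, two divisions each, three months (made-up values).
toy <- data.frame(
  state    = c(1, 1, 2, 2),
  January  = c(3.0, 4.5, 0.0, 1.2),
  February = c(2.0, 6.1, 0.4, 0.9),
  March    = c(5.5, 1.0, 2.3, 0.1)
)
month_cols <- c("January", "February", "March")

# Steps 1 and 2: highest and lowest measurement in each row (division).
row_max <- apply(toy[month_cols], 1, max)
row_min <- apply(toy[month_cols], 1, min)

# Collapse the rows to one value per state.
state_max <- tapply(row_max, toy$state, max)
state_min <- tapply(row_min, toy$state, min)

# Step 3: the range is the difference between the two.
state_range <- state_max - state_min
print(state_range)
```

The state with the largest entry in state_range is the one with the widest gap between its wettest and driest measurements.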
Below, I find the highest precipitation measurements.
#FINDING MAXIMUM VALUES:
#In the loop below, we will establish the list 'highest'. This list contains 12
#vectors, which represent the 12 months in 2021. In each vector, there are 49 values
#for the 49 states. Each of the 49 values represents the highest precipitation
#measurement recorded in the state that month, out of all the divisions in the state.
#For instance, the first vector in 'highest' is January. The first value in January
#is associated with state #1 (Alabama) and has a measurement of 3.19. This means that
#the maximum precipitation value recorded in Alabama (state #1) during January 2021
#was 3.19.
climate_t <- climate
library(tidyverse)
#Initialize the list 'highest' with one empty slot per month.
highest <- vector("list", 12)
for(i in 1:12){
month_name <- colnames(climate_t)[i]
climate10 <- climate_t %>%
filter(year == 2021) %>%
group_by(state) %>%
#Keep the single row with the largest value for this month in each state.
slice_max(.data[[month_name]], n = 1, with_ties = FALSE) %>%
ungroup() %>%
select(state, all_of(month_name))
highest[[i]] <- climate10[[2]]
}
#Now that we have found the highest precipitation measurement for each month for
#every state, I want to find the maximum value of the whole year for every state.
#The vector max_val will contain 49 values, each of which contains the highest
#precipitation measurement recorded in one of the 49 states over the course of 2021.
max_val <- rep(0, 49)
for (i in 1:49){
#Take the largest of the 12 monthly maxima for state i.
max_val[i] = max(sapply(highest, function(month) month[i]))
}
The max_val vector contains the highest recorded precipitation measurement for each of the 49 states in the year 2021.
Now I repeat the process for the minimum precipitation measurements.
#FINDING MINIMUM VALUES:
#Below I complete the list 'lowest', which has the same structure as the list
#'highest' in the chunk of code above, except that it represents the minimum values
#rather than the maximum values.
climate_t <- climate
library(tidyverse)
#Initialize the list 'lowest' with one empty slot per month.
lowest <- vector("list", 12)
for(i in 1:12){
month_name <- colnames(climate_t)[i]
climate10 <- climate_t %>%
filter(year == 2021) %>%
group_by(state) %>%
#Keep the single row with the smallest value for this month in each state.
slice_min(.data[[month_name]], n = 1, with_ties = FALSE) %>%
ungroup() %>%
select(state, all_of(month_name))
lowest[[i]] <- climate10[[2]]
}
#Now I create the vector min_val, which is similar in structure to max_val in the
#code chunk above, except now it contains the minimum values.
min_val <- rep(0, 49)
for (i in 1:49){
#Take the smallest of the 12 monthly minima for state i.
min_val[i] = min(sapply(lowest, function(month) month[i]))
}
The min_val vector contains the lowest recorded precipitation measurement for each of the 49 states in the year 2021.
It is interesting to note a few features. For one, according to the data set, Alaska (state #49) has the highest precipitation measurement out of all the states in the US in 2021, at 28.78 inches. This is followed by Washington (27.09 in.), Oregon (20.89 in.), and Louisiana (20.16 in.).
#Highest precipitation measurement in 2021:
max(max_val)
[1] 28.78
#Position in max_val of the state with the highest precipitation measurement
#(the states are stored in order of state ID):
which.max(max_val)
[1] 49
#Second highest precipitation measurement - Washington:
sort(max_val,partial=48)[48]
[1] 27.09
#Third highest precipitation measurement - Oregon:
sort(max_val,partial=47)[47]
[1] 20.89
#Fourth highest precipitation measurement - Louisiana:
sort(max_val,partial=46)[46]
[1] 20.16
We also see that there are 10 states that have the lowest precipitation measurements in 2021, at 0 inches. It is possible for there to be no rain at all during some months. The 10 states that have precipitation measurements of 0 at some point in 2021 are Arizona, California, Colorado, Kansas, Nevada, New Mexico, Oklahoma, Oregon, Texas, and Washington.
#Lowest precipitation measurement in 2021:
min(min_val)
[1] 0
#Number of states which have a precipitation measurement of 0:
sum(min_val == 0)
[1] 10
#State ID's for states that had precipitation measurements of 0 in 2021
#which() returns the positions in min_val where the value is 0.
zero <- which(min_val == 0)
print("State ID's associated with precipitation measurements of 0 in 2021: ")
[1] "State ID's associated with precipitation measurements of 0 in 2021: "
print(zero)
[1] 2 4 5 14 26 29 34 35 41 45
I now specifically address the research question. I want to find the state that had the highest range in precipitation, meaning that their rainy season was the most drastically different from their dry season.
#Largest range in precipitation:
max(max_val-min_val)
[1] 28.63
#State associated with largest range in precipitation:
which.max(max_val-min_val)
[1] 49
range <- max_val - min_val
#Second highest range in precipitation :
sort(range, partial=48)[48]
[1] 27.09
#Third highest range in precipitation :
sort(range, partial=47)[47]
[1] 20.89
#Fourth highest range in precipitation :
sort(range, partial=46)[46]
[1] 19.79
In the code above, I found that state #49 (Alaska) had the largest range in precipitation, meaning that the difference between its highest and lowest recorded precipitation measurements was the greatest out of all the states during 2021, at 28.63 inches.
This was followed by Washington (27.09 in.), Oregon (20.89 in.), and Louisiana (19.79 in.).
All three of the states with the largest range in precipitation, especially Alaska and Oregon, have very high annual rates of snowfall. This could definitely be contributing to the annual precipitation measurements. Thus, to isolate rainfall as a variable of interest, I choose to focus on Louisiana, which has a warm climate and tends to receive less than an inch of snow every year (Snow Climatology).
As mentioned, Louisiana has the fourth largest range in precipitation. As it turns out, Louisiana has a significant rainy season. In fact, WorldAtlas.com explains that Louisiana tends to be the second rainiest state in the entire United States. Hawaii is the rainiest, but according to the code book, Hawaii is the only state which is not represented in the data set. Overall, these findings are encouraging because they suggest that my analysis is reasonable.
It is also interesting to obtain the mean of the June measurements across all the divisions in Louisiana during 2021. I compute this value below:
mean
1 4.829548
The mean precipitation value for Louisiana in June 2021 (over all the divisions) was 4.8295 inches. This suggests that precipitation varies widely across the state, since the highest measurement recorded in Louisiana in 2021 was 20.16 inches. It seems that there are high-lying outliers in Louisiana in June.
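The code that produced this mean is not shown in the output above. One plausible way to compute it, sketched here on a small made-up table (toy), is to filter to Louisiana and 2021 and then average the June column. The state ID 16 for Louisiana is taken from the plotting code later in this report; the toy values are invented.

```r
library(dplyr)

# Toy stand-in for the tidied data frame (values are made up).
toy <- data.frame(
  state = c(16, 16, 16, 1),
  year  = c(2021, 2021, 2020, 2021),
  June  = c(4.0, 6.0, 9.0, 2.0)
)

# Average the June measurements over all Louisiana divisions in 2021.
june_mean <- toy %>%
  filter(state == 16, year == 2021) %>%
  summarize(mean = mean(June))
print(june_mean)
```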
Visualizations
Based on the fact that Louisiana has the fourth highest range in precipitation over the course of 2021, I wanted to see if there are particular months during which the rain tends to be much more extreme. I plot the data for June and November in Louisiana, from 2000 to 2021, below.
Plot A. Comparison of Rainfall in Louisiana in June and November from 2000 to 2021
#Create a data frame of the November measurements in Louisiana (state 16) from 2000 to 2021.
nov_louis <- climate9 %>%
filter(state == 16, year %in% 2000:2021) %>%
select(November)
nov_louis$month = "November"
nov_louis$precipitation = nov_louis$November
nov_louis <- subset(nov_louis, select = c('month', 'precipitation'))
#Create a data frame of the June measurements in Louisiana (state 16) from 2000 to 2021.
june_louis <- climate9 %>%
filter(state == 16, year %in% 2000:2021) %>%
select(June)
june_louis$month = "June"
june_louis$precipitation = june_louis$June
june_louis <- subset(june_louis, select = c('month', 'precipitation'))
#Bind the November and June data frames together.
june_nov_df <- rbind(june_louis, nov_louis)
#Assign colors.
colors = c(June="pink", November="lightblue")
#Plot the data as 2 histograms laid over each other, with a legend.
plotA = ggplot(june_nov_df, aes(precipitation, fill=month)) +
theme_minimal() +
geom_bar(stat = "count") +
scale_fill_manual(values=colors) + ylim(0, 10) +
labs(title = "Louisiana Precipitation in June and November (2000-2021)", x = "Precipitation Measurements")
plotA
It does appear that overall there tended to be more Louisiana precipitation (including more extremely high precipitation measurements) in June than in November.
Plot A is the product of two simple histograms layered over one another. One histogram (in light blue) represents November precipitation measurements in Louisiana from 2000 to 2021. The other histogram (in pink) represents June precipitation measurements in Louisiana from 2000 to 2021.
I was curious to know whether there would be a visible difference in the amount of precipitation recorded in June versus in November. June is historically considered the rainiest month in Louisiana, especially in certain parts of the state (Weather & Climate). Overall, though there is not an extreme difference, the June values appear to have a wider range and a greater right skew, with far more values which lie above the mean and more large outliers. Previous exploration suggested that Louisiana was one of the states with the greatest variation in the amount of precipitation which fell during different months of the year. Thus, it makes sense to see a visible difference between the precipitation recorded during June and that recorded in November in Louisiana.
The other research question addresses whether Louisiana has changed its precipitation patterns in the years since 2000.
Plot B. Rainfall in Louisiana in June from 2000 to 2021
#Mean June precipitation over all Louisiana (state 16) divisions, for each year
#from 2000 to 2021. The years are selected by value, not by position.
june_louis <- climate9 %>%
filter(state == 16, year %in% 2000:2021) %>%
group_by(year) %>%
summarize(mean = mean(June))
june_louis_df <- data.frame(year = as.factor(june_louis$year), precipitation = june_louis$mean)
plotB <- ggplot(june_louis_df, aes(x= year, y = precipitation)) + geom_bar(stat = "identity") + labs(x= "Year", y = "Mean Precipitation value in June", title = "June Precipitation Measurements in Louisiana from 2000 to 2021") + theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = -1))
plotB
In this plot, I incorporate two variables: precipitation in Louisiana, and time. More specifically, in the plot above, the x-axis represents the year, while the y-axis represents June precipitation measurements collected in that year. The question I was curious to answer was whether or not there were significant changes in the June precipitation trends from 2000 to 2021. Based on a general analysis of the plot, it is possible that Louisiana has had less rain in recent years. However, more analysis would have to be conducted to verify if this was a significant difference.
I will also display June data in Louisiana from 1895 to 1916, and then 2000 to 2021, in the form of a facet wrap plot.
library(tidyverse)
#Mean June precipitation over all Louisiana (state 16) divisions for every year.
june_louis <- climate9 %>%
filter(state == 16) %>%
group_by(year) %>%
summarize(mean = mean(June))
#Select each era by year rather than by position in the vector.
era1 <- june_louis %>% filter(year %in% 1895:1916)
era2 <- june_louis %>% filter(year %in% 2000:2021)
june_louis_df1 <- data.frame(year = as.factor(era1$year), precipitation = era1$mean, category = "Era 1")
june_louis_df2 <- data.frame(year = as.factor(era2$year), precipitation = era2$mean, category = "Era 2")
june_louis_df <- rbind(june_louis_df1, june_louis_df2)
ggplot(data = june_louis_df, aes(year, precipitation)) + geom_bar(stat = "identity") + labs(x= "Year", y = "Mean Precipitation Value in June", title = "June Precipitation Measurements in Louisiana Across Eras") + theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = -1)) + facet_wrap(~ category, scales = "free_x")
Overall, the rainfall in the second period, from 2000 to 2021, seems to fluctuate over a wider range than that in the period from 1895 to 1916. In particular, the precipitation in Louisiana seems to be abnormally high from around 2002 to 2005.
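A visual impression that one era fluctuates more than another could be checked with a formal test. One candidate is an F test of equal variances, available in base R as var.test(). The sketch below runs on simulated numbers, not on the real Louisiana data, and the F test assumes the values in each era are roughly normally distributed, which would need to be checked before relying on it.

```r
# Simulated June means for two 22-year eras (NOT the real data).
set.seed(1)
era1 <- rnorm(22, mean = 5, sd = 1.5)
era2 <- rnorm(22, mean = 5, sd = 3.0)

# F test of equal variances; assumes roughly normal values in each era.
result <- var.test(era2, era1)
print(result$estimate)  # ratio of sample variances, era 2 over era 1
print(result$p.value)
```

A small p-value would indicate that the difference in spread between the two eras is unlikely to be due to chance alone, under the test's assumptions.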
Reflection
Weather is not an easy subject of analysis. Though it is relatively quantifiable, especially precipitation, it is highly variable and subject to many factors, many of which we are unlikely to assess, like the temperature of the upper layers of the atmosphere. However, data analytics makes it possible to synthesize very large amounts of information and come to useful conclusions.
One of the most challenging parts of this project was the data tidying process. At the beginning, it is very difficult to know what research questions will prove to be most relevant. As a result, deciding what structure the data frame should take is not always obvious. This data set in particular was far more complex and difficult to interpret than any other one I have had experience with. In most classes, for instance, the data set is already tidy, with columns that represent variables and rows which represent observations. In this data set, not only was this not the case, but it was also not very clear what should be considered variables. For instance, it would have been possible to consider ‘month’ as a single categorical variable, with 12 levels, or to consider the precipitation value for each month as its own observation.
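The alternative structure mentioned above, with month as a single categorical variable, can be sketched with pivot_longer() from tidyr. The table below (wide) is a made-up example with only two month columns, and the new column names month and precipitation are illustrative choices.

```r
library(tidyr)

# Toy wide table: one row per state and year, one column per month.
wide <- data.frame(
  state    = c(16, 16),
  year     = c(2020, 2021),
  January  = c(5.1, 6.3),
  February = c(4.2, 3.8)
)

# Long form: 'month' becomes one categorical variable, one row per measurement.
long <- pivot_longer(wide, cols = c(January, February),
                     names_to = "month", values_to = "precipitation")
print(long)
```

In the long form, grouped summaries such as the maximum or minimum per state no longer require looping over the 12 month columns.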
Even after recoding the variables based on the ID’s in the row names, I ran into challenges. While analyzing the data, for instance, I noticed that certain outputs did not make sense contextually. Upon further investigation, I found that this was because the data associated with 2022 contained negative values, and had to be eliminated. Overall, this and other difficulties taught me that it is very important to thoroughly familiarize yourself with the data. It is helpful to plot variables as part of a general exploratory analysis, to get a general sense of their distributions as well as any outliers.
One piece of advice I wish I had known while working through this project is that one should never assume, without checking, that the code is working properly, even if it is producing an output which looks reasonable. It is important to debug the code carefully, checking at multiple points within each code chunk to make sure that it is performing the task that you want it to, as well as ascertaining that none of the outputs look unusual. This is because several days after writing the code to determine which states had the greatest range in precipitation measurements, and analyzing my data based on its output, I debugged the code more thoroughly and realized I had made a mistake. Even though it was producing a reasonable output, it was not performing the function I wanted it to. After making some changes, I was much more confident in my numbers.
If I were to continue with my work, the natural trajectory would be to continue to uncover possible trends in erratic weather in certain states in the US. My project performed preliminary analysis on the trends over time in Louisiana, which I identified to be one of the states with the most dramatic highs and lows in precipitation in the country. If I were to continue with this project, I would isolate other states with large ranges in precipitation and perform more extensive analysis on the trends since 1895. It would be very informative to incorporate statistical analysis and comparisons between the amount of precipitation in past years and now.
Conclusion
My project was based on two main research questions. The first goal was to identify the states in the US with the most erratic weather. I identified Alaska, Washington, Oregon, and Louisiana as the states with the greatest range in precipitation in 2021, with ranges of 28.63, 27.09, 20.89, and 19.79 inches, respectively.
The second question was whether the weather patterns of states with erratic weather have changed over the past years. Through visual depictions of these changes, I noticed evidence which helps support the possibility that precipitation has become more variable in the years since 1895, particularly in the very rainy state of Louisiana.
Further research should focus on states with turbulent weather patterns, to uncover the presence of possible changes which may be associated with climate change.
Bibliography
Data set: “NOAA Monthly U.S. Climate Divisional Precipitation Database.” National Centers for Environmental Information, 31 Mar. 2022, https://www.ncei.noaa.gov/data/climdiv/.
“Climate and Average Weather Year Round in Alabama.” Weatherspark.com, https://weatherspark.com/y/20416/Average-Weather-in-Alabama-New-York-United-States-Year-Round#:~:text=The%20chance%20of%20wet%20days,least%200.04%20inches%20of%20precipitation.
Nag, Oishimaya Sen. “The 10 Wettest States in the United States of America.” WorldAtlas, WorldAtlas, 24 Apr. 2019, https://www.worldatlas.com/articles/the-10-wettest-states-in-the-united-states-of-america.html.
US Department of Commerce, NOAA. “NWS Lix - Snow Climatology.” National Weather Service, NOAA’s National Weather Service, 17 Nov. 2021, https://www.weather.gov/lix/snowcli.
Wickham, H., & Grolemund, G. (2016). R for data science: Visualize, model, transform, tidy, and import data. O'Reilly Media.
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Laidler (2022, May 19). Data Analytics and Computational Social Science: United States Trends in Turbulent Weather. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httprpubscomericalaidler901831/
BibTeX citation
@misc{laidler2022united, author = {Laidler, Erica}, title = {Data Analytics and Computational Social Science: United States Trends in Turbulent Weather}, url = {https://github.com/DACSS/dacss_course_website/posts/httprpubscomericalaidler901831/}, year = {2022} }