Homework 3: Diving into Data on Atlantic Hurricanes from 1920-2020

hw3
atlantic_hurricanes
hunter_major
Homework 3 Assignment on Atlantic Hurricanes Data
Author

Hunter Major

Published

June 29, 2023

Code
# loading packages into R Studio


library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.2     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Code
library(dplyr)
library(readr)
library(readxl)
library(tidyr)
library(googlesheets4)
library(stringr)
library(lubridate)
library(ggplot2)


knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Homework 3 Overview

For this homework, your goal is to read in a more complicated dataset. Please use the category tag “hw2” as well as a tag for the dataset you choose to use.

Read in a dataset. It’s strongly recommended that you choose a dataset you’re considering using for the final project. If you decide to use one of the datasets we have provided, please use a challenging dataset - check with us if you are not sure.

Clean the data as needed using dplyr and related tidyverse packages. Provide a narrative about the data set (look it up if you aren’t sure what you have got) and the variables in your dataset, including what type of data each variable is. The goal of this step is to communicate in a visually appealing way to non-experts - not to replicate r-code. Identify potential research questions that your dataset can help answer.

Code
# reading in the dataset

atlantic_hurricanes <- 
read_csv("_mysampledatasets/atlantic_hurricanes.csv")


atlantic_hurricanes

Narrative About The Data

Code
summary(atlantic_hurricanes)
      ...1           Name             Duration          Wind speed       
 Min.   :  0.0   Length:458         Length:458         Length:458        
 1st Qu.:114.2   Class :character   Class :character   Class :character  
 Median :228.5   Mode  :character   Mode  :character   Mode  :character  
 Mean   :228.5                                                           
 3rd Qu.:342.8                                                           
 Max.   :457.0                                                           
   Pressure         Areas affected        Deaths             Damage         
 Length:458         Length:458         Length:458         Length:458        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
     REf               Category    
 Length:458         Min.   :1.000  
 Class :character   1st Qu.:1.000  
 Mode  :character   Median :2.000  
                    Mean   :2.037  
                    3rd Qu.:3.000  
                    Max.   :5.000  
Code
dim(atlantic_hurricanes)
[1] 458  10

This dataset provides insights into Atlantic hurricanes, hurricanes that developed in the Atlantic Ocean area, across a 100 year time period, from 1920 to 2020. This dataset lists hurricanes that fall under the Category 1, Category 2, Category 3, and Category 5 classifications; therefore, it doesn’t include Category 4 hurricanes nor does it include (storms that didn’t develop beyond) tropical storms and tropical depressions. 458 hurricanes or observations/rows are included in this dataset. There are 10 variables. Variables in the original version of the data are:

  • …1 or X (or the list number/ID of the hurricane as entered into the dataset, mostly used for organizational, data entry purposes)

  • the name of the hurricane (character value),

  • the duration of the hurricane/the dates that it occurred (numeric/character value?),

  • the wind speed of the hurricane (in miles per hour and kilometers per hour) (numeric value),

  • the pressure of the hurricane (in atmospheric pressure-hPa and in inch of mercury-inHg) (numeric value),

  • the number of deaths caused by the hurricane (numeric value),

  • the amount of damage in US dollars caused by the hurricane (numeric value),

  • the category of the hurricane (Cat 1, 2, 3, or 5) (number representing a character value?),

  • the numerically assigned references/footnotes that provide further information about the hurricane (number representing a character value?.

**Using the summary () function, most of the values for each variable are listed as character values because they do include words, abbreviations, units of measurement, other symbols/punctuation in them currently, in/after the process of tidying the dataset, I hope to make sure that all of the variables that I’ve listed as having numeric values in paratheses above will be then ‘understood’/interpreted by R as having numeric values. However, because I am a beginner in R–I don’t know if I’ll be able to accomplish all of what I perceive has to be cleaned within the suite of Homework 2 alone–as I will likely be working with this dataset (outside of the challenge assignments) all the way through to the final project, but I’ll try my best:)

Data Cleaning

1. Removing REf Column and the …1 or X Column

I am removing the REf column because I am unclear on what it represents, and I do not believe it will be useful for my purposes in tidying and working towards analyzing the snapshot this dataset provides into Atlantic hurricanes more broadly. I believe REf is potentially referring to listed/numbered footnote references in the original study. I sourced this dataset from Kaggle, https://www.kaggle.com/datasets/valery2042/hurricanes, and do not currently have access to the original study. The only information provided about sources on the dataset’s Kaggle page is, “I scraped Wikipedia pages of Atlantic hurricanes of Categoris 1,2,3 and 5 using pandas/html” (Liamtsau, 2022). I am removing the ..1/X column as well because this is simply the list/ID number of each hurricane as it is entered into the dataset, and since R Studio maintains its own list/ID number on the far left of the table I believe the X variable is no longer necessary. Also, since the first number in the X column is 0 for the first hurricane listed instead of 1, this can be confusing for some readers whose numbering convention starts with 1. Scrolling all the way to the end of the table (page 46 for me), we can see that the last value listed in the X column is 457, which is a slight mismatch from the 458 rows/observation values, which represented the total number of hurricanes included in the study, that R computed the dataset to have.

Code
# remove column named REf and the X Column

atlantic_hurricanes2 <- atlantic_hurricanes %>% select(-c(...1,REf))
  

atlantic_hurricanes2 

Looks like the REf and X (or …1) columns were successfully removed! There should now be 8 columns.

2. Separate Wind.speed into Wind.speed.mph and Wind.speed.kmh and Pressure into Pressure.hPa and Pressure.inHg

In the current version of the dataset, within the Wind.speed column, values for each hurricane’s wind speed are provided in miles per hour (mph) and kilometers per hour (km/h) in the same cell. Likewise, values for each hurricane’s pressure are provided in hPa (atmospheric pressure) and inHg (inch of Mercury). I would like to separate those values, so each unit of measurement for the wind speed and pressure, respectively has their own distinct columns.

Code
# separate the Wind.speed column into Wind.speed.mph and Wind.speed.kmh

atlantic_hurricanes3 <- separate(atlantic_hurricanes2, `Wind speed`, into = c("Wind.speed.mph", "Wind.speed.kmh"), sep = "\\(")

atlantic_hurricanes3
Code
# separate Pressure column into Pressure.hPa and Pressure.inHg

atlantic_hurricanes4 <- separate(atlantic_hurricanes3, Pressure, into = c("Pressure.hPa", "Pressure.inHg"), sep = " ")

atlantic_hurricanes4

Looks like each unit of measurement for a hurricane’s wind speed (Wind.speed.mph and Wind.speed.kmh) and a hurricane’s pressure (Pressure.hPa and Pressure.inHg) now have their own distinct columns!

3. Removing measurement unit abbreviations and unneeded parentheses from values in the Wind.speed.mph, Wind.speed.kmh, Pressure.hPa, and Pressure.inHg columns

I would like to remove the measurement unit abbreviations and unneeded parentheses from values in the Wind.speed.mph, Wind.speed.kmh, Pressure.hPa, and Pressure.inHg columns so that only the numbers/numeric values remain. Once R reads these columns as have numeric values, I’ll be able to run summary statistics and other relevant numeric related functions using them that’ll provide useful information to analyze.

Code
# removing "mph" from the end of values in the Wind.speed.mph column

atlantic_hurricanes5 <- mutate(atlantic_hurricanes4, Wind.speed.mph = as.numeric(str_extract(Wind.speed.mph,pattern="[:digit:]+")))

atlantic_hurricanes5
Code
# removing "km/h)" from the end of values in the Wind.speed.kmh column

atlantic_hurricanes6 <- mutate(atlantic_hurricanes5, Wind.speed.kmh = as.numeric(str_extract(Wind.speed.kmh,pattern = "[:digit:]+")))

atlantic_hurricanes6
Code
# removing commas from values in Pressure.hPa
# removing "hPa" from the end of values in the Pressure.hPa column

atlantic_hurricanes7<- mutate(atlantic_hurricanes6, Pressure.hPa = str_remove(Pressure.hPa, ","),
                              Pressure.hPa= as.numeric(str_extract(Pressure.hPa,pattern = "[:digit:]+")))
atlantic_hurricanes7
Code
# removing "(" and "inHg)" from values in Pressure.inHg column

#areas affected
n_areas_max <- max(str_count(atlantic_hurricanes7[!is.na(atlantic_hurricanes7$`Areas affected`),]$`Areas affected`, "[a-z],"))+2

#separate areas affected into multiple columns then pivot longer into one column, called area
atlantic_hurricanes8 <- atlantic_hurricanes7 %>%
  separate(`Areas affected`, into = paste0("a",1:n_areas_max), sep = ",") %>%
  pivot_longer(c(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11),names_to = "del", values_to= "area") %>%
  select(-del)%>%
  filter(!is.na(area))
atlantic_hurricanes8
Code
write_csv(atlantic_hurricanes8, "atlantic_hurricanes8.csv")

Reading Back in the Data Set after manually changing small things in Google Sheets

I made a few minor changes to the atlantic_hurricanes8 dataset in Google Sheets such as removing cross signs (a special character) from the end of some duration dates, changing the dashes in the duration column from double dash (–, a special character) to a single dash (-), duplicating rows to separate two or more different affected areas that previously were not listed with spaces between them and changing the name of the Category column to Max.category.

Code
# reading back in data set after manually changing some small things in Google Sheets
atlantic_hurricanes9 <- read_csv("_mysampledatasets/atlantic_hurricanes8_GoogleSheetsVersion.csv")
atlantic_hurricanes9

Additional Cleaning

Removing Wind.speed.kmh and Pressure.inHg columns and renaming Wind.speed.mph to Max.wind.speed.mph and Pressure.hPa into Max.pressure.hPa

Code
# Deleting Wind.speed.kmh and Pressure.inHg columns so that there's only one measure for wind speed (mph) and one measure for pressure (hPa)

atlantic_hurricanes10 <- atlantic_hurricanes9 %>%
  select(-c(Wind.speed.kmh, Pressure.inHg))
atlantic_hurricanes10
Code
atlantic_hurricanes11 <- atlantic_hurricanes10 %>%
  mutate(Start_Date=str_c(str_extract(Duration,"[:alpha:]+ [:digit:]+(?=,)"),
      str_extract(Duration,", [:digit:]+")))
atlantic_hurricanes11
Code
# renaming Wind.speed.mph to Max.wind.speed.mph and Pressure.hPa to Max.pressure.hPa
atlantic_hurricanes12 <- atlantic_hurricanes11 %>%
  rename(Max.wind.speed.mph=Wind.speed.mph)%>%
  rename(Max.pressure.hPa=Pressure.hPa)
atlantic_hurricanes12

Looks like Wind.speed.kmh and Pressure.inHg were removed successfully!

Tidying the Deaths column: changing “None” values to 0 and “Unknown” values to NA and changing Deaths column to read as a numeric value

Code
# change values that read in Deaths column as "None" to the number 0
atlantic_hurricanes12$Deaths <- str_replace(atlantic_hurricanes12$Deaths, "None", "0")
atlantic_hurricanes12
Code
# change values in Deaths column that read as "Unknown" to NA
atlantic_hurricanes12$Deaths <- na_if(atlantic_hurricanes12$Deaths, "Unknown")
atlantic_hurricanes12
Code
# change values in Deaths column that contain >1,000 to NA
atlantic_hurricanes12$Deaths <- na_if(atlantic_hurricanes12$Deaths, ">1,000")
atlantic_hurricanes12
Code
# change Deaths column to read as a numeric variable
atlantic_hurricanes13 <- transform(atlantic_hurricanes12, Deaths = as.numeric(Deaths))
atlantic_hurricanes13

As expected, looks like the Deaths column now reads as a numeric variable and the “None” value has been switched to 0 and the “Unknown” value has been switched to NA!

Tidying the Damage column

Currently, there are a handful of values in the Damage column that will not make for the clearest analysis. All of the values contained in the Damage column can be seen when running the unique () function, which I will do below. I will change values that do not make for the clearest analysis/are less straightforward to NA. I will also remove the dollar signs, assuming that the creator of this dataset used USD for all monetary values. This is part of the process of having Damage eventually read as a numeric variable–so I can compute summary statistics and do visualizations off of the dollar amount. Like the Deaths column, there is also a value in the Damage column called “None” but instead of changing that to 0 I will be changing that to NA because I find it hard to believe that a hurricane (even of a lower intensity) caused zero damage (as a qualitative descriptor) and/or $0 worth of damage as a more quantitative descriptor.

Code
# checking for all unique values in the Damage column
unique(atlantic_hurricanes13$Damage)
  [1] NA                                             
  [2] "$19 million"                                  
  [3] "$4.36 million"                                
  [4] "$2 million"                                   
  [5] "$365 thousand"                                
  [6] "$226 thousand"                                
  [7] "$500 thousand"                                
  [8] "$3 million"                                   
  [9] "$550 thousand"                                
 [10] "$2 thousand"                                  
 [11] "Unknown"                                      
 [12] "$790 thousand"                                
 [13] "$800 thousand"                                
 [14] "$900 thousand"                                
 [15] "$1.012 million"                               
 [16] "$1.3 million"                                 
 [17] "$250 thousand"                                
 [18] "$623 thousand"                                
 [19] "$50 thousand"                                 
 [20] "$24.9 million"                                
 [21] "$2.5 million"                                 
 [22] "$75 thousand"                                 
 [23] "$7 million"                                   
 [24] "$640 thousand"                                
 [25] "$50 million"                                  
 [26] "$1 million"                                   
 [27] "$150 thousand"                                
 [28] "$450,000"                                     
 [29] "$18.7 million"                                
 [30] "$30 million"                                  
 [31] "$5.1 million"                                 
 [32] "$30.2 million"                                
 [33] "$2.1 billion"                                 
 [34] "$6.2 million"                                 
 [35] "$13 million"                                  
 [36] "$6 million"                                   
 [37] "$20 million"                                  
 [38] "$15 million"                                  
 [39] "$85 million"                                  
 [40] "$152 million"                                 
 [41] "$100 million"                                 
 [42] "$1.5 billion"                                 
 [43] "$42 million"                                  
 [44] "$8 thousand"                                  
 [45] "$2.9 million"                                 
 [46] "$70 million"                                  
 [47] "$3.91 million"                                
 [48] "$200 million"                                 
 [49] "$594 million"                                 
 [50] "$1.7 million"                                 
 [51] "$203 million"                                 
 [52] "$8.2 million"                                 
 [53] "$735 thousand"                                
 [54] "$10.8 million"                                
 [55] "> 230 million"                                
 [56] "$1.4 million"                                 
 [57] "$5 million"                                   
 [58] "$181 million"                                 
 [59] "$100 thousand"                                
 [60] "$130 million"                                 
 [61] "$320 million"                                 
 [62] "$3.96 billion"                                
 [63] "$500 million"                                 
 [64] "$200 thousand"                                
 [65] "$92 million"                                  
 [66] "$580 million"                                 
 [67] "$160 million"                                 
 [68] "$57.1 million"                                
 [69] "None"                                         
 [70] "Heavy"                                        
 [71] "Minor"                                        
 [72] "$235 thousand"                                
 [73] "Moderate"                                     
 [74] "$7.5 million"                                 
 [75] "Millions"                                     
 [76] "$4.4 million"                                 
 [77] "$10 thousand"                                 
 [78] "$5.5 million"                                 
 [79] "$1.5 million"                                 
 [80] "$10.75 million"                               
 [81] "$4.05 million"                                
 [82] "$1.49 million"                                
 [83] "$9 million"                                   
 [84] "$17 million"                                  
 [85] "$419 thousand"                                
 [86] "$300 thousand"                                
 [87] "$750 thousand"                                
 [88] "$5.2 million"                                 
 [89] "$3.26 million"                                
 [90] "$6.7 million"                                 
 [91] "$21.7 million"                                
 [92] "$754.7 million"                               
 [93] "$3.58 million"                                
 [94] ">$100 thousand"                               
 [95] "$1.1 million"                                 
 [96] "$46.6 million"                                
 [97] "Minimal"                                      
 [98] "$10 million"                                  
 [99] "$1.8 billion"                                 
[100] "$8.9 million"                                 
[101] "Extensive"                                    
[102] "13"                                           
[103] "5"                                            
[104] "[11][100][101][102]"                          
[105] "[108][109]"                                   
[106] "[131][132][133]"                              
[107] "$1.3 billion"                                 
[108] "[142]"                                        
[109] "[144]"                                        
[110] "[159][160][161][162][163][164][165][166][167]"
[111] "[174]"                                        
[112] "$25 million"                                  
[113] "$4.4 billion"                                 
[114] "$513 million"                                 
[115] "$80 million"                                  
[116] "$40 million"                                  
[117] "$$27.9 million"                               
[118] "$306 million"                                 
[119] "$65.8 million"                                
[120] "$60.3 million"                                
[121] "$229 million"                                 
[122] "$208 million"                                 
[123] "$1.42 billion"                                
[124] "$25.4 million"                                
[125] "$1.54 billion"                                
[126] "$1.24 billion"                                
[127] "$7.1 billion"                                 
[128] "$10 billion"                                  
[129] "$26.5 billion"                                
[130] "$6.2 billion"                                 
[131] "$5.37 billion"                                
[132] "$23.3 billion"                                
[133] "$1.01 billion"                                
[134] "$125 billion"                                 
[135] "$12 billion"                                  
[136] "$29.4 billion"                                
[137] "$1.76 billion"                                
[138] "$720 million"                                 
[139] "$15.1 billion"                                
[140] "$64.8 billion"                                
[141] "$91.6 billion"                                
[142] "$25.1 billion"                                
[143] "$5 billion"                                   
[144] "$362 million"                                 
Code
# changing less clear/straightforward values to NA

atlantic_hurricanes13$Damage[startsWith(atlantic_hurricanes13$Damage, ">")] <- NA

atlantic_hurricanes13$Damage[startsWith(atlantic_hurricanes13$Damage, "[")] <- NA

atlantic_hurricanes13$Damage[startsWith(atlantic_hurricanes13$Damage, "M")] <- NA

atlantic_hurricanes13$Damage[startsWith(atlantic_hurricanes13$Damage, "H")] <- NA

atlantic_hurricanes13$Damage[startsWith(atlantic_hurricanes13$Damage, "U")] <- NA

atlantic_hurricanes13$Damage[startsWith(atlantic_hurricanes13$Damage, "None")] <- NA

atlantic_hurricanes13$Damage[startsWith(atlantic_hurricanes13$Damage, "E")] <- NA

atlantic_hurricanes13$Damage[startsWith(atlantic_hurricanes13$Damage, "13")] <- NA

atlantic_hurricanes13$Damage[startsWith(atlantic_hurricanes13$Damage, "5")] <- NA
Code
atlantic_hurricanes14 <- atlantic_hurricanes13
atlantic_hurricanes14
Code
unique(atlantic_hurricanes14$Damage)
  [1] NA               "$19 million"    "$4.36 million"  "$2 million"    
  [5] "$365 thousand"  "$226 thousand"  "$500 thousand"  "$3 million"    
  [9] "$550 thousand"  "$2 thousand"    "$790 thousand"  "$800 thousand" 
 [13] "$900 thousand"  "$1.012 million" "$1.3 million"   "$250 thousand" 
 [17] "$623 thousand"  "$50 thousand"   "$24.9 million"  "$2.5 million"  
 [21] "$75 thousand"   "$7 million"     "$640 thousand"  "$50 million"   
 [25] "$1 million"     "$150 thousand"  "$450,000"       "$18.7 million" 
 [29] "$30 million"    "$5.1 million"   "$30.2 million"  "$2.1 billion"  
 [33] "$6.2 million"   "$13 million"    "$6 million"     "$20 million"   
 [37] "$15 million"    "$85 million"    "$152 million"   "$100 million"  
 [41] "$1.5 billion"   "$42 million"    "$8 thousand"    "$2.9 million"  
 [45] "$70 million"    "$3.91 million"  "$200 million"   "$594 million"  
 [49] "$1.7 million"   "$203 million"   "$8.2 million"   "$735 thousand" 
 [53] "$10.8 million"  "$1.4 million"   "$5 million"     "$181 million"  
 [57] "$100 thousand"  "$130 million"   "$320 million"   "$3.96 billion" 
 [61] "$500 million"   "$200 thousand"  "$92 million"    "$580 million"  
 [65] "$160 million"   "$57.1 million"  "$235 thousand"  "$7.5 million"  
 [69] "$4.4 million"   "$10 thousand"   "$5.5 million"   "$1.5 million"  
 [73] "$10.75 million" "$4.05 million"  "$1.49 million"  "$9 million"    
 [77] "$17 million"    "$419 thousand"  "$300 thousand"  "$750 thousand" 
 [81] "$5.2 million"   "$3.26 million"  "$6.7 million"   "$21.7 million" 
 [85] "$754.7 million" "$3.58 million"  "$1.1 million"   "$46.6 million" 
 [89] "$10 million"    "$1.8 billion"   "$8.9 million"   "$1.3 billion"  
 [93] "$25 million"    "$4.4 billion"   "$513 million"   "$80 million"   
 [97] "$40 million"    "$$27.9 million" "$306 million"   "$65.8 million" 
[101] "$60.3 million"  "$229 million"   "$208 million"   "$1.42 billion" 
[105] "$25.4 million"  "$1.54 billion"  "$1.24 billion"  "$7.1 billion"  
[109] "$10 billion"    "$26.5 billion"  "$6.2 billion"   "$5.37 billion" 
[113] "$23.3 billion"  "$1.01 billion"  "$125 billion"   "$12 billion"   
[117] "$29.4 billion"  "$1.76 billion"  "$720 million"   "$15.1 billion" 
[121] "$64.8 billion"  "$91.6 billion"  "$25.1 billion"  "$5 billion"    
[125] "$362 million"  

Looks like all of the less clear/less straightforward values in the Damage column have been removed!

Code
# separating Damage into Damage.amount and Damage.unit
atlantic_hurricanes15 <- atlantic_hurricanes14%>%
  separate(Damage, c("Damage.$.amount", "Damage.$.unit"), " ")
atlantic_hurricanes15
Code
# removing $ dollar sign from Damage.$.amount column, making Damage.$.amount column a numeric variable
atlantic_hurricanes15$`Damage.$.amount`= as.numeric(gsub("\\$", "", atlantic_hurricanes15$`Damage.$.amount`))
atlantic_hurricanes15

Looks like the Damage column has been separated into two different columns, and the Damage.$.amount column is now a numeric variable!

Revisiting/Further Cleaning the area column

I’d like to change “No land areas” and “None” values within the area column to NA; with that said, in order to make that change, I think it’s fitting to rename the area column to Land.areas.affected, so NA in that case could mean the land areas affected were/are unknown or there were no land areas affected, as in the hurricane system only remained in open waters and did not formally make landfall at any land-based location/territory.

Code
# Renaming area to Land.areas.affected
atlantic_hurricanes16 <- atlantic_hurricanes15 %>%
  rename(Land.areas.affected=area)
atlantic_hurricanes16
Code
# viewing all unique values in Land.areas.affected
unique(atlantic_hurricanes16$Land.areas.affected)
  [1] "Central America"                    "Gulf of Mexico"                    
  [3] "Mexico"                             "None"                              
  [5] "Newfoundland"                       "Gulf Coast of the United States"   
  [7] "United States East Coast"           "Cuba"                              
  [9] "The Bahamas"                        "Cape Verde"                        
 [11] "Windward Islands"                   "Leeward Islands"                   
 [13] "United States Gulf Coast"           "Azores"                            
 [15] "Bermuda"                            "Newfoundland and Labrador"         
 [17] "Jamaica"                            "Haiti"                             
 [19] "Texas"                              "Yucatán Peninsula"                 
 [21] "Tamaulipas"                         "Veracruz"                          
 [23] "Puerto Rico"                        "Turks and Caicos Islands"          
 [25] "Eastern United States"              "Honduras"                          
 [27] "The Caribbean"                      "Nicaragua"                         
 [29] "North Carolina"                     "Mid-Atlantic States"               
 [31] "Belize"                             "Guatemala"                         
 [33] "Louisiana"                          "Mississippi"                       
 [35] "Midwestern United States"           "Saint Croix"                       
 [37] "Dominican Republic"                 "Georgia"                           
 [39] "Southwestern Florida"               "Florida"                           
 [41] "Bahamas"                            "South Carolina"                    
 [43] "Virginia"                           "Martinique"                        
 [45] "Saint Lucia"                        "Hispaniola"                        
 [47] "Atlantic Canada"                    "Virgin Islands"                    
 [49] "Sable Island"                       "Saba"                              
 [51] "Anguilla"                           "Lesser Antilles"                   
 [53] "Western Mexico"                     "Alabama"                           
 [55] "The Carolinas"                      "New England"                       
 [57] "Canadian Maritime Provinces"        "Oklahoma"                          
 [59] "Leeward Antilles"                   "Greater Antilles"                  
 [61] "Northeastern United States"         "Nova Scotia"                       
 [63] "Ireland"                            "United Kingdom"                    
 [65] "Norway"                             "Soviet Union"                      
 [67] "British Isles"                      "Quebec"                            
 [69] "Panama"                             "Costa Rica"                        
 [71] "Cayman Islands"                     "Southeastern United States"        
 [73] "northern Mexico"                    "western Cuba"                      
 [75] "Florida Panhandle"                  "Maryland"                          
 [77] "Pennsylvania"                       "New York"                          
 [79] "Maine"                              "Tennessee"                         
 [81] "North Carolina and Virginia"        "St. Lucia"                         
 [83] "Barbados"                           "Grenada"                           
 [85] "central United States"              "Canada"                            
 [87] "Eastern Canada"                     "Iberian Peninsula"                 
 [89] "Trinidad and Tobago"                "Venezuela"                         
 [91] "Colombia"                           "Socorro Island"                    
 [93] "El Salvador"                        "Northern Mexico"                   
 [95] "Southern Texas"                     "Delaware"                          
 [97] "Massachusetts"                      "Scotland"                          
 [99] "Eastern Coast of the United States" "Europe"                            
[101] "Madeira Islands"                    "Southern Portugal"                 
[103] "Southwestern Spain"                 "Iceland"                           
[105] "Greenland"                          "Central Mexico"                    
[107] "Western Europe"                     "Tabasco"                           
[109] "Guadeloupe"                         "Montserrat"                        
[111] "Saint Thomas"                       "Trinidad"                          
[113] "Quintana Roo"                       "Tampico"                           
[115] "Chiapas"                            "Arkansas"                          
[117] "Campeche"                           "New Jersey"                        
[119] "Dominica"                           "Spain"                             
[121] "France"                             "Antigua"                           
[123] "Barbuda"                            "South Central United States"       
[125] "Lucayan Archipelago"                "West Africa"                       
[127] "Faroe Islands"                      "East Coast of the United States"   
[129] "No land areas"                      "Central United States"             
[131] "Northern Europe"                    "Turks and Caicos"                  
[133] "Yucatán peninsula"                  "No Land Areas"                     
[135] "United States Virgin Islands"       "Northeastern Caribbean"            
[137] "West Virginia"                      "Great Britain"                     
[139] "Southeast Mexico"                   "Cape Verde Islands"                
[141] "Antilles"                           "Portugal"                          
[143] "Mid-Atlantic"                       "Southwestern Quebec"               
[145] "United States East coast"           "South Texas"                       
[147] "South Florida"                      "Ontario"                           
[149] "Cabo Verde"                        
Code
# changing "None" and "No land areas" and "No Land Areas" to NA
atlantic_hurricanes16$Land.areas.affected[startsWith(atlantic_hurricanes16$Land.areas.affected, "None")] <- NA

atlantic_hurricanes16$Land.areas.affected[startsWith(atlantic_hurricanes16$Land.areas.affected, "No land areas")] <- NA

atlantic_hurricanes16$Land.areas.affected[startsWith(atlantic_hurricanes16$Land.areas.affected, "No Land Areas")] <- NA

#re-checking for NA in Land.areas.affected column
atlantic_hurricanes17 <- atlantic_hurricanes16
atlantic_hurricanes17
Code
unique(atlantic_hurricanes17$Land.areas.affected)
  [1] "Central America"                    "Gulf of Mexico"                    
  [3] "Mexico"                             NA                                  
  [5] "Newfoundland"                       "Gulf Coast of the United States"   
  [7] "United States East Coast"           "Cuba"                              
  [9] "The Bahamas"                        "Cape Verde"                        
 [11] "Windward Islands"                   "Leeward Islands"                   
 [13] "United States Gulf Coast"           "Azores"                            
 [15] "Bermuda"                            "Newfoundland and Labrador"         
 [17] "Jamaica"                            "Haiti"                             
 [19] "Texas"                              "Yucatán Peninsula"                 
 [21] "Tamaulipas"                         "Veracruz"                          
 [23] "Puerto Rico"                        "Turks and Caicos Islands"          
 [25] "Eastern United States"              "Honduras"                          
 [27] "The Caribbean"                      "Nicaragua"                         
 [29] "North Carolina"                     "Mid-Atlantic States"               
 [31] "Belize"                             "Guatemala"                         
 [33] "Louisiana"                          "Mississippi"                       
 [35] "Midwestern United States"           "Saint Croix"                       
 [37] "Dominican Republic"                 "Georgia"                           
 [39] "Southwestern Florida"               "Florida"                           
 [41] "Bahamas"                            "South Carolina"                    
 [43] "Virginia"                           "Martinique"                        
 [45] "Saint Lucia"                        "Hispaniola"                        
 [47] "Atlantic Canada"                    "Virgin Islands"                    
 [49] "Sable Island"                       "Saba"                              
 [51] "Anguilla"                           "Lesser Antilles"                   
 [53] "Western Mexico"                     "Alabama"                           
 [55] "The Carolinas"                      "New England"                       
 [57] "Canadian Maritime Provinces"        "Oklahoma"                          
 [59] "Leeward Antilles"                   "Greater Antilles"                  
 [61] "Northeastern United States"         "Nova Scotia"                       
 [63] "Ireland"                            "United Kingdom"                    
 [65] "Norway"                             "Soviet Union"                      
 [67] "British Isles"                      "Quebec"                            
 [69] "Panama"                             "Costa Rica"                        
 [71] "Cayman Islands"                     "Southeastern United States"        
 [73] "northern Mexico"                    "western Cuba"                      
 [75] "Florida Panhandle"                  "Maryland"                          
 [77] "Pennsylvania"                       "New York"                          
 [79] "Maine"                              "Tennessee"                         
 [81] "North Carolina and Virginia"        "St. Lucia"                         
 [83] "Barbados"                           "Grenada"                           
 [85] "central United States"              "Canada"                            
 [87] "Eastern Canada"                     "Iberian Peninsula"                 
 [89] "Trinidad and Tobago"                "Venezuela"                         
 [91] "Colombia"                           "Socorro Island"                    
 [93] "El Salvador"                        "Northern Mexico"                   
 [95] "Southern Texas"                     "Delaware"                          
 [97] "Massachusetts"                      "Scotland"                          
 [99] "Eastern Coast of the United States" "Europe"                            
[101] "Madeira Islands"                    "Southern Portugal"                 
[103] "Southwestern Spain"                 "Iceland"                           
[105] "Greenland"                          "Central Mexico"                    
[107] "Western Europe"                     "Tabasco"                           
[109] "Guadeloupe"                         "Montserrat"                        
[111] "Saint Thomas"                       "Trinidad"                          
[113] "Quintana Roo"                       "Tampico"                           
[115] "Chiapas"                            "Arkansas"                          
[117] "Campeche"                           "New Jersey"                        
[119] "Dominica"                           "Spain"                             
[121] "France"                             "Antigua"                           
[123] "Barbuda"                            "South Central United States"       
[125] "Lucayan Archipelago"                "West Africa"                       
[127] "Faroe Islands"                      "East Coast of the United States"   
[129] "Central United States"              "Northern Europe"                   
[131] "Turks and Caicos"                   "Yucatán peninsula"                 
[133] "United States Virgin Islands"       "Northeastern Caribbean"            
[135] "West Virginia"                      "Great Britain"                     
[137] "Southeast Mexico"                   "Cape Verde Islands"                
[139] "Antilles"                           "Portugal"                          
[141] "Mid-Atlantic"                       "Southwestern Quebec"               
[143] "United States East coast"           "South Texas"                       
[145] "South Florida"                      "Ontario"                           
[147] "Cabo Verde"                        

Looks like modifications to the Land.areas.affected column were made successfully as expected!

Summary Statistics

summary stats about Max.wind.speed.mph

Code
# summary stats about Max.wind.speed.mph
summary(atlantic_hurricanes17$Max.wind.speed.mph)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   75.0    85.0   100.0   104.9   115.0   190.0 

summary stats about Max.pressure.hPa

Code
#summary stats about Max.pressure.hPa
summary(atlantic_hurricanes17$Max.pressure.hPa)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  882.0   960.0   975.0   968.8   985.0  1007.0       8 

summary stats about Deaths

Code
#summary stats about Deaths
summary(atlantic_hurricanes17$Deaths)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
    0.0     0.0     5.0   224.7    27.0 19325.0      81 

summary stats about Land.areas.affected

Code
# summary stats about total number of distinct land areas affected
atlantic_hurricanes17%>% summarise(count = n_distinct(Land.areas.affected))
Code
# summary of how many times each land area was featured
table(atlantic_hurricanes17$Land.areas.affected)

                           Alabama                           Anguilla 
                                 8                                  2 
                           Antigua                           Antilles 
                                 1                                  2 
                          Arkansas                    Atlantic Canada 
                                 1                                 37 
                            Azores                            Bahamas 
                                23                                 24 
                          Barbados                            Barbuda 
                                 1                                  1 
                            Belize                            Bermuda 
                                12                                 57 
                     British Isles                         Cabo Verde 
                                 4                                  1 
                          Campeche                             Canada 
                                 1                                  5 
       Canadian Maritime Provinces                         Cape Verde 
                                 1                                  8 
                Cape Verde Islands                     Cayman Islands 
                                 1                                  7 
                   Central America                     Central Mexico 
                                26                                  2 
             central United States              Central United States 
                                 1                                  1 
                           Chiapas                           Colombia 
                                 1                                  4 
                        Costa Rica                               Cuba 
                                 3                                 49 
                          Delaware                           Dominica 
                                 1                                  2 
                Dominican Republic    East Coast of the United States 
                                 6                                 16 
                    Eastern Canada Eastern Coast of the United States 
                                 3                                  1 
             Eastern United States                        El Salvador 
                                 6                                  2 
                            Europe                      Faroe Islands 
                                 1                                  1 
                           Florida                  Florida Panhandle 
                                53                                  1 
                            France                            Georgia 
                                 1                                 12 
                     Great Britain                   Greater Antilles 
                                 1                                 13 
                         Greenland                            Grenada 
                                 3                                  1 
                        Guadeloupe                          Guatemala 
                                 4                                  9 
   Gulf Coast of the United States                     Gulf of Mexico 
                                 7                                  3 
                             Haiti                         Hispaniola 
                                 4                                 17 
                          Honduras                  Iberian Peninsula 
                                 8                                  2 
                           Iceland                            Ireland 
                                 4                                  3 
                           Jamaica                   Leeward Antilles 
                                22                                  4 
                   Leeward Islands                    Lesser Antilles 
                                16                                 19 
                         Louisiana                Lucayan Archipelago 
                                19                                  1 
                   Madeira Islands                              Maine 
                                 1                                  1 
                        Martinique                           Maryland 
                                 2                                  3 
                     Massachusetts                             Mexico 
                                 1                                 40 
                      Mid-Atlantic                Mid-Atlantic States 
                                 1                                  8 
          Midwestern United States                        Mississippi 
                                 4                                  9 
                        Montserrat                        New England 
                                 2                                 10 
                        New Jersey                           New York 
                                 3                                  3 
                      Newfoundland          Newfoundland and Labrador 
                                24                                  1 
                         Nicaragua                     North Carolina 
                                 7                                 22 
       North Carolina and Virginia             Northeastern Caribbean 
                                 1                                  1 
        Northeastern United States                    Northern Europe 
                                 3                                  1 
                   northern Mexico                    Northern Mexico 
                                 1                                  1 
                            Norway                        Nova Scotia 
                                 3                                 15 
                          Oklahoma                            Ontario 
                                 2                                  1 
                            Panama                       Pennsylvania 
                                 3                                  3 
                          Portugal                        Puerto Rico 
                                 1                                 28 
                            Quebec                       Quintana Roo 
                                 2                                  2 
                              Saba                       Sable Island 
                                 1                                  2 
                       Saint Croix                        Saint Lucia 
                                 1                                  1 
                      Saint Thomas                           Scotland 
                                 1                                  1 
                    Socorro Island                     South Carolina 
                                 1                                  8 
       South Central United States                      South Florida 
                                 1                                  1 
                       South Texas                   Southeast Mexico 
                                 1                                  1 
        Southeastern United States                  Southern Portugal 
                                 9                                  1 
                    Southern Texas               Southwestern Florida 
                                 1                                  1 
               Southwestern Quebec                 Southwestern Spain 
                                 1                                  1 
                      Soviet Union                              Spain 
                                 1                                  2 
                         St. Lucia                            Tabasco 
                                 1                                  2 
                        Tamaulipas                            Tampico 
                                 4                                  1 
                         Tennessee                              Texas 
                                 3                                 25 
                       The Bahamas                      The Caribbean 
                                31                                 11 
                     The Carolinas                           Trinidad 
                                12                                  1 
               Trinidad and Tobago                   Turks and Caicos 
                                 2                                  1 
          Turks and Caicos Islands                     United Kingdom 
                                 9                                  4 
          United States East coast           United States East Coast 
                                 1                                 32 
          United States Gulf Coast       United States Virgin Islands 
                                18                                  1 
                         Venezuela                           Veracruz 
                                 6                                  4 
                    Virgin Islands                           Virginia 
                                 3                                 11 
                       West Africa                      West Virginia 
                                 1                                  1 
                      western Cuba                     Western Europe 
                                 2                                  2 
                    Western Mexico                   Windward Islands 
                                 2                                 13 
                 Yucatán peninsula                  Yucatán Peninsula 
                                 1                                 23 

summary stats about Max.category

Code
# summary stats about total number of distinct Max categories featured
atlantic_hurricanes17%>% summarize(count = n_distinct(Max.category))
Code
# summary stats about how many times each distinct Max category is featured
table(atlantic_hurricanes17$Max.category)

  1   2   3   5 
489 291 216 117 

summary stats about Name

Code
#summary stats about total number of distinct hurricane names featured
atlantic_hurricanes17%>% summarize(count = n_distinct(Name))
Code
#summary stats about how many times each hurricane name was featured-->there are repeats because of how we pivoted land areas affected earlier but also hurricane names are periodically recycled
table(atlantic_hurricanes17$Name)

                     "Bahamas"                     "Camagüey" 
                             2                              6 
            "Cuba-Brownsville"                         "Cuba" 
                             5                              5 
                   "Labor Day"                  "New England" 
                             5                              2 
     "San Felipe IIOkeechobee"                      "Tampico" 
                             4                              2 
          1928 Haiti hurricane 1932 Florida-Alabama hurricane 
                             3                              1 
 1933 Florida-Mexico hurricane         1935 Jérémie hurricane 
                             3                              4 
            1991 Perfect Storm                           Abby 
                             3                             14 
                          Able                          Agnes 
                            13                             10 
                       Alberto                           Alex 
                             3                              5 
                         Alice                         Alicia 
                             6                              1 
                         Allen                        Allison 
                             4                              6 
                          Alma                         Andrew 
                             9                              3 
                         Anita                           Anna 
                             1                              7 
                        Arlene                         Arthur 
                             5                              3 
                        Audrey                           Babe 
                             2                              8 
                         Baker                        Barbara 
                             6                              2 
                         Barry                          Becky 
                             3                              2 
                         Belle                         Bertha 
                             1                              9 
                          Beta                           Beth 
                             1                              4 
                         Betsy                          Betty 
                             3                              1 
                        Beulah                           Bill 
                             4                              1 
                       Blanche                            Bob 
                             3                              7 
                        Bonnie                         Brenda 
                             8                              3 
                       Camille                        Candice 
                             2                              1 
                         Carol                       Caroline 
                             6                              3 
                         Celia                          Cesar 
                             2                             10 
                       Chantal                        Charley 
                             5                              4 
                       Charlie                          Chloe 
                             5                              3 
                         Chris                          Cindy 
                             3                              9 
                         Clara                      Claudette 
                             2                              7 
                          Cleo                           Cora 
                             4                              4 
                         Daisy                       Danielle 
                             4                              5 
                         Danny                          David 
                            12                              2 
                          Dawn                           Dean 
                             1                              6 
                        Debbie                          Debby 
                             7                              7 
                         Debra                         Dennis 
                             5                             10 
                         Diana                          Diane 
                             1                              5 
                           Dog                          Dolly 
                             4                             14 
                         Doria                         Dorian 
                             1                              5 
                         Doris                        Dorothy 
                             1                              3 
                          Earl                           Easy 
                             4                              3 
                         Edith                           Edna 
                            10                              3 
                       Edouard                          Eight 
                             1                              7 
                      Eighteen                          Elena 
                             5                              3 
                        Eleven                           Ella 
                            13                             11 
                         Ellen                         Eloise 
                             1                              5 
                         Emily                           Emmy 
                            10                              2 
                       Epsilon                          Erika 
                             2                              6 
                          Erin                        Ernesto 
                             7                             10 
             Escuminac (Three)                         Esther 
                             2                              1 
                         Ethel                         Evelyn 
                             2                              3 
                         Faith                           Faye 
                             2                              1 
                         Felix                           Fern 
                             4                              5 
                          Fifi                        Fifteen 
                             5                              3 
                          Five                          Flora 
                            12                              2 
                      Florence                        Flossie 
                            16                              1 
                        Flossy                          Floyd 
                             9                              6 
                          Four                            Fox 
                            22                              5 
                          Fran                      Francelia 
                             6                              1 
                       Frances                           Fred 
                             5                              1 
                        Frieda                      Gabrielle 
                             2                              2 
                          Gail                         Gaston 
                             1                              6 
                        George                        Georges 
                             1                              1 
                         Gerda                           Gert 
                             2                              5 
                      Gertrude                        Gilbert 
                             1                              5 
                        Ginger                          Ginny 
                             2                              6 
                        Gladys                         Gloria 
                             6                              2 
                        Gordon                          Grace 
                            19                              5 
                         Greta                         Gustav 
                             4                              4 
                         Hanna                         Hannah 
                             6                              1 
                        Harvey                         Hattie 
                             1                              1 
                         Hazel                          Heidi 
                             2                              1 
                        Helene                          Henri 
                             4                              2 
                         Hilda                          Holly 
                             2                              2 
                      Hortense                            How 
                             1                              2 
                          Hugo                       Humberto 
                             2                             12 
                           Ida                           Ilsa 
                             4                              1 
                          Inez                           Inga 
                             3                              1 
                         Irene                           Iris 
                            13                              3 
                          Irma                          Isaac 
                             5                              1 
                        Isabel                         Isbell 
                             4                              2 
                       Isidore                           Item 
                             5                              2 
                          Ivan                          Janet 
                             5                              2 
                        Janice                         Jeanne 
                             3                              8 
                         Jenny                          Jerry 
                             2                              3 
                           Jig                           Jose 
                             3                              2 
                     Josephine                          Joyce 
                             2                              3 
                          Juan                         Judith 
                             4                              1 
                          Kara                          Karen 
                             1                              3 
                          Karl                           Kate 
                             4                              3 
                         Katia                          Katie 
                             1                              7 
                       Katrina                         Kendra 
                             7                              2 
                          King                           Kirk 
                             3                              1 
                         Klaus                           Kyle 
                             8                             10 
                         Larry                         Laurie 
                             2                              1 
                           Lee                           Lili 
                             1                             12 
                          Lisa                           Lois 
                             2                              1 
                       Lorenzo                           Love 
                             4                              4 
                         Marco                          Maria 
                             5                              8 
                       Marilyn                         Martha 
                             5                              2 
                       Matthew                        Michael 
                             5                              6 
                         Mitch                           Nana 
                             3                              1 
                          Nate                         Nicole 
                             4                              1 
                          Nine                       Nineteen 
                             4                              1 
                          Noel                           Olga 
                             9                              4 
                           One                        Ophelia 
                            20                              9 
                         Oscar                           Otto 
                             1                              3 
                         Paula                       Paulette 
                             6                              1 
                      Philippe                        Richard 
                             1                              4 
                          Rina                           Rita 
                             2                              2 
                       Roxanne                          Sally 
                             3                              3 
                     San Pedro                          Sandy 
                             2                              3 
                         Seven                      Seventeen 
                            11                              1 
                           Six                           Stan 
                            10                              7 
                         Tanya                            Ten 
                             1                             12 
                      Thirteen                          Three 
                             8                             13 
                         Tomas                         Twelve 
                             4                              2 
                           Two                        Unnamed 
                            21                             35 
                         Vince                          Wilma 
                             3                              3 
                          Zeta 
                             8 

summary stats about Damage.$.units

Code
# summary stats number of distinct Damage.$.units
atlantic_hurricanes17%>% summarise(count = n_distinct(`Damage.$.unit`))
Code
# summary stats of each Damage.$.units featured-->should show that most of the hurricanes cost in the millions range
table(atlantic_hurricanes17$`Damage.$.unit`)

 billion  million thousand 
      97      351       95 

Note: Fully Tidying both the Duration column and the Damage.$.amount column is still in progress! *this and is relevant summary stats and visualizations incorporating them will be fully updated by the final project due date

Data Visualizations

making temporary modifications to duration column and pairing down the number of years from 1920-2020 down to 2000-2020 for visualization purposes in this draft

Code
# Modifying the Duration column into Date and Year
atlantic_hurricanes18 <- atlantic_hurricanes17 %>%
  separate(Duration, c("Date", "Year"), ", ")
atlantic_hurricanes18
Code
# Editing down to only focus on years 2000-2020
atlantic_hurricanes19 <- atlantic_hurricanes18[atlantic_hurricanes18$Year >= "2000" & atlantic_hurricanes18$Year <= "2020",]
atlantic_hurricanes19

Connected Scatterplot of Correlation between Max.wind.speed.mph and Max.pressure.hPa

Code
# Scatterplot of Correlation between Max.wind.speed.mph and Max.pressure.hPa
ggplot(atlantic_hurricanes19, aes(x= Max.wind.speed.mph, y= Max.pressure.hPa))+
  geom_point(size=3, fill="black", color="black")+
  geom_line(color= "red", size= 1)+
  labs(title = "Connected Scatterplot of Correlation between Max.wind.speed.mph & Max.pressure.hPa", subtitle = "2000-2020", x= " Max Wind Speed (mph)", y="Max Pressure (hPa)")+
  theme(axis.text.x = element_text(angle = 30, size = 2))+
  facet_grid()+
  theme_minimal()

For the most part, it seems as though as Wind Speed (mph) increases, Pressure (hPa) decreases–meaning that more severe hurricanes (of a higher category) have lower pressure (hPa) measures.

Bar Plot of Max Category by Max.wind.speed.mph sorted by Hurricane Name, 2018-2020

Code
#pair down to storms between 2018-2020

atlantic_hurricanes20 <- atlantic_hurricanes19[atlantic_hurricanes19$Year >= "2018" & atlantic_hurricanes19$Year <= "2020",]
atlantic_hurricanes20
Code
# bar plot of Name by Max.wind.speed.mph
ggplot(atlantic_hurricanes20, aes(x= Max.category, y= Max.wind.speed.mph, fill= Name))+
  geom_bar(stat = "identity", position= position_dodge(width = .6), width = 0.4)+
  labs(title = "Bar Plot of Max Category by Max Wind Speed sorted by Hurricane Name", subtitle= "2018-2020, **Category 4 data not included in this dataset", x= "Max Category (no storms w/ max cat. 1 in these years)", y="Max Wind Speed (mph)", fill= "Hurricane Name")+
  facet_grid()+
  theme_light()

Looks like half of the hurricanes (6 out of 12) from 2018-2020 were Category 2 storms with max wind speeds hovering slightly over 100 mph.

Bar Plot of Hurricane Frequency over the Years, 2000-2020 sorted by Damage.$.unit

Code
# Bar Plot of Hurricane Count over the Years, 2000-2020, sorted by Damage.$.unit
ggplot(data = atlantic_hurricanes19, aes(x= Year, fill= `Damage.$.unit`))+
  geom_bar(width= .2, position= position_dodge(width = .4))+
  facet_grid()+
  theme(axis.text.x = element_text(angle = 30, size = 7))+
  labs(title= "Bar Plot of Hurricane Count by Year sorted by unit of Damage Cost ($USD)", subtitle= "2000-2020", x= "Year", y= "Count", fill= "Damage.$.unit")

Overall, notwithstanding the damage $ unit values that are NA or non-available, it seems like the damage from hurricanes across 2000-2020 is becoming more expensive–even in more recent years where it may seem like there’s slightly less hurricanes per year (compared to a year like 2005), more hurricanes are more consistently causing damage in the billions of dollars…the last recorded hurricanes where the damage cost stayed within the thousands unit was in the year 2006–over 15 years ago.

Potential Research Questions This Dataset Could Help Answer

Do duration, wind speed, pressure, and hurricane category classification have any correlation with deaths and damage costs caused by hurricanes? If so, how so? Which areas are most susceptible to the most severe and devastating hurricanes in the century this dataset covers? On the other hand, there are some limitations to this dataset. Something I noticed this dataset doesn’t answer is, at what point in the duration of the hurricane were these measures of wind speed and pressure taken? While I’m assuming it is a listing of the highest intensity recorded wind speed and pressure throughout the course of the storm, this isn’t clear. Moreover, for the hurricanes that impacted multiple areas, the windspeed and pressure was likely not the exact same measure in each successive location the hurricane made landfall in. While insights can be deduced from this dataset, especially after additional, forthcoming stages of cleaning the data, studying each storm and each area individually will provide additional, more full and nuanced context(s) of systems, resources, lived experiences, etc that influenced and/or were influenced by the hurricane(s).

References

Liamtsau, V. (2022, August 5). Atlantic hurricanes (data cleaning challenge). Kaggle. https://www.kaggle.com/datasets/valery2042/hurricanes