Challenge 4

challenge_4

abc_poll

eggs

fed_rates

hotel_bookings

debt

Author

Matthew O’Neill

Published

August 18, 2022


library(tidyverse)
library(lubridate)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Challenge Overview

Today’s challenge is to:

read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
tidy data (as needed, including sanity checks)
identify variables that need to be mutated
mutate variables and sanity check all mutations

Read in data


data = read_csv("../posts/_data/FedFundsRate.csv")
data

# A tibble: 904 × 10
    Year Month   Day Federal F…¹ Feder…² Feder…³ Effec…⁴ Real …⁵ Unemp…⁶ Infla…⁷
   <dbl> <dbl> <dbl>       <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1  1954     7     1          NA      NA      NA    0.8      4.6     5.8      NA
 2  1954     8     1          NA      NA      NA    1.22    NA       6        NA
 3  1954     9     1          NA      NA      NA    1.06    NA       6.1      NA
 4  1954    10     1          NA      NA      NA    0.85     8       5.7      NA
 5  1954    11     1          NA      NA      NA    0.83    NA       5.3      NA
 6  1954    12     1          NA      NA      NA    1.28    NA       5        NA
 7  1955     1     1          NA      NA      NA    1.39    11.9     4.9      NA
 8  1955     2     1          NA      NA      NA    1.29    NA       4.7      NA
 9  1955     3     1          NA      NA      NA    1.35    NA       4.6      NA
10  1955     4     1          NA      NA      NA    1.43     6.7     4.7      NA
# … with 894 more rows, and abbreviated variable names
#   ¹`Federal Funds Target Rate`, ²`Federal Funds Upper Target`,
#   ³`Federal Funds Lower Target`, ⁴`Effective Federal Funds Rate`,
#   ⁵`Real GDP (Percent Change)`, ⁶`Unemployment Rate`, ⁷`Inflation Rate`

Briefly describe the data

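The data describe monthly U.S. economic indicators from 1954 to 2017 in 904 rows and 10 columns. Alongside the year, month, and day, each row records the federal funds target rate (or its upper and lower target bounds), the effective federal funds rate, the percent change in real GDP from the previous quarter, the unemployment rate for that month, and the year-over-year inflation rate.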

Tidy Data


sum(is.na(data$"Federal Funds Target Rate"))

[1] 442

Of the 904 rows, 442 contain NA for the target rate. The dataset does not record a target rate before 1982, and in late 2008 the single target rate was replaced with an upper and lower bound for the federal funds rate to fall between.
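One way to check where the target rate is actually recorded is to look at the first and last year with a non-missing value. The sketch below runs on a small made-up tibble (`toy`, with illustrative values only) rather than the real file; the same pipeline should apply to `data` before the `Year` column is dropped.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the real data; the values are illustrative, not taken from the file
toy <- tibble(
  Year = c(1954, 1982, 2008, 2009),
  `Federal Funds Target Rate` = c(NA, 10.25, 1.00, NA)
)

# Keep only rows with a recorded target rate, then report the span of years
target_span <- toy %>%
  filter(!is.na(`Federal Funds Target Rate`)) %>%
  summarise(first_year = min(Year), last_year = max(Year))

target_span
```

Comparing the reported span against the count from `sum(is.na(...))` gives a quick sanity check that the missing values sit at the two ends of the series rather than being scattered through it.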

If we were interested in working with the target rate, we could find the midpoint of the lower and upper bound of the target rate and treat that as an estimated target rate.
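A minimal sketch of that midpoint idea, again on a made-up tibble with illustrative values. The column names `target_midpoint` and `target_estimate` are assumptions introduced here, not part of the dataset.

```r
library(dplyr)
library(tibble)

# Toy stand-in: one row before targets existed, one with a single target,
# one with only an upper/lower range (values are illustrative)
toy <- tibble(
  `Federal Funds Target Rate`  = c(NA, 5.25, NA),
  `Federal Funds Upper Target` = c(NA, NA, 0.25),
  `Federal Funds Lower Target` = c(NA, NA, 0.00)
)

toy_est <- toy %>%
  mutate(
    # midpoint of the target range; NA wherever no range is recorded
    target_midpoint = (`Federal Funds Upper Target` + `Federal Funds Lower Target`) / 2,
    # prefer the recorded target rate, fall back to the midpoint
    target_estimate = coalesce(`Federal Funds Target Rate`, target_midpoint)
  )

toy_est
```

Rows with neither a target rate nor a target range stay NA, so the estimate does not invent values for the early years.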

For now, though, the data is tidy enough to work with, and there are no unnecessary rows or columns to remove.

Identify variables that need to be mutated

The dates do, however, need to be mutated: the date is currently split across separate Year, Month, and Day columns. This is redundant, so we can combine them with the str_c() function and parse the result with ymd(). We do need to include the day, as not all rows fall on the first day of the month.


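data <- data %>%
  # combine Year/Month/Day into one string and parse it as a Date
  mutate(date = ymd(str_c(Year, Month, Day, sep = "/"))) %>%
  # move date to the front and drop the now-redundant columns by name
  select(date, everything(), -c(Year, Month, Day))
data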

# A tibble: 904 × 8
   date       Federal Funds Ta…¹ Feder…² Feder…³ Effec…⁴ Real …⁵ Unemp…⁶ Infla…⁷
   <date>                  <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1 1954-07-01                 NA      NA      NA    0.8      4.6     5.8      NA
 2 1954-08-01                 NA      NA      NA    1.22    NA       6        NA
 3 1954-09-01                 NA      NA      NA    1.06    NA       6.1      NA
 4 1954-10-01                 NA      NA      NA    0.85     8       5.7      NA
 5 1954-11-01                 NA      NA      NA    0.83    NA       5.3      NA
 6 1954-12-01                 NA      NA      NA    1.28    NA       5        NA
 7 1955-01-01                 NA      NA      NA    1.39    11.9     4.9      NA
 8 1955-02-01                 NA      NA      NA    1.29    NA       4.7      NA
 9 1955-03-01                 NA      NA      NA    1.35    NA       4.6      NA
10 1955-04-01                 NA      NA      NA    1.43     6.7     4.7      NA
# … with 894 more rows, and abbreviated variable names
#   ¹`Federal Funds Target Rate`, ²`Federal Funds Upper Target`,
#   ³`Federal Funds Lower Target`, ⁴`Effective Federal Funds Rate`,
#   ⁵`Real GDP (Percent Change)`, ⁶`Unemployment Rate`, ⁷`Inflation Rate`

We mutated the dataset to include a date column, moved that date column to the front, and removed the now redundant year, month, and day columns.
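The challenge also asks for a sanity check on each mutation. One possible check, sketched here on a made-up tibble (`toy`, illustrative values only), is to confirm that no date failed to parse and that the parsed date round-trips to the original components. For this check to work on the real data, it would need to run before the Year, Month, and Day columns are dropped.

```r
library(dplyr)
library(stringr)
library(lubridate)
library(tibble)

# Toy stand-in with the same three date columns (values are illustrative)
toy <- tibble(
  Year  = c(1954, 1982, 2008),
  Month = c(7, 9, 12),
  Day   = c(1, 27, 16)
)

toy_dated <- toy %>%
  mutate(date = ymd(str_c(Year, Month, Day, sep = "/")))

# 1. ymd() returns NA for strings it cannot parse, so count them
n_failed <- sum(is.na(toy_dated$date))

# 2. the parsed date should give back the original year, month, and day
round_trip_ok <- all(
  year(toy_dated$date)  == toy_dated$Year,
  month(toy_dated$date) == toy_dated$Month,
  day(toy_dated$date)   == toy_dated$Day
)

n_failed
round_trip_ok
range(toy_dated$date)
```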

Additional Comments

The rest of the data is well formatted, and there don’t appear to be any additional redundancies to remove. If we were working with GDP, we might consider reducing the data to quarterly observations, but all other metrics are recorded on a monthly basis, so it doesn’t make sense to do so preemptively.
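If a quarterly view were needed later, one simple approach (a sketch, assuming GDP is only filled in on the rows that start a quarter, as the printed output suggests) is to keep the rows where the GDP column is not missing. The tibble below is made up for illustration.

```r
library(dplyr)
library(tibble)

# Toy stand-in: GDP is only reported on the first month of each quarter
toy <- tibble(
  date = as.Date(c("1954-07-01", "1954-08-01", "1954-09-01", "1954-10-01")),
  `Real GDP (Percent Change)` = c(4.6, NA, NA, 8.0),
  `Unemployment Rate` = c(5.8, 6.0, 6.1, 5.7)
)

# Keep one row per quarter: the rows that carry a GDP value
quarterly <- toy %>%
  filter(!is.na(`Real GDP (Percent Change)`))

quarterly
```

This keeps the other columns as observed in the first month of the quarter; averaging them over the quarter would be a different design choice.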