Challenge 4

challenge_4

FedFundsRate

More data wrangling: pivoting

Author

Mekhala Kumar

Published

August 18, 2022

Code

library(tidyverse)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Reading in data

The dataset being used is FedFundsRate.

Code

library(readr)
FedFundsRate <- read_csv("_data/FedFundsRate.csv")
str(FedFundsRate)

spec_tbl_df [904 × 10] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ Year                        : num [1:904] 1954 1954 1954 1954 1954 ...
 $ Month                       : num [1:904] 7 8 9 10 11 12 1 2 3 4 ...
 $ Day                         : num [1:904] 1 1 1 1 1 1 1 1 1 1 ...
 $ Federal Funds Target Rate   : num [1:904] NA NA NA NA NA NA NA NA NA NA ...
 $ Federal Funds Upper Target  : num [1:904] NA NA NA NA NA NA NA NA NA NA ...
 $ Federal Funds Lower Target  : num [1:904] NA NA NA NA NA NA NA NA NA NA ...
 $ Effective Federal Funds Rate: num [1:904] 0.8 1.22 1.06 0.85 0.83 1.28 1.39 1.29 1.35 1.43 ...
 $ Real GDP (Percent Change)   : num [1:904] 4.6 NA NA 8 NA NA 11.9 NA NA 6.7 ...
 $ Unemployment Rate           : num [1:904] 5.8 6 6.1 5.7 5.3 5 4.9 4.7 4.6 4.7 ...
 $ Inflation Rate              : num [1:904] NA NA NA NA NA NA NA NA NA NA ...
 - attr(*, "spec")=
  .. cols(
  ..   Year = col_double(),
  ..   Month = col_double(),
  ..   Day = col_double(),
  ..   `Federal Funds Target Rate` = col_double(),
  ..   `Federal Funds Upper Target` = col_double(),
  ..   `Federal Funds Lower Target` = col_double(),
  ..   `Effective Federal Funds Rate` = col_double(),
  ..   `Real GDP (Percent Change)` = col_double(),
  ..   `Unemployment Rate` = col_double(),
  ..   `Inflation Rate` = col_double()
  .. )
 - attr(*, "problems")=<externalptr>

Code

dim(FedFundsRate)

[1] 904  10

Code

library(skimr)
skim(FedFundsRate)

Data summary
Name	FedFundsRate
Number of rows	904
Number of columns	10
_______________________
Column type frequency:
numeric	10
________________________
Group variables	None

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
Year	0	1.00	1986.68	17.17	1954.00	1973.00	1987.50	2001.00	2017.00	▅▆▇▇▆
Month	0	1.00	6.60	3.47	1.00	4.00	7.00	10.00	12.00	▇▅▅▅▇
Day	0	1.00	3.60	6.79	1.00	1.00	1.00	1.00	31.00	▇▁▁▁▁
Federal Funds Target Rate	442	0.51	5.66	2.55	1.00	3.75	5.50	7.75	11.50	▅▅▇▅▂
Federal Funds Upper Target	801	0.11	0.31	0.14	0.25	0.25	0.25	0.25	1.00	▇▁▁▁▁
Federal Funds Lower Target	801	0.11	0.06	0.14	0.00	0.00	0.00	0.00	0.75	▇▁▁▁▁
Effective Federal Funds Rate	152	0.83	4.91	3.61	0.07	2.43	4.70	6.58	19.10	▇▇▃▁▁
Real GDP (Percent Change)	654	0.28	3.14	3.60	-10.00	1.40	3.10	4.88	16.50	▁▂▇▂▁
Unemployment Rate	152	0.83	5.98	1.57	3.40	4.90	5.70	7.00	10.80	▅▇▅▁▁
Inflation Rate	194	0.79	3.73	2.57	0.60	2.00	2.80	4.70	13.60	▇▃▁▁▁

Code

head(FedFundsRate)

# A tibble: 6 × 10
   Year Month   Day Federal Fu…¹ Feder…² Feder…³ Effec…⁴ Real …⁵ Unemp…⁶ Infla…⁷
  <dbl> <dbl> <dbl>        <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1  1954     7     1           NA      NA      NA    0.8      4.6     5.8      NA
2  1954     8     1           NA      NA      NA    1.22    NA       6        NA
3  1954     9     1           NA      NA      NA    1.06    NA       6.1      NA
4  1954    10     1           NA      NA      NA    0.85     8       5.7      NA
5  1954    11     1           NA      NA      NA    0.83    NA       5.3      NA
6  1954    12     1           NA      NA      NA    1.28    NA       5        NA
# … with abbreviated variable names ¹`Federal Funds Target Rate`,
#   ²`Federal Funds Upper Target`, ³`Federal Funds Lower Target`,
#   ⁴`Effective Federal Funds Rate`, ⁵`Real GDP (Percent Change)`,
#   ⁶`Unemployment Rate`, ⁷`Inflation Rate`

Description of the data

The data consists of the target interest rates as well as details of the GDP, unemployment rate and inflation rate between 1954-2017. It has 10 variables and 904 observations. However, a few variables such as the Federal Funds Upper Target, Federal Funds Lower Target and Real GDP, have more than half of the total observations missing.

Tidy Data

The data is not already tidy. The observations are present as columns and values are in multiple columns. Moreover, the date is present in three columns and the three need to be recoded into one variable.

There are 904 rows(n), 10 variables(k) and 7 variables will be used to identify a case. However, since the date variables will be consolidated into one variable, we can consider there to be 8 variables while conducting the sanity check. The expected number of rows are 6328 and expected number of columns are 3.

Code

rows=(904)*(8-1)
print("Expected number of rows:")

[1] "Expected number of rows:"

Code

rows

[1] 6328

Code

col=(8-7)+2
print("Expected number of columns:")

[1] "Expected number of columns:"

Code

col

[1] 3

Identification of variables that need to be mutated

The time variables are not correctly coded as dates, hence a new column with the proper date format was created. Moreover, the remaining columns with the data about the federal funds target rates, unemployment rates, inflation rates and GDP were pivoted to make a taller dataset. All of the remaining columns were added as one case since they represented different types of rates and their values were stored in a separate column. It was found that the dimensions are 6328*3 which was the same found during the sanity check. A glimpse of the rearranged dataset has been provided.

Code

library(lubridate)
library(stringr)
library(tidyverse)
FedFundsRate$Date <- str_c(FedFundsRate$Year,"-",FedFundsRate$Month,"-",FedFundsRate$Day)%>%ymd()%>%as.Date()
FedFundsRate=subset(FedFundsRate,select=-c(1,2,3))
FedFundsRate<-FedFundsRate%>%select(Date,everything())
FedFundsRate<-pivot_longer(FedFundsRate, 2:8, names_to = "Rates", values_to = "Value")
dim(FedFundsRate)

[1] 6328    3

Code

head(FedFundsRate)

# A tibble: 6 × 3
  Date       Rates                        Value
  <date>     <chr>                        <dbl>
1 1954-07-01 Federal Funds Target Rate     NA  
2 1954-07-01 Federal Funds Upper Target    NA  
3 1954-07-01 Federal Funds Lower Target    NA  
4 1954-07-01 Effective Federal Funds Rate   0.8
5 1954-07-01 Real GDP (Percent Change)      4.6
6 1954-07-01 Unemployment Rate              5.8