Challenge 4 - FedFundsRate Dataset

challenge_4

fed_rates

Megan Galarneau

More data wrangling: pivoting

Author

Megan Galarneau

Published

March 22, 2023

Code

library(tidyverse)
library(dplyr)
library(lubridate)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Challenge Overview

Today’s challenge is to:

Read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
Tidy data (as needed, including sanity checks)
Identify variables that need to be mutated
Mutate variables and sanity check all mutations

Read in data

Code

#simply read in the data (untouched)
library(readr)
raw_FedFundsRate <- read_csv("_data/FedFundsRate.csv")
raw_FedFundsRate

Briefly describe the data

This data set is reporting on federal fund rates as it relates to the real GDP (percent change), unemployment rate, and inflation rate from July 1954 to March 2017. According to Investopedia, the federal funds rate is “an interest rate set by the Federal Reserve to indirectly manage interest rates, inflation, and unemployment”. In this data set, we examine four types: target, upper target, lower target, and effective rate.

Code

#summary of data set statistics
print(summarytools::dfSummary(raw_FedFundsRate,
                        varnumbers = FALSE,
                        plain.ascii  = FALSE, 
                        style        = "grid", 
                        graph.magnif = 0.70, 
                        valid.col    = FALSE),
      method = 'render',
      table.classes = 'table-condensed')

Data Frame Summary

raw_FedFundsRate

Dimensions: 904 x 10
Duplicates: 0

Year [numeric]

Mean (sd) : 1986.7 (17.2)
min ≤ med ≤ max:
1954 ≤ 1987.5 ≤ 2017
IQR (CV) : 28 (0)

64 distinct values

0 (0.0%)

Month [numeric]

Mean (sd) : 6.6 (3.5)
min ≤ med ≤ max:
1 ≤ 7 ≤ 12
IQR (CV) : 6 (0.5)

12 distinct values

0 (0.0%)

Day [numeric]

Mean (sd) : 3.6 (6.8)
min ≤ med ≤ max:
1 ≤ 1 ≤ 31
IQR (CV) : 0 (1.9)

29 distinct values

0 (0.0%)

Federal Funds Target Rate [numeric]

Mean (sd) : 5.7 (2.6)
min ≤ med ≤ max:
1 ≤ 5.5 ≤ 11.5
IQR (CV) : 4 (0.5)

63 distinct values

442 (48.9%)

Federal Funds Upper Target [numeric]

Mean (sd) : 0.3 (0.1)
min ≤ med ≤ max:
0.2 ≤ 0.2 ≤ 1
IQR (CV) : 0 (0.5)

4 distinct values

801 (88.6%)

Federal Funds Lower Target [numeric]

Mean (sd) : 0.1 (0.1)
min ≤ med ≤ max:
0 ≤ 0 ≤ 0.8
IQR (CV) : 0 (2.4)

4 distinct values

801 (88.6%)

Effective Federal Funds Rate [numeric]

Mean (sd) : 4.9 (3.6)
min ≤ med ≤ max:
0.1 ≤ 4.7 ≤ 19.1
IQR (CV) : 4.2 (0.7)

466 distinct values

152 (16.8%)

Real GDP (Percent Change) [numeric]

Mean (sd) : 3.1 (3.6)
min ≤ med ≤ max:
-10 ≤ 3.1 ≤ 16.5
IQR (CV) : 3.5 (1.1)

113 distinct values

654 (72.3%)

Unemployment Rate [numeric]

Mean (sd) : 6 (1.6)
min ≤ med ≤ max:
3.4 ≤ 5.7 ≤ 10.8
IQR (CV) : 2.1 (0.3)

71 distinct values

152 (16.8%)

Inflation Rate [numeric]

Mean (sd) : 3.7 (2.6)
min ≤ med ≤ max:
0.6 ≤ 2.8 ≤ 13.6
IQR (CV) : 2.7 (0.7)

106 distinct values

194 (21.5%)

Generated by summarytools 1.0.1 (R version 4.2.2)
2023-03-22

Tidy Data

Noticeably, there are many missing values across the federal fund rates and macroeconomic columns. Perhaps this data was collected on a specific cadence.

To make this data set more comprehensive, I pivoted longer to have a column for federal fund types and macroeconomic conditions along with the respective values. I also programmed the tables to only show me rows with values so the reader is not sifting through missing data. For example: on July 1st, 1954, the effective federal funds rate was 0.8000, the real GDP was 4.6 and the unemployment rate was 5.8.

As a note, if I were to visualize this data as a ggplot, I would condense the federal fund rate and macroeconomic conditions as two columns, type and value. This way I could see all seven rates over time.

Code

#pivot longer of the federal fund rate columns, removed NA values
tidy_FedFundsRate <-pivot_longer(raw_FedFundsRate, col = c("Federal Funds Target Rate", "Federal Funds Upper Target", "Federal Funds Lower Target", "Effective Federal Funds Rate"),
                 names_to="Federal Fund Type",
                 values_to = "Federal Fund Rate",
                 values_drop_na = TRUE)

Code

#pivot longer of the federal fund rate columns, removed NA values. For this section, I'd rather not have pivoted longer as it creates duplicate federal fund rate rows, but I didn't want all the NA values regardless.
tidy_FedFundsRate <-pivot_longer(tidy_FedFundsRate, col = c("Real GDP (Percent Change)", "Unemployment Rate", "Inflation Rate"),
                 names_to="Macroeconomic Condition",
                 values_to = "Macroeconomic Rate",
                 values_drop_na = TRUE)

tidy_FedFundsRate

Identify & mutate variables

There are three variables that require mutation: year, month, and day. These variables can be mutated into one easily readable “date” column. Upon creating this new column, I removed the old variables and moved it to first position.

This data set of federal fund rates related to macroeconomic conditions from 1954-2017 is now compiled into a comprehensive table for analysis.

Code

tidy_FedFundsRate<-tidy_FedFundsRate%>%
  mutate(date = str_c(Year, Month, Day, sep="-"),
         date = ymd(date)) %>%
  select(-c(Year, Month, Day)) %>%
  relocate(date)

tidy_FedFundsRate