library(tidyverse)
library(lubridate)
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
Jack Sniezek
December 5, 2022
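Today's challenge is to:
1) read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
2) tidy data (as needed, including sanity checks)
3) identify variables that need to be mutated
4) mutate variables and sanity check all mutations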
# A tibble: 904 × 10
Year Month Day Federal F…¹ Feder…² Feder…³ Effec…⁴ Real …⁵ Unemp…⁶ Infla…⁷
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1954 7 1 NA NA NA 0.8 4.6 5.8 NA
2 1954 8 1 NA NA NA 1.22 NA 6 NA
3 1954 9 1 NA NA NA 1.06 NA 6.1 NA
4 1954 10 1 NA NA NA 0.85 8 5.7 NA
5 1954 11 1 NA NA NA 0.83 NA 5.3 NA
6 1954 12 1 NA NA NA 1.28 NA 5 NA
7 1955 1 1 NA NA NA 1.39 11.9 4.9 NA
8 1955 2 1 NA NA NA 1.29 NA 4.7 NA
9 1955 3 1 NA NA NA 1.35 NA 4.6 NA
10 1955 4 1 NA NA NA 1.43 6.7 4.7 NA
# … with 894 more rows, and abbreviated variable names
# ¹`Federal Funds Target Rate`, ²`Federal Funds Upper Target`,
# ³`Federal Funds Lower Target`, ⁴`Effective Federal Funds Rate`,
# ⁵`Real GDP (Percent Change)`, ⁶`Unemployment Rate`, ⁷`Inflation Rate`
Year Month Day Federal Funds Target Rate
Min. :1954 Min. : 1.000 Min. : 1.000 Min. : 1.000
1st Qu.:1973 1st Qu.: 4.000 1st Qu.: 1.000 1st Qu.: 3.750
Median :1988 Median : 7.000 Median : 1.000 Median : 5.500
Mean :1987 Mean : 6.598 Mean : 3.598 Mean : 5.658
3rd Qu.:2001 3rd Qu.:10.000 3rd Qu.: 1.000 3rd Qu.: 7.750
Max. :2017 Max. :12.000 Max. :31.000 Max. :11.500
NA's :442
Federal Funds Upper Target Federal Funds Lower Target
Min. :0.2500 Min. :0.0000
1st Qu.:0.2500 1st Qu.:0.0000
Median :0.2500 Median :0.0000
Mean :0.3083 Mean :0.0583
3rd Qu.:0.2500 3rd Qu.:0.0000
Max. :1.0000 Max. :0.7500
NA's :801 NA's :801
Effective Federal Funds Rate Real GDP (Percent Change) Unemployment Rate
Min. : 0.070 Min. :-10.000 Min. : 3.400
1st Qu.: 2.428 1st Qu.: 1.400 1st Qu.: 4.900
Median : 4.700 Median : 3.100 Median : 5.700
Mean : 4.911 Mean : 3.138 Mean : 5.979
3rd Qu.: 6.580 3rd Qu.: 4.875 3rd Qu.: 7.000
Max. :19.100 Max. : 16.500 Max. :10.800
NA's :152 NA's :654 NA's :152
Inflation Rate
Min. : 0.600
1st Qu.: 2.000
Median : 2.800
Mean : 3.733
3rd Qu.: 4.700
Max. :13.600
NA's :194
The Federal Funds Rate dataset has 904 rows and 10 columns: year, month, and day; four federal funds rate columns (target, upper target, lower target, and effective); real GDP percent change; unemployment rate; and inflation rate. The observations run from July 1954 into 2017. Much of the data is missing, but most of the gaps have a clear explanation. GDP is reported quarterly, so only the same four months each year (January, April, July, and October) contain GDP values. The single target federal funds rate was replaced by the upper and lower target rates beginning in 2009. Inflation was not recorded until 1958, and the target federal funds rate was not recorded until the end of 1982. Lastly, any date that is not the first of a month has no data for the effective federal funds rate, GDP, inflation rate, or unemployment rate.
My plan is to filter out the dates that do not fall on the first of the month, since those rows contain only the target federal funds rate and nothing else.
# A tibble: 753 × 10
Year Month Day Federal F…¹ Feder…² Feder…³ Effec…⁴ Real …⁵ Unemp…⁶ Infla…⁷
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1954 7 1 NA NA NA 0.8 4.6 5.8 NA
2 1954 8 1 NA NA NA 1.22 NA 6 NA
3 1954 9 1 NA NA NA 1.06 NA 6.1 NA
4 1954 10 1 NA NA NA 0.85 8 5.7 NA
5 1954 11 1 NA NA NA 0.83 NA 5.3 NA
6 1954 12 1 NA NA NA 1.28 NA 5 NA
7 1955 1 1 NA NA NA 1.39 11.9 4.9 NA
8 1955 2 1 NA NA NA 1.29 NA 4.7 NA
9 1955 3 1 NA NA NA 1.35 NA 4.6 NA
10 1955 4 1 NA NA NA 1.43 6.7 4.7 NA
# … with 743 more rows, and abbreviated variable names
# ¹`Federal Funds Target Rate`, ²`Federal Funds Upper Target`,
# ³`Federal Funds Lower Target`, ⁴`Effective Federal Funds Rate`,
# ⁵`Real GDP (Percent Change)`, ⁶`Unemployment Rate`, ⁷`Inflation Rate`
I will combine the Year, Month, and Day columns into a single Date column, which will make the data easier to visualize in a graph or table. I will also use the midpoint of the upper and lower target rates to fill in the missing values of the target federal funds rate, so that I can work with one target rate column instead of three. This should leave me with six columns: one date column and five rate columns.
# Combine Year, Month, and Day into a single Date column
fed_rates_new <- fed_rates_clean %>%
  mutate(Date = make_date(Year, Month, Day), .before = `Federal Funds Target Rate`) %>%
  # Where the single target rate is missing, use the midpoint of the upper and lower targets
  mutate(`Federal Funds Target Rate` = ifelse(
    is.na(`Federal Funds Target Rate`),
    (`Federal Funds Upper Target` + `Federal Funds Lower Target`) / 2,
    `Federal Funds Target Rate`
  )) %>%
  # Drop the columns that are now redundant
  select(-c("Year", "Month", "Day", contains("Upper"), contains("Lower")))
fed_rates_new
# A tibble: 753 × 6
Date `Federal Funds Target Rate` Effective Fe…¹ Real …² Unemp…³ Infla…⁴
<date> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1954-07-01 NA 0.8 4.6 5.8 NA
2 1954-08-01 NA 1.22 NA 6 NA
3 1954-09-01 NA 1.06 NA 6.1 NA
4 1954-10-01 NA 0.85 8 5.7 NA
5 1954-11-01 NA 0.83 NA 5.3 NA
6 1954-12-01 NA 1.28 NA 5 NA
7 1955-01-01 NA 1.39 11.9 4.9 NA
8 1955-02-01 NA 1.29 NA 4.7 NA
9 1955-03-01 NA 1.35 NA 4.6 NA
10 1955-04-01 NA 1.43 6.7 4.7 NA
# … with 743 more rows, and abbreviated variable names
# ¹`Effective Federal Funds Rate`, ²`Real GDP (Percent Change)`,
# ³`Unemployment Rate`, ⁴`Inflation Rate`
The result matches what I was trying to accomplish: 753 monthly rows and six columns, with a single Date column followed by the five rate columns.
---
title: "Challenge 4"
author: "Jack Sniezek"
description: "More data wrangling: pivoting"
date: "12/05/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
- challenge_4
- abc_poll
- eggs
- fed_rates
- hotel_bookings
- debt
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
library(lubridate)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Challenge Overview
Today's challenge is to:
1) read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
2) tidy data (as needed, including sanity checks)
3) identify variables that need to be mutated
4) mutate variables and sanity check all mutations
## Read in data
- FedFundsRate.csv⭐⭐⭐
```{r}
fed_rates_orig <- read_csv("_data/FedFundsRate.csv")
fed_rates_orig
summary(fed_rates_orig)
```
## Briefly describe the data
The Federal Funds Rate dataset has 904 rows and 10 columns: year, month, and day; four federal funds rate columns (target, upper target, lower target, and effective); real GDP percent change; unemployment rate; and inflation rate. The observations run from July 1954 into 2017. Much of the data is missing, but most of the gaps have a clear explanation. GDP is reported quarterly, so only the same four months each year (January, April, July, and October) contain GDP values. The single target federal funds rate was replaced by the upper and lower target rates beginning in 2009. Inflation was not recorded until 1958, and the target federal funds rate was not recorded until the end of 1982. Lastly, any date that is not the first of a month has no data for the effective federal funds rate, GDP, inflation rate, or unemployment rate.
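One way to back up these observations about missing data is to count the `NA` values per column and to compare first-of-month rows with the rest. The chunk below is only a sketch: `toy_rates` is a hypothetical stand-in with made-up values and a subset of the real columns, not the actual file.

```{r}
library(tidyverse)

# Hypothetical toy rows shaped like the real data (values are made up)
toy_rates <- tibble(
  Year  = c(1982, 1982, 1982, 1983),
  Month = c(10, 10, 11, 1),
  Day   = c(1, 7, 1, 1),
  `Federal Funds Target Rate`    = c(10, 9.5, 9.5, 8.5),
  `Effective Federal Funds Rate` = c(9.71, NA, 9.2, 8.68),
  `Real GDP (Percent Change)`    = c(0.4, NA, NA, 5.3)
)

# Count missing values in every column
na_counts <- toy_rates %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

# Count missing GDP values for first-of-month rows versus all other days
na_by_day <- toy_rates %>%
  group_by(first_of_month = Day == 1) %>%
  summarise(
    missing_gdp = sum(is.na(`Real GDP (Percent Change)`)),
    n = n()
  )

na_counts
na_by_day
```

Running the same two summaries on `fed_rates_orig` would show whether the gaps line up with the explanations given above.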
## Tidy Data (as needed)
My plan is to filter out the dates that do not fall on the first of the month, since those rows contain only the target federal funds rate and nothing else.
```{r}
fed_rates_clean <- filter(fed_rates_orig, `Day` == 1)
fed_rates_clean
```
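A sanity check for this filter could confirm that the dropped rows really carry no values in the monthly series. This is a minimal sketch on a hypothetical `toy_rates` tibble with made-up values; the same pattern should apply to `fed_rates_orig` with the full set of columns.

```{r}
library(tidyverse)

# Hypothetical toy rows (values are made up); two rows fall mid-month
toy_rates <- tibble(
  Day = c(1, 7, 1, 19),
  `Federal Funds Target Rate`    = c(10, 9.5, 9.5, 9),
  `Effective Federal Funds Rate` = c(9.71, NA, 9.2, NA),
  `Unemployment Rate`            = c(10.4, NA, 10.8, NA)
)

# Rows that the Day == 1 filter would discard
dropped <- filter(toy_rates, Day != 1)

# TRUE only if every dropped row is missing all of the monthly series
all_monthly_missing <- dropped %>%
  select(`Effective Federal Funds Rate`, `Unemployment Rate`) %>%
  is.na() %>%
  all()

all_monthly_missing
```

If the check returned `FALSE` on the real data, some mid-month rows would hold values that the filter throws away.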
## Identify variables that need to be mutated
I will combine the Year, Month, and Day columns into a single Date column, which will make the data easier to visualize in a graph or table. I will also use the midpoint of the upper and lower target rates to fill in the missing values of the target federal funds rate, so that I can work with one target rate column instead of three. This should leave me with six columns: one date column and five rate columns.
```{r}
# Combine Year, Month, and Day into a single Date column
fed_rates_new <- fed_rates_clean %>%
  mutate(Date = make_date(Year, Month, Day), .before = `Federal Funds Target Rate`) %>%
  # Where the single target rate is missing, use the midpoint of the upper and lower targets
  mutate(`Federal Funds Target Rate` = ifelse(
    is.na(`Federal Funds Target Rate`),
    (`Federal Funds Upper Target` + `Federal Funds Lower Target`) / 2,
    `Federal Funds Target Rate`
  )) %>%
  # Drop the columns that are now redundant
  select(-c("Year", "Month", "Day", contains("Upper"), contains("Lower")))
fed_rates_new
```
The result matches what I was trying to accomplish: 753 monthly rows and six columns, with a single Date column followed by the five rate columns.
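The mutations can also be checked programmatically instead of by eye. The sketch below uses a hypothetical two-row `toy_clean` tibble with made-up values, and it swaps in `coalesce()` as an alternative to the `ifelse()` call used above; both should fill a missing target with the midpoint of the upper and lower targets.

```{r}
library(tidyverse)
library(lubridate)

# Hypothetical rows: one with a single target rate, one with only a target range
toy_clean <- tibble(
  Year = c(2008, 2009), Month = c(1, 1), Day = c(1, 1),
  `Federal Funds Target Rate`  = c(4.25, NA),
  `Federal Funds Upper Target` = c(NA, 0.25),
  `Federal Funds Lower Target` = c(NA, 0)
)

toy_new <- toy_clean %>%
  mutate(
    Date = make_date(Year, Month, Day),
    # coalesce() keeps the existing target and falls back to the midpoint
    `Federal Funds Target Rate` = coalesce(
      `Federal Funds Target Rate`,
      (`Federal Funds Upper Target` + `Federal Funds Lower Target`) / 2
    )
  )

# Checks: every date parsed, existing targets untouched, gaps filled with the midpoint
stopifnot(
  !any(is.na(toy_new$Date)),
  toy_new$`Federal Funds Target Rate`[1] == 4.25,
  toy_new$`Federal Funds Target Rate`[2] == 0.125
)
```

`stopifnot()` is silent when every condition holds and stops with an error otherwise, so the chunk doubles as a lightweight test when the document is rendered.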