library(tidyverse)
library(readxl)
::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE) knitr
Challenge 8 Solutions
Challenge Overview
Today’s challenge is to:
- read in multiple data sets, and describe the data set using both words and any supporting information (e.g., tables, etc)
- tidy data (as needed, including sanity checks)
- mutate variables as needed (including sanity checks)
- join two or more data sets and analyze some aspect of the joined data
(be sure to only include the category tags for the data you use!)
Read in data
setwd("C:/Users/chion/OneDrive/Desktop/DACSS 601/DACSS_601_Summer2023_Sec1/posts")
library(readxl)
<- read_excel("_data/debt_in_trillions.xlsx")
debt_in_trillions
library(readr)
<- read_csv("_data/FedFundsRate.csv") FedFundsRate
Briefly describe the data
summary(FedFundsRate)
Year Month Day Federal Funds Target Rate
Min. :1954 Min. : 1.000 Min. : 1.000 Min. : 1.000
1st Qu.:1973 1st Qu.: 4.000 1st Qu.: 1.000 1st Qu.: 3.750
Median :1988 Median : 7.000 Median : 1.000 Median : 5.500
Mean :1987 Mean : 6.598 Mean : 3.598 Mean : 5.658
3rd Qu.:2001 3rd Qu.:10.000 3rd Qu.: 1.000 3rd Qu.: 7.750
Max. :2017 Max. :12.000 Max. :31.000 Max. :11.500
NA's :442
Federal Funds Upper Target Federal Funds Lower Target
Min. :0.2500 Min. :0.0000
1st Qu.:0.2500 1st Qu.:0.0000
Median :0.2500 Median :0.0000
Mean :0.3083 Mean :0.0583
3rd Qu.:0.2500 3rd Qu.:0.0000
Max. :1.0000 Max. :0.7500
NA's :801 NA's :801
Effective Federal Funds Rate Real GDP (Percent Change) Unemployment Rate
Min. : 0.070 Min. :-10.000 Min. : 3.400
1st Qu.: 2.428 1st Qu.: 1.400 1st Qu.: 4.900
Median : 4.700 Median : 3.100 Median : 5.700
Mean : 4.911 Mean : 3.138 Mean : 5.979
3rd Qu.: 6.580 3rd Qu.: 4.875 3rd Qu.: 7.000
Max. :19.100 Max. : 16.500 Max. :10.800
NA's :152 NA's :654 NA's :152
Inflation Rate
Min. : 0.600
1st Qu.: 2.000
Median : 2.800
Mean : 3.733
3rd Qu.: 4.700
Max. :13.600
NA's :194
summary(debt_in_trillions)
Year and Quarter Mortgage HE Revolving Auto Loan
Length:74 Min. : 4.942 Min. :0.2420 Min. :0.6220
Class :character 1st Qu.: 8.036 1st Qu.:0.4275 1st Qu.:0.7430
Mode :character Median : 8.412 Median :0.5165 Median :0.8145
Mean : 8.274 Mean :0.5161 Mean :0.9309
3rd Qu.: 9.047 3rd Qu.:0.6172 3rd Qu.:1.1515
Max. :10.442 Max. :0.7140 Max. :1.4150
Credit Card Student Loan Other Total
Min. :0.6590 Min. :0.2407 Min. :0.2960 Min. : 7.231
1st Qu.:0.6966 1st Qu.:0.5333 1st Qu.:0.3414 1st Qu.:11.311
Median :0.7375 Median :0.9088 Median :0.3921 Median :11.852
Mean :0.7565 Mean :0.9189 Mean :0.3831 Mean :11.779
3rd Qu.:0.8165 3rd Qu.:1.3022 3rd Qu.:0.4154 3rd Qu.:12.674
Max. :0.9270 Max. :1.5840 Max. :0.4860 Max. :14.957
##Description of data
There are 904 observations with 8 variables in the federal funds rate data set. The data covers the years 1954-2017 and pertains to federal funds rates. The federal funds target rate is recorded monthly for 1954-2008, while the upper and lower targets cover 2008-2017. There is also macroeconomic data for real GDP (percent change), unemployment and inflation. The real GDP data is recorded for the first and forth quarters, while the unemployment and inflation data is recorded quarterly. The mean of the federal funds target rate is higher than than the median at 5.66 and 5.5 respectively, likely indicating a skewed right distribution. The federal funds rate ranges from 1 to 11.5. In addition, as expected, the federal funds upper target rate and lower target rate also indicate a skewed right distribution, with a mean of .3 and median of .25 for upper targets and mean of .06 and median of 0 for lower targets. In fact, all of the macroeconomic data is also skewed right, as expected, with means higher than the medians (See table below).
There are 74 observations and 8 variables recorded quarterly in the debt in trillions data set. The debt data does indicate as uniform distribution patterns, with the variables showing both skewed left (Mortgage, HE, Other, Total) and skewed right distributions(Auto Loan, Credit Card, Student Loan (see table below).
To join the two data sets, I needed to convert the existing dates to calendar dates to use them as the joining variable.
Tidy Data (as needed)
<- debt_in_trillions
debt
#Convert quarters to dates
<- debt %>%
debt mutate(Date=parse_date_time(`Year and Quarter`, orders="yq"))
# to simplify, remove year and quarters
<- debt %>%
debt select(-contains("Year")) %>%
# move Date to front
select(Date, 1:last_col(0))
# view data
debt
# A tibble: 74 x 8
Date Mortgage `HE Revolving` `Auto Loan` `Credit Card`
<dttm> <dbl> <dbl> <dbl> <dbl>
1 2003-01-01 00:00:00 4.94 0.242 0.641 0.688
2 2003-04-01 00:00:00 5.08 0.26 0.622 0.693
3 2003-07-01 00:00:00 5.18 0.269 0.684 0.693
4 2003-10-01 00:00:00 5.66 0.302 0.704 0.698
5 2004-01-01 00:00:00 5.84 0.328 0.72 0.695
6 2004-04-01 00:00:00 5.97 0.367 0.743 0.697
7 2004-07-01 00:00:00 6.21 0.426 0.751 0.706
8 2004-10-01 00:00:00 6.36 0.468 0.728 0.717
9 2005-01-01 00:00:00 6.51 0.502 0.725 0.71
10 2005-04-01 00:00:00 6.70 0.528 0.774 0.717
# i 64 more rows
# i 3 more variables: `Student Loan` <dbl>, Other <dbl>, Total <dbl>
<-FedFundsRate
fed #create calendar date column out of original columns
<- paste(fed$Year, fed$Month, fed$Day)
fed_calendar_date # mutate new column, create new calendar date column
<- fed %>%
fed mutate(Date = as.Date(fed_calendar_date,format = "%Y %m %d"), .before=`Year`)
View()
Error in as.data.frame(x): argument "x" is missing, with no default
# filter out old date columns
<- fed[-c(2,3,4)]
fed fed
# A tibble: 904 x 8
Date `Federal Funds Target Rate` `Federal Funds Upper Target`
<date> <dbl> <dbl>
1 1954-07-01 NA NA
2 1954-08-01 NA NA
3 1954-09-01 NA NA
4 1954-10-01 NA NA
5 1954-11-01 NA NA
6 1954-12-01 NA NA
7 1955-01-01 NA NA
8 1955-02-01 NA NA
9 1955-03-01 NA NA
10 1955-04-01 NA NA
# i 894 more rows
# i 5 more variables: `Federal Funds Lower Target` <dbl>,
# `Effective Federal Funds Rate` <dbl>, `Real GDP (Percent Change)` <dbl>,
# `Unemployment Rate` <dbl>, `Inflation Rate` <dbl>
Join Data
<- fed %>%
joined right_join(debt, by=join_by(Date)) %>%
rename(Other_Debt=Other,Total_Debt=Total)
joined
# A tibble: 74 x 15
Date `Federal Funds Target Rate` `Federal Funds Upper Target`
<dttm> <dbl> <dbl>
1 2003-01-01 00:00:00 1.25 NA
2 2003-04-01 00:00:00 1.25 NA
3 2003-07-01 00:00:00 1 NA
4 2003-10-01 00:00:00 1 NA
5 2004-01-01 00:00:00 1 NA
6 2004-04-01 00:00:00 1 NA
7 2004-07-01 00:00:00 1.25 NA
8 2004-10-01 00:00:00 1.75 NA
9 2005-01-01 00:00:00 2.25 NA
10 2005-04-01 00:00:00 2.75 NA
# i 64 more rows
# i 12 more variables: `Federal Funds Lower Target` <dbl>,
# `Effective Federal Funds Rate` <dbl>, `Real GDP (Percent Change)` <dbl>,
# `Unemployment Rate` <dbl>, `Inflation Rate` <dbl>, Mortgage <dbl>,
# `HE Revolving` <dbl>, `Auto Loan` <dbl>, `Credit Card` <dbl>,
# `Student Loan` <dbl>, Other_Debt <dbl>, Total_Debt <dbl>
# filter date so data is present for both sets
<- joined %>%
joined filter(Date < '2017-01-01')
joined
# A tibble: 57 x 15
Date `Federal Funds Target Rate` `Federal Funds Upper Target`
<dttm> <dbl> <dbl>
1 2003-01-01 00:00:00 1.25 NA
2 2003-04-01 00:00:00 1.25 NA
3 2003-07-01 00:00:00 1 NA
4 2003-10-01 00:00:00 1 NA
5 2004-01-01 00:00:00 1 NA
6 2004-04-01 00:00:00 1 NA
7 2004-07-01 00:00:00 1.25 NA
8 2004-10-01 00:00:00 1.75 NA
9 2005-01-01 00:00:00 2.25 NA
10 2005-04-01 00:00:00 2.75 NA
# i 47 more rows
# i 12 more variables: `Federal Funds Lower Target` <dbl>,
# `Effective Federal Funds Rate` <dbl>, `Real GDP (Percent Change)` <dbl>,
# `Unemployment Rate` <dbl>, `Inflation Rate` <dbl>, Mortgage <dbl>,
# `HE Revolving` <dbl>, `Auto Loan` <dbl>, `Credit Card` <dbl>,
# `Student Loan` <dbl>, Other_Debt <dbl>, Total_Debt <dbl>
View(joined)