Code
library(tidyverse)
library(stringr)
library(skimr)
::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE) knitr
Ananya Pujary
August 18, 2022
I’ll be reading in the ‘FedFundsRate.csv’ dataset.
Generating an overview of the data.
Name | fed_funds_rate |
Number of rows | 904 |
Number of columns | 10 |
_______________________ | |
Column type frequency: | |
numeric | 10 |
________________________ | |
Group variables | None |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
Year | 0 | 1.00 | 1986.68 | 17.17 | 1954.00 | 1973.00 | 1987.50 | 2001.00 | 2017.00 | ▅▆▇▇▆ |
Month | 0 | 1.00 | 6.60 | 3.47 | 1.00 | 4.00 | 7.00 | 10.00 | 12.00 | ▇▅▅▅▇ |
Day | 0 | 1.00 | 3.60 | 6.79 | 1.00 | 1.00 | 1.00 | 1.00 | 31.00 | ▇▁▁▁▁ |
Federal Funds Target Rate | 442 | 0.51 | 5.66 | 2.55 | 1.00 | 3.75 | 5.50 | 7.75 | 11.50 | ▅▅▇▅▂ |
Federal Funds Upper Target | 801 | 0.11 | 0.31 | 0.14 | 0.25 | 0.25 | 0.25 | 0.25 | 1.00 | ▇▁▁▁▁ |
Federal Funds Lower Target | 801 | 0.11 | 0.06 | 0.14 | 0.00 | 0.00 | 0.00 | 0.00 | 0.75 | ▇▁▁▁▁ |
Effective Federal Funds Rate | 152 | 0.83 | 4.91 | 3.61 | 0.07 | 2.43 | 4.70 | 6.58 | 19.10 | ▇▇▃▁▁ |
Real GDP (Percent Change) | 654 | 0.28 | 3.14 | 3.60 | -10.00 | 1.40 | 3.10 | 4.88 | 16.50 | ▁▂▇▂▁ |
Unemployment Rate | 152 | 0.83 | 5.98 | 1.57 | 3.40 | 4.90 | 5.70 | 7.00 | 10.80 | ▅▇▅▁▁ |
Inflation Rate | 194 | 0.79 | 3.73 | 2.57 | 0.60 | 2.00 | 2.80 | 4.70 | 13.60 | ▇▃▁▁▁ |
There are 904 rows and 10 columns in this dataset, all of which are numeric. It includes information on federal fund targets, unemployment rate, real GDP, and inflation rate between the years of 1954 and 2017. There seem to be a lot of missing values.
The ‘Federal Funds Target Rate’ varies between 1 and 11.5 percent, and the ‘Effective Federal Funds Target Rate’ varies between 0.07 and 19.1 percent. ‘Real GDP (Percent Change)’ lies between -10 and 16.5 percent, ‘Unemployment Rate’ between 3.4 and 10.8 percent, and ‘Inflation Rate’ between 0.6 and 13.6 percent.
I think that converting the missing values to a numeric value like ‘0.0’ would hold some weight and not make much sense, since some of the existing column values are ‘0.0’. The columns ‘Year’, ‘Month’, and ‘Day’ can be combined to give a comprehensive date for each row.
chr [1:904] "1954-7-1" "1954-8-1" "1954-9-1" "1954-10-1" "1954-11-1" ...
Date[1:904], format: "1954-07-01" "1954-08-01" "1954-09-01" "1954-10-01" "1954-11-01" ...
After combining the three columns (‘Year’,‘Month’,‘Day’) into a new column ‘Date’, its data type was still ‘character’. I used the as.Date() function to convert the values to the date type and stored them in a new column, ‘Dates’. I removed the ‘Year’,‘Month’,‘Day’, and (old) ‘Date’ columns as well. I’m also reordering the columns such that ‘Dates’ is the first one.
Besides this, I think that the dataset is tidy enough to work with.
---
title: "Challenge 4"
author: "Ananya Pujary"
description: "More data wrangling: pivoting"
date: "08/18/2022"
format:
html:
toc: true
code-fold: true
code-copy: true
code-tools: true
categories:
- challenge_4
- fedfundsrate
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
library(stringr)
library(skimr)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Read in data
I'll be reading in the 'FedFundsRate.csv' dataset.
```{r}
#| label: reading in the data
fed_funds_rate<-read_csv("_data/FedFundsRate.csv",
show_col_types = FALSE)
```
### Briefly describe the data
Generating an overview of the data.
```{r}
#| label: data description
skim(fed_funds_rate)
```
There are 904 rows and 10 columns in this dataset, all of which are numeric. It includes information on federal fund targets, unemployment rate, real GDP, and inflation rate between the years of 1954 and 2017. There seem to be a lot of missing values.
The 'Federal Funds Target Rate' varies between 1 and 11.5 percent, and the 'Effective Federal Funds Target Rate' varies between 0.07 and 19.1 percent. 'Real GDP (Percent Change)' lies between -10 and 16.5 percent, 'Unemployment Rate' between 3.4 and 10.8 percent, and 'Inflation Rate' between 0.6 and 13.6 percent.
## Tidy Data and Mutate Variables
I think that converting the missing values to a numeric value like '0.0' would hold some weight and not make much sense, since some of the existing column values are '0.0'. The columns 'Year', 'Month', and 'Day' can be combined to give a comprehensive date for each row.
```{r}
#| label: tidying the data 1
fed_funds_rate$Date <- str_c(fed_funds_rate$Year,"-", fed_funds_rate$Month,"-",fed_funds_rate$Day)
str(fed_funds_rate$Date) # the data type is character
Dates <- as.Date(fed_funds_rate$Date)
fed_funds_rate$Dates <- as.Date(fed_funds_rate$Date, format="%Y-%m-%d")
str(fed_funds_rate$Dates) # the data type is date
fed_funds_rate_final <- fed_funds_rate %>%
select(-Date, -Year, -Month, -Day)
```
After combining the three columns ('Year','Month','Day') into a new column 'Date', its data type was still 'character'. I used the as.Date() function to convert the values to the date type and stored them in a new column, 'Dates'. I removed the 'Year','Month','Day', and (old) 'Date' columns as well. I'm also reordering the columns such that 'Dates' is the first one.
```{r}
#| label: tidying the data 2
fed_funds_rate_final <- fed_funds_rate_final[, c(8,1,2,3,4,5,6,7)]
```
Besides this, I think that the dataset is tidy enough to work with.