---
title: "Challenge 4"
author: "Janani Natarajan"
description: "More data wrangling: pivoting"
date: "05/08/23"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
- challenge_4
- abc_poll
- eggs
- fed_rates
- hotel_bookings
- debt
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Read in data
Read in one (or more) of the following datasets, using the correct R package and command.
- abc_poll.csv ⭐
- poultry_tidy.xlsx or organiceggpoultry.xls⭐⭐
- FedFundsRate.csv⭐⭐⭐
- hotel_bookings.csv⭐⭐⭐⭐
- debt_in_trillions.xlsx ⭐⭐⭐⭐⭐
```{r}
debt_t <- readxl::read_excel("_data/debt_in_trillions.xlsx", sheet="Sheet1")
table(select(debt_t, `Year and Quarter`))
```
### Briefly describe the data
The sheet is in wide format: one row per quarter, labelled from `03:Q1` through `21:Q2` in the `Year and Quarter` column (74 quarters, each appearing exactly once), and one column per type of debt. The chunks below give the column count, the row count, and the number of rows to expect once the debt columns are pivoted into long format.
```{r}
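# view() only opens an interactive viewer and shows nothing in the rendered page;
# glimpse() prints the column names and types instead
glimpse(debt_t)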
ncol(debt_t)
```
```{r}
nrow(debt_t)
```
```{r}
nrow(debt_t) * (ncol(debt_t)-1)
```
```{r}
debt_t2 <- debt_t %>%
  # Stack every debt column into (debt_type, debt_value) pairs, keeping the quarter label
  pivot_longer(cols = -`Year and Quarter`, names_to = "debt_type", values_to = "debt_value")
```
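The product `nrow(debt_t) * (ncol(debt_t) - 1)` computed earlier is the row count to expect after pivoting, because every non-key column contributes one row per quarter. Here is a minimal sketch of that check on a toy table; the tibble `toy`, its column names `Mortgage` and `Other`, and its values are made up for illustration and are not taken from the spreadsheet.

```r
library(tibble)
library(tidyr)

# Hypothetical stand-in for debt_t: 3 quarters, 2 debt columns (illustrative values)
toy <- tibble(
  `Year and Quarter` = c("03:Q1", "03:Q2", "03:Q3"),
  Mortgage = c(4.9, 5.1, 5.2),
  Other = c(0.5, 0.5, 0.6)
)

# One row per (quarter, debt column) pair is expected after pivoting
expected_rows <- nrow(toy) * (ncol(toy) - 1)

toy_long <- pivot_longer(
  toy,
  cols = -`Year and Quarter`,
  names_to = "debt_type",
  values_to = "debt_value"
)

# Stops with an error if the pivot did not produce the expected shape
stopifnot(nrow(toy_long) == expected_rows, ncol(toy_long) == 3)
```

The same `stopifnot()` line, with `debt_t` and `debt_t2` in place of the toy objects, would serve as a sanity check on the real data.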
## Identify variables that need to be mutated
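The `Year and Quarter` column needs to be mutated. It is stored as text such as `03:Q1`, even though it denotes a point in time, so it cannot be used directly for date arithmetic or time-series plots. The chunk below splits each label into a four-digit year and a quarter, parses that pair as a date, and keeps the result as a numeric `year.quarter` value (for example `2003.1`).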
```{r}
library(lubridate)
debt_t3 <- debt_t2 %>%
  mutate(
    # "03:Q1" -> year "2003" (assumes 20xx) and quarter "1"
    year = str_c("20", str_sub(`Year and Quarter`, 1, 2)),
    quarter = str_sub(`Year and Quarter`, 5, 5),
    # yq() parses strings such as "2003.1" to the first day of that quarter
    year_and_quarter = quarter(yq(str_c(year, ".", quarter)), with_year = TRUE)
  ) %>%
  select(-c(`Year and Quarter`, year, quarter)) %>%
  relocate(debt_type, debt_value, year_and_quarter) %>%
  mutate(debt_value = str_remove(as.character(debt_value), "\\.0+$")) # escaped dot: literal ".0" only
```
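The parsing step can be tried in isolation. The sketch below is one way to do it, assuming labels shaped like `03:Q1` and years that all fall in the 2000s; the vector `labels` and the intermediate names are illustrative only. It relies on lubridate's `yq()`, which reads a year followed by a quarter and returns the first day of that quarter.

```r
library(stringr)
library(lubridate)

# Illustrative labels in the same shape as the `Year and Quarter` column
labels <- c("03:Q1", "09:Q4", "21:Q2")

year <- str_c("20", str_sub(labels, 1, 2))  # assumption: every year is 20xx
qtr  <- str_sub(labels, 5, 5)               # the digit after "Q"

# Date of the first day of each quarter
quarter_start <- yq(str_c(year, ".", qtr))

# Numeric year.quarter form, as stored in debt_t3
year_quarter <- quarter(quarter_start, with_year = TRUE)

print(quarter_start)
print(year_quarter)
```

Keeping `quarter_start` as an extra column would give a true Date for plotting, while `year_quarter` is the compact numeric form.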
```{r}
head(debt_t3)
```
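The pattern passed to `str_remove()` in the last `mutate()` deserves care: in a regular expression an unescaped `.` matches any character, so a pattern meant to strip a trailing `.0` can also eat digits. A small sketch with made-up strings (not values from the dataset) shows the difference between the unescaped and escaped forms.

```r
library(stringr)

# Made-up strings, chosen to expose the difference
values <- c("10", "2.0", "0.70", "4.94")

# Unescaped dot: removes ANY character followed by trailing zeros
loose <- str_remove(values, ".0+$")

# Escaped dot: removes only a literal "." followed by trailing zeros
strict <- str_remove(values, "\\.0+$")

print(loose)
print(strict)
```

With the unescaped pattern, `"10"` collapses to an empty string and `"0.70"` becomes `"0."`; the escaped pattern leaves both untouched and only turns `"2.0"` into `"2"`.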