Code
library(tidyverse)
library(readxl)
::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE) knitr
Mani Shanker Kamarapu
August 18, 2022
The data set is about the debt in trillions in different years and quarters and various types of loans. It gives the detailed description and data from 2003-2021 in four quarters.
# A tibble: 74 × 7
`Year and Quarter` Mortgage `HE Revolving` `Auto Loan` Credit…¹ Stude…² Other
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 03:Q1 4.94 0.242 0.641 0.688 0.241 0.478
2 03:Q2 5.08 0.26 0.622 0.693 0.243 0.486
3 03:Q3 5.18 0.269 0.684 0.693 0.249 0.477
4 03:Q4 5.66 0.302 0.704 0.698 0.253 0.449
5 04:Q1 5.84 0.328 0.72 0.695 0.260 0.446
6 04:Q2 5.97 0.367 0.743 0.697 0.263 0.423
7 04:Q3 6.21 0.426 0.751 0.706 0.33 0.41
8 04:Q4 6.36 0.468 0.728 0.717 0.346 0.423
9 05:Q1 6.51 0.502 0.725 0.71 0.364 0.394
10 05:Q2 6.70 0.528 0.774 0.717 0.374 0.402
# … with 64 more rows, and abbreviated variable names ¹`Credit Card`,
# ²`Student Loan`
# ℹ Use `print(n = ...)` to see more rows
This will give us the dimensions of the data set and which is 74 rows and 7 columns.
Year and Quarter Mortgage HE Revolving Auto Loan
Length:74 Min. : 4.942 Min. :0.2420 Min. :0.6220
Class :character 1st Qu.: 8.036 1st Qu.:0.4275 1st Qu.:0.7430
Mode :character Median : 8.412 Median :0.5165 Median :0.8145
Mean : 8.274 Mean :0.5161 Mean :0.9309
3rd Qu.: 9.047 3rd Qu.:0.6172 3rd Qu.:1.1515
Max. :10.442 Max. :0.7140 Max. :1.4150
Credit Card Student Loan Other
Min. :0.6590 Min. :0.2407 Min. :0.2960
1st Qu.:0.6966 1st Qu.:0.5333 1st Qu.:0.3414
Median :0.7375 Median :0.9088 Median :0.3921
Mean :0.7565 Mean :0.9189 Mean :0.3831
3rd Qu.:0.8165 3rd Qu.:1.3022 3rd Qu.:0.4154
Max. :0.9270 Max. :1.5840 Max. :0.4860
The above observation helps us with the mean values of the debts and we can get an overall grasp of the situation through years. And further we can also draw a plot representing the mean values and type of debt.
As you can observe that the data set is still untidy so we use two important functions to make it tidy, those are pivot longer
and pivot wider
. And also first I will use separate
function to split year an quarter into two separate columns. Now we use pivot_longer() to convert types of debt variables into single column so we can have each observation for each row. And at last convert quarter rows into columns so we can have each variable for each column.
# A tibble: 114 × 6
Year `Type of loan` Q1 Q2 Q3 Q4
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 03 Mortgage 4.94 5.08 5.18 5.66
2 03 HE Revolving 0.242 0.26 0.269 0.302
3 03 Auto Loan 0.641 0.622 0.684 0.704
4 03 Credit Card 0.688 0.693 0.693 0.698
5 03 Student Loan 0.241 0.243 0.249 0.253
6 03 Other 0.478 0.486 0.477 0.449
7 04 Mortgage 5.84 5.97 6.21 6.36
8 04 HE Revolving 0.328 0.367 0.426 0.468
9 04 Auto Loan 0.72 0.743 0.751 0.728
10 04 Credit Card 0.695 0.697 0.706 0.717
# … with 104 more rows
# ℹ Use `print(n = ...)` to see more rows
[1] 114 6
Year Type of loan Q1 Q2
Length:114 Length:114 Min. : 0.2407 Min. : 0.2429
Class :character Class :character 1st Qu.: 0.4614 1st Qu.: 0.4585
Mode :character Mode :character Median : 0.7170 Median : 0.7340
Mean : 1.9518 Mean : 1.9661
3rd Qu.: 1.2190 3rd Qu.: 1.2260
Max. :10.1600 Max. :10.4420
Q3 Q4
Min. :0.2488 Min. : 0.2529
1st Qu.:0.4660 1st Qu.: 0.4718
Median :0.7316 Median : 0.7313
Mean :1.9571 Mean : 1.9783
3rd Qu.:1.2055 3rd Qu.: 1.2237
Max. :9.8610 Max. :10.0430
NA's :6 NA's :6
The final dimensions of the tidy data set are 114 rows and 6 columns.
---
title: "Challenge 4"
author: "Mani Shanker Kamarapu"
desription: "More data wrangling: pivoting"
date: "08/18/2022"
format:
html:
toc: true
code-fold: true
code-copy: true
code-tools: true
categories:
- challenge_4
- debt_in_trillions
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
library(readxl)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Read in data
The data set is about the debt in trillions in different years and quarters and various types of loans. It gives the detailed description and data from 2003-2021 in four quarters.
```{r}
Debt <- read_excel("_data/debt_in_trillions.xlsx") %>%
select(!Total)
Debt
```
### Briefly describe the data
```{r}
dim(Debt)
```
This will give us the dimensions of the data set and which is 74 rows and 7 columns.
```{r}
summary(Debt)
```
The above observation helps us with the mean values of the debts and we can get an overall grasp of the situation through years. And further we can also draw a plot representing the mean values and type of debt.
## Tidy Data (as needed)
As you can observe that the data set is still untidy so we use two important functions to make it tidy, those are `pivot longer` and `pivot wider`. And also first I will use `separate` function to split year an quarter into two separate columns. Now we use pivot_longer() to convert types of debt variables into single column so we can have each observation for each row. And at last convert quarter rows into columns so we can have each variable for each column.
```{r}
Debt <- Debt %>%
separate("Year and Quarter", into = c("Year", "Quarter"), ":") %>%
pivot_longer(Mortgage:Other, names_to = "Type of loan", values_to = "Amount") %>%
pivot_wider(names_from = "Quarter", values_from = "Amount")
Debt
```
```{r}
dim(Debt)
summary(Debt)
```
The final dimensions of the tidy data set are 114 rows and 6 columns.