challenge_4
debt_in_trillions
Author

Mani Shanker Kamarapu

Published

August 18, 2022

Code
library(tidyverse)
library(readxl)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Read in data

The data set is about the debt in trillions in different years and quarters and various types of loans. It gives the detailed description and data from 2003-2021 in four quarters.

Code
Debt <- read_excel("_data/debt_in_trillions.xlsx") %>%
  select(!Total)
Debt
# A tibble: 74 × 7
   `Year and Quarter` Mortgage `HE Revolving` `Auto Loan` Credit…¹ Stude…² Other
   <chr>                 <dbl>          <dbl>       <dbl>    <dbl>   <dbl> <dbl>
 1 03:Q1                  4.94          0.242       0.641    0.688   0.241 0.478
 2 03:Q2                  5.08          0.26        0.622    0.693   0.243 0.486
 3 03:Q3                  5.18          0.269       0.684    0.693   0.249 0.477
 4 03:Q4                  5.66          0.302       0.704    0.698   0.253 0.449
 5 04:Q1                  5.84          0.328       0.72     0.695   0.260 0.446
 6 04:Q2                  5.97          0.367       0.743    0.697   0.263 0.423
 7 04:Q3                  6.21          0.426       0.751    0.706   0.33  0.41 
 8 04:Q4                  6.36          0.468       0.728    0.717   0.346 0.423
 9 05:Q1                  6.51          0.502       0.725    0.71    0.364 0.394
10 05:Q2                  6.70          0.528       0.774    0.717   0.374 0.402
# … with 64 more rows, and abbreviated variable names ¹​`Credit Card`,
#   ²​`Student Loan`
# ℹ Use `print(n = ...)` to see more rows

Briefly describe the data

Code
dim(Debt)
[1] 74  7

This will give us the dimensions of the data set and which is 74 rows and 7 columns.

Code
summary(Debt)
 Year and Quarter      Mortgage       HE Revolving      Auto Loan     
 Length:74          Min.   : 4.942   Min.   :0.2420   Min.   :0.6220  
 Class :character   1st Qu.: 8.036   1st Qu.:0.4275   1st Qu.:0.7430  
 Mode  :character   Median : 8.412   Median :0.5165   Median :0.8145  
                    Mean   : 8.274   Mean   :0.5161   Mean   :0.9309  
                    3rd Qu.: 9.047   3rd Qu.:0.6172   3rd Qu.:1.1515  
                    Max.   :10.442   Max.   :0.7140   Max.   :1.4150  
  Credit Card      Student Loan        Other       
 Min.   :0.6590   Min.   :0.2407   Min.   :0.2960  
 1st Qu.:0.6966   1st Qu.:0.5333   1st Qu.:0.3414  
 Median :0.7375   Median :0.9088   Median :0.3921  
 Mean   :0.7565   Mean   :0.9189   Mean   :0.3831  
 3rd Qu.:0.8165   3rd Qu.:1.3022   3rd Qu.:0.4154  
 Max.   :0.9270   Max.   :1.5840   Max.   :0.4860  

The above observation helps us with the mean values of the debts and we can get an overall grasp of the situation through years. And further we can also draw a plot representing the mean values and type of debt.

Tidy Data (as needed)

As you can observe that the data set is still untidy so we use two important functions to make it tidy, those are pivot longer and pivot wider. And also first I will use separate function to split year an quarter into two separate columns. Now we use pivot_longer() to convert types of debt variables into single column so we can have each observation for each row. And at last convert quarter rows into columns so we can have each variable for each column.

Code
Debt <- Debt %>%
  separate("Year and Quarter", into = c("Year", "Quarter"), ":") %>%
  pivot_longer(Mortgage:Other, names_to = "Type of loan", values_to = "Amount") %>%
  pivot_wider(names_from = "Quarter", values_from = "Amount")
Debt
# A tibble: 114 × 6
   Year  `Type of loan`    Q1    Q2    Q3    Q4
   <chr> <chr>          <dbl> <dbl> <dbl> <dbl>
 1 03    Mortgage       4.94  5.08  5.18  5.66 
 2 03    HE Revolving   0.242 0.26  0.269 0.302
 3 03    Auto Loan      0.641 0.622 0.684 0.704
 4 03    Credit Card    0.688 0.693 0.693 0.698
 5 03    Student Loan   0.241 0.243 0.249 0.253
 6 03    Other          0.478 0.486 0.477 0.449
 7 04    Mortgage       5.84  5.97  6.21  6.36 
 8 04    HE Revolving   0.328 0.367 0.426 0.468
 9 04    Auto Loan      0.72  0.743 0.751 0.728
10 04    Credit Card    0.695 0.697 0.706 0.717
# … with 104 more rows
# ℹ Use `print(n = ...)` to see more rows
Code
dim(Debt)
[1] 114   6
Code
summary(Debt)
     Year           Type of loan             Q1                Q2         
 Length:114         Length:114         Min.   : 0.2407   Min.   : 0.2429  
 Class :character   Class :character   1st Qu.: 0.4614   1st Qu.: 0.4585  
 Mode  :character   Mode  :character   Median : 0.7170   Median : 0.7340  
                                       Mean   : 1.9518   Mean   : 1.9661  
                                       3rd Qu.: 1.2190   3rd Qu.: 1.2260  
                                       Max.   :10.1600   Max.   :10.4420  
                                                                          
       Q3               Q4         
 Min.   :0.2488   Min.   : 0.2529  
 1st Qu.:0.4660   1st Qu.: 0.4718  
 Median :0.7316   Median : 0.7313  
 Mean   :1.9571   Mean   : 1.9783  
 3rd Qu.:1.2055   3rd Qu.: 1.2237  
 Max.   :9.8610   Max.   :10.0430  
 NA's   :6        NA's   :6        

The final dimensions of the tidy data set are 114 rows and 6 columns.