2003-2021 Average Debt of Various Household Expenses

challenge_4
Michele Carlin
debt
Author

Michele Carlin

Published

March 20, 2023

Code
library(tidyverse)
library(summarytools)
library(dbplyr)
library(readxl)

knitr::opts_chunk$set(echo = TRUE)

Read in and view summary of ‘debt_in_trillions’ dataset.

Code
sheet_names <- excel_sheets("_data/debt_in_trillions.xlsx")
sheet_names 
[1] "Sheet1"
Code
debt_trillions <- read_xlsx ("_data/debt_in_trillions.xlsx")
View(debt_trillions)
view(dfSummary(debt_trillions))
Switching method to 'browser'
Output file written: C:\Users\CarlinML\AppData\Local\Temp\RtmpKOdhK1\file84481a057024.html

This dataset contains 74 rows and 8 columns. Each row represents a particular year/quarter and provides the average debt in trillions for various types of household expenses (e.g., mortgage, auto loan, credit card, etc.).

Pivot longer so that each row contains one observation.

Code
debt_trillions_long <- pivot_longer(debt_trillions, col = c(Mortgage, 'HE Revolving', 'Auto Loan', 'Credit Card', 'Student Loan', Other, Total),
                 names_to="debt_type",
                 values_to = "value")
View(debt_trillions_long)
view(dfSummary(debt_trillions_long))
Switching method to 'browser'
Output file written: C:\Users\CarlinML\AppData\Local\Temp\RtmpKOdhK1\file844863283ffd.html

Separate year and quarter into separate variables. Summarize average debt by type of debt.

Code
debt_trillions_long %>%
  separate("Year and Quarter", c("year", "quarter"), ":") %>%
 group_by(year, debt_type) %>%
  summarise(Avg_Debt = mean(value),
            Min_Debt = min(value),
            Max_Debt = max(value))
`summarise()` has grouped output by 'year'. You can override using the
`.groups` argument.
# A tibble: 133 × 5
# Groups:   year [19]
   year  debt_type    Avg_Debt Min_Debt Max_Debt
   <chr> <chr>           <dbl>    <dbl>    <dbl>
 1 03    Auto Loan       0.663    0.622    0.704
 2 03    Credit Card     0.693    0.688    0.698
 3 03    HE Revolving    0.268    0.242    0.302
 4 03    Mortgage        5.22     4.94     5.66 
 5 03    Other           0.472    0.449    0.486
 6 03    Student Loan    0.246    0.241    0.253
 7 03    Total           7.56     7.23     8.07 
 8 04    Auto Loan       0.736    0.72     0.751
 9 04    Credit Card     0.704    0.695    0.717
10 04    HE Revolving    0.397    0.328    0.468
# … with 123 more rows

Summarize total debt only by year.

Code
debt_trillions_long %>%
  separate("Year and Quarter", c("year", "quarter"), ":") %>%
  filter(debt_type == 'Total') %>%
  group_by(year, debt_type) %>%
  summarise(Avg_Debt = mean(value),
            Min_Debt = min(value),
            Max_Debt = max(value))
`summarise()` has grouped output by 'year'. You can override using the
`.groups` argument.
# A tibble: 19 × 5
# Groups:   year [19]
   year  debt_type Avg_Debt Min_Debt Max_Debt
   <chr> <chr>        <dbl>    <dbl>    <dbl>
 1 03    Total         7.56     7.23     8.07
 2 04    Total         8.66     8.29     9.04
 3 05    Total         9.62     9.21    10.0 
 4 06    Total        10.9     10.4     11.3 
 5 07    Total        12.0     11.5     12.4 
 6 08    Total        12.6     12.5     12.7 
 7 09    Total        12.3     12.2     12.5 
 8 10    Total        11.9     11.7     12.1 
 9 11    Total        11.7     11.5     11.8 
10 12    Total        11.4     11.3     11.4 
11 13    Total        11.3     11.2     11.5 
12 14    Total        11.7     11.6     11.8 
13 15    Total        12.0     11.9     12.1 
14 16    Total        12.4     12.3     12.6 
15 17    Total        12.9     12.7     13.1 
16 18    Total        13.4     13.2     13.5 
17 19    Total        13.9     13.7     14.1 
18 20    Total        14.4     14.3     14.6 
19 21    Total        14.8     14.6     15.0 

Read in and view summary of ‘abc polls’ dataset.

Code
abc_poll <- read_csv ("_data/abc_poll_2021.csv")
Rows: 527 Columns: 31
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (28): xspanish, complete_status, ppeduc5, ppeducat, ppgender, ppethm, pp...
dbl  (3): id, ppage, weights_pid

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Code
View(abc_poll)
view(dfSummary(abc_poll))
Switching method to 'browser'
Output file written: C:\Users\CarlinML\AppData\Local\Temp\RtmpKOdhK1\file84486fb97ab3.html

Remove ‘skipped’ answers from Q1_a variable.

Code
table(select(abc_poll, Q1_a))
Q1_a
   Approve Disapprove    Skipped 
       329        193          5 
Code
abc_poll <- abc_poll%>%
  mutate(Q1_a = str_remove(Q1_a, "Skipped"))

table(select(abc_poll, Q1_a))
Q1_a
              Approve Disapprove 
         5        329        193 

Specify the order of the ppinc7 values.

Code
ppinc7 <- factor('ppinc7', ordered = TRUE, levels = c("$150,000 or more", "$100,000 to $149,999", "$75,000 to $99,999", "$50,000 to $74,999", "$25,000 to $49,999", "$10,000 to $24,999", "Less than $10,000"))
ppinc7
[1] <NA>
7 Levels: $150,000 or more < $100,000 to $149,999 < ... < Less than $10,000
Code
table(ppinc7)
ppinc7
    $150,000 or more $100,000 to $149,999   $75,000 to $99,999 
                   0                    0                    0 
  $50,000 to $74,999   $25,000 to $49,999   $10,000 to $24,999 
                   0                    0                    0 
   Less than $10,000 
                   0