2003-2021 Average Debt of Various Household Expenses

challenge_4

Michele Carlin

debt

Author

Michele Carlin

Published

March 20, 2023

Code

library(tidyverse)
library(summarytools)
library(dbplyr)
library(readxl)

knitr::opts_chunk$set(echo = TRUE)

Read in and view summary of ‘debt_in_trillions’ dataset.

Code

sheet_names <- excel_sheets("_data/debt_in_trillions.xlsx")
sheet_names

[1] "Sheet1"

Code

debt_trillions <- read_xlsx ("_data/debt_in_trillions.xlsx")
View(debt_trillions)
view(dfSummary(debt_trillions))

Switching method to 'browser'

Output file written: C:\Users\CarlinML\AppData\Local\Temp\RtmpKOdhK1\file84481a057024.html

This dataset contains 74 rows and 8 columns. Each row represents a particular year/quarter and provides the average debt in trillions for various types of household expenses (e.g., mortgage, auto loan, credit card, etc.).

Pivot longer so that each row contains one observation.

Code

debt_trillions_long <- pivot_longer(debt_trillions, col = c(Mortgage, 'HE Revolving', 'Auto Loan', 'Credit Card', 'Student Loan', Other, Total),
                 names_to="debt_type",
                 values_to = "value")
View(debt_trillions_long)
view(dfSummary(debt_trillions_long))

Switching method to 'browser'

Output file written: C:\Users\CarlinML\AppData\Local\Temp\RtmpKOdhK1\file844863283ffd.html

Separate year and quarter into separate variables. Summarize average debt by type of debt.

Code

debt_trillions_long %>%
  separate("Year and Quarter", c("year", "quarter"), ":") %>%
 group_by(year, debt_type) %>%
  summarise(Avg_Debt = mean(value),
            Min_Debt = min(value),
            Max_Debt = max(value))

`summarise()` has grouped output by 'year'. You can override using the
`.groups` argument.

# A tibble: 133 × 5
# Groups:   year [19]
   year  debt_type    Avg_Debt Min_Debt Max_Debt
   <chr> <chr>           <dbl>    <dbl>    <dbl>
 1 03    Auto Loan       0.663    0.622    0.704
 2 03    Credit Card     0.693    0.688    0.698
 3 03    HE Revolving    0.268    0.242    0.302
 4 03    Mortgage        5.22     4.94     5.66 
 5 03    Other           0.472    0.449    0.486
 6 03    Student Loan    0.246    0.241    0.253
 7 03    Total           7.56     7.23     8.07 
 8 04    Auto Loan       0.736    0.72     0.751
 9 04    Credit Card     0.704    0.695    0.717
10 04    HE Revolving    0.397    0.328    0.468
# … with 123 more rows

Summarize total debt only by year.

Code

debt_trillions_long %>%
  separate("Year and Quarter", c("year", "quarter"), ":") %>%
  filter(debt_type == 'Total') %>%
  group_by(year, debt_type) %>%
  summarise(Avg_Debt = mean(value),
            Min_Debt = min(value),
            Max_Debt = max(value))

`summarise()` has grouped output by 'year'. You can override using the
`.groups` argument.

# A tibble: 19 × 5
# Groups:   year [19]
   year  debt_type Avg_Debt Min_Debt Max_Debt
   <chr> <chr>        <dbl>    <dbl>    <dbl>
 1 03    Total         7.56     7.23     8.07
 2 04    Total         8.66     8.29     9.04
 3 05    Total         9.62     9.21    10.0 
 4 06    Total        10.9     10.4     11.3 
 5 07    Total        12.0     11.5     12.4 
 6 08    Total        12.6     12.5     12.7 
 7 09    Total        12.3     12.2     12.5 
 8 10    Total        11.9     11.7     12.1 
 9 11    Total        11.7     11.5     11.8 
10 12    Total        11.4     11.3     11.4 
11 13    Total        11.3     11.2     11.5 
12 14    Total        11.7     11.6     11.8 
13 15    Total        12.0     11.9     12.1 
14 16    Total        12.4     12.3     12.6 
15 17    Total        12.9     12.7     13.1 
16 18    Total        13.4     13.2     13.5 
17 19    Total        13.9     13.7     14.1 
18 20    Total        14.4     14.3     14.6 
19 21    Total        14.8     14.6     15.0

Read in and view summary of ‘abc polls’ dataset.

Code

abc_poll <- read_csv ("_data/abc_poll_2021.csv")

Rows: 527 Columns: 31
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (28): xspanish, complete_status, ppeduc5, ppeducat, ppgender, ppethm, pp...
dbl  (3): id, ppage, weights_pid

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Code

View(abc_poll)
view(dfSummary(abc_poll))

Switching method to 'browser'
Output file written: C:\Users\CarlinML\AppData\Local\Temp\RtmpKOdhK1\file84486fb97ab3.html

Remove ‘skipped’ answers from Q1_a variable.

Code

table(select(abc_poll, Q1_a))

Q1_a
   Approve Disapprove    Skipped 
       329        193          5

Code

abc_poll <- abc_poll%>%
  mutate(Q1_a = str_remove(Q1_a, "Skipped"))

table(select(abc_poll, Q1_a))

Q1_a
              Approve Disapprove 
         5        329        193

Specify the order of the ppinc7 values.

Code

ppinc7 <- factor('ppinc7', ordered = TRUE, levels = c("$150,000 or more", "$100,000 to $149,999", "$75,000 to $99,999", "$50,000 to $74,999", "$25,000 to $49,999", "$10,000 to $24,999", "Less than $10,000"))
ppinc7

[1] <NA>
7 Levels: $150,000 or more < $100,000 to $149,999 < ... < Less than $10,000

Code

table(ppinc7)

ppinc7
    $150,000 or more $100,000 to $149,999   $75,000 to $99,999 
                   0                    0                    0 
  $50,000 to $74,999   $25,000 to $49,999   $10,000 to $24,999 
                   0                    0                    0 
   Less than $10,000 
                   0