Code
library(tidyverse)
library(summarytools)
library(dbplyr)
library(readxl)
::opts_chunk$set(echo = TRUE) knitr
Michele Carlin
March 20, 2023
Read in and view summary of ‘debt_in_trillions’ dataset.
[1] "Sheet1"
Switching method to 'browser'
Output file written: C:\Users\CarlinML\AppData\Local\Temp\RtmpKOdhK1\file84481a057024.html
This dataset contains 74 rows and 8 columns. Each row represents a particular year/quarter and provides the average debt in trillions for various types of household expenses (e.g., mortgage, auto loan, credit card, etc.).
Pivot longer so that each row contains one observation.
Switching method to 'browser'
Output file written: C:\Users\CarlinML\AppData\Local\Temp\RtmpKOdhK1\file844863283ffd.html
Separate year and quarter into separate variables. Summarize average debt by type of debt.
`summarise()` has grouped output by 'year'. You can override using the
`.groups` argument.
# A tibble: 133 × 5
# Groups: year [19]
year debt_type Avg_Debt Min_Debt Max_Debt
<chr> <chr> <dbl> <dbl> <dbl>
1 03 Auto Loan 0.663 0.622 0.704
2 03 Credit Card 0.693 0.688 0.698
3 03 HE Revolving 0.268 0.242 0.302
4 03 Mortgage 5.22 4.94 5.66
5 03 Other 0.472 0.449 0.486
6 03 Student Loan 0.246 0.241 0.253
7 03 Total 7.56 7.23 8.07
8 04 Auto Loan 0.736 0.72 0.751
9 04 Credit Card 0.704 0.695 0.717
10 04 HE Revolving 0.397 0.328 0.468
# … with 123 more rows
Summarize total debt only by year.
`summarise()` has grouped output by 'year'. You can override using the
`.groups` argument.
# A tibble: 19 × 5
# Groups: year [19]
year debt_type Avg_Debt Min_Debt Max_Debt
<chr> <chr> <dbl> <dbl> <dbl>
1 03 Total 7.56 7.23 8.07
2 04 Total 8.66 8.29 9.04
3 05 Total 9.62 9.21 10.0
4 06 Total 10.9 10.4 11.3
5 07 Total 12.0 11.5 12.4
6 08 Total 12.6 12.5 12.7
7 09 Total 12.3 12.2 12.5
8 10 Total 11.9 11.7 12.1
9 11 Total 11.7 11.5 11.8
10 12 Total 11.4 11.3 11.4
11 13 Total 11.3 11.2 11.5
12 14 Total 11.7 11.6 11.8
13 15 Total 12.0 11.9 12.1
14 16 Total 12.4 12.3 12.6
15 17 Total 12.9 12.7 13.1
16 18 Total 13.4 13.2 13.5
17 19 Total 13.9 13.7 14.1
18 20 Total 14.4 14.3 14.6
19 21 Total 14.8 14.6 15.0
Read in and view summary of ‘abc polls’ dataset.
Rows: 527 Columns: 31
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (28): xspanish, complete_status, ppeduc5, ppeducat, ppgender, ppethm, pp...
dbl (3): id, ppage, weights_pid
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Switching method to 'browser'
Output file written: C:\Users\CarlinML\AppData\Local\Temp\RtmpKOdhK1\file84486fb97ab3.html
Remove ‘skipped’ answers from Q1_a variable.
Q1_a
Approve Disapprove Skipped
329 193 5
Q1_a
Approve Disapprove
5 329 193
Specify the order of the ppinc7 values.
[1] <NA>
7 Levels: $150,000 or more < $100,000 to $149,999 < ... < Less than $10,000
ppinc7
$150,000 or more $100,000 to $149,999 $75,000 to $99,999
0 0 0
$50,000 to $74,999 $25,000 to $49,999 $10,000 to $24,999
0 0 0
Less than $10,000
0
---
title: "2003-2021 Average Debt of Various Household Expenses"
author: "Michele Carlin"
desription: "Imported, Tidied, and Summarized data"
date: "03/20/2023"
format:
html:
toc: true
code-fold: true
code-copy: true
code-tools: true
categories:
- challenge_4
- Michele Carlin
- debt
---
```{r}
#| label: setup
#| warning: false
library(tidyverse)
library(summarytools)
library(dbplyr)
library(readxl)
knitr::opts_chunk$set(echo = TRUE)
```
Read in and view summary of 'debt_in_trillions' dataset.
```{r}
sheet_names <- excel_sheets("_data/debt_in_trillions.xlsx")
sheet_names
debt_trillions <- read_xlsx ("_data/debt_in_trillions.xlsx")
View(debt_trillions)
view(dfSummary(debt_trillions))
```
This dataset contains 74 rows and 8 columns. Each row represents a particular year/quarter and provides the average debt in trillions for various types of household expenses (e.g., mortgage, auto loan, credit card, etc.).
Pivot longer so that each row contains one observation.
```{r}
debt_trillions_long <- pivot_longer(debt_trillions, col = c(Mortgage, 'HE Revolving', 'Auto Loan', 'Credit Card', 'Student Loan', Other, Total),
names_to="debt_type",
values_to = "value")
View(debt_trillions_long)
view(dfSummary(debt_trillions_long))
```
Separate year and quarter into separate variables.
Summarize average debt by type of debt.
```{r}
debt_trillions_long %>%
separate("Year and Quarter", c("year", "quarter"), ":") %>%
group_by(year, debt_type) %>%
summarise(Avg_Debt = mean(value),
Min_Debt = min(value),
Max_Debt = max(value))
```
Summarize total debt only by year.
```{r}
debt_trillions_long %>%
separate("Year and Quarter", c("year", "quarter"), ":") %>%
filter(debt_type == 'Total') %>%
group_by(year, debt_type) %>%
summarise(Avg_Debt = mean(value),
Min_Debt = min(value),
Max_Debt = max(value))
```
Read in and view summary of 'abc polls' dataset.
```{r}
abc_poll <- read_csv ("_data/abc_poll_2021.csv")
View(abc_poll)
view(dfSummary(abc_poll))
```
Remove 'skipped' answers from Q1_a variable.
```{r}
table(select(abc_poll, Q1_a))
abc_poll <- abc_poll%>%
mutate(Q1_a = str_remove(Q1_a, "Skipped"))
table(select(abc_poll, Q1_a))
```
Specify the order of the ppinc7 values.
```{r}
ppinc7 <- factor('ppinc7', ordered = TRUE, levels = c("$150,000 or more", "$100,000 to $149,999", "$75,000 to $99,999", "$50,000 to $74,999", "$25,000 to $49,999", "$10,000 to $24,999", "Less than $10,000"))
ppinc7
table(ppinc7)
```