Challenge 10 Submission

challenge_10
debt_in_trillions
purrr
Author

Suyash Bhagwat

Published

July 6, 2023

Challenge Overview

The purrr package is a powerful tool for functional programming. It allows the user to apply a single function across multiple objects, and it can replace a for loop with a single, more readable (and often more concise) function call.
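
As a minimal sketch of that idea, the snippet below computes the mean of each element of a list twice: once with a for loop and once with map_dbl. The list `x` and the names `out_loop` and `out_map` are made up for illustration and are not part of the challenge.

```r
library(purrr)

# Hypothetical input: a named list of three integer vectors
x <- list(a = 1:3, b = 4:6, c = 7:9)

# for-loop version: pre-allocate the result, then fill it element by element
out_loop <- numeric(length(x))
for (i in seq_along(x)) {
  out_loop[i] <- mean(x[[i]])
}

# purrr version: one call that applies mean() to every element
# and returns a double vector that keeps the list names
out_map <- map_dbl(x, mean)

out_loop
out_map
```

Both versions produce the same numbers; the map_dbl result also carries the names a, b, and c.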

For example, we can draw a random sample of size n from each of 10 normal distributions, using a vector of 10 means.

library(purrr) # provides map() and map_dbl()
n <- 100 # sample size
m <- seq(1, 10) # means
samps <- map(m, rnorm, n = n) # each element of m becomes rnorm()'s mean

We can then use map_dbl to verify that this worked correctly by computing the mean for each sample.

samps %>%
  map_dbl(mean)
 [1] 0.890460 1.921804 2.978067 3.920082 5.087384 5.849617 7.044267 8.037781
 [9] 9.087611 9.813258
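
The call map(m, rnorm, n = n) works because n is passed by name, so each element of m lands in the next unmatched argument of rnorm, which is mean. A small sketch that spells this out with an anonymous function is given below; the seed value and the names `samps_short` and `samps_explicit` are assumptions added for illustration.

```r
library(purrr)

n <- 100        # sample size
m <- seq(1, 10) # means

# Shorthand form: n is matched by name, so each element of m
# fills rnorm()'s next unmatched argument, which is `mean`
set.seed(1) # arbitrary seed, only to make the sketch reproducible
samps_short <- map(m, rnorm, n = n)

# Explicit form: the anonymous function names the argument directly
set.seed(1)
samps_explicit <- map(m, function(mu) rnorm(n = n, mean = mu))

# Each result is a list of 10 numeric vectors with n values each
lengths(samps_short)
identical(samps_short, samps_explicit)
```

The explicit form is longer, but it makes clear which argument of rnorm each mean is assigned to.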

purrr is tricky to learn (but beyond useful once you get a handle on it). Therefore, it’s imperative that you complete the purrr and map readings before attempting this challenge.

The challenge

Use purrr with a function to perform some data science task. The choice of task is up to you. It could involve computing summary statistics, reading in multiple datasets, running a random process multiple times, or anything else you might need to do in your work as a data analyst. You might consider using purrr with a function you wrote for challenge 9.

Ans: For challenge 10, I’ll use the debt_in_trillions.xlsx dataset and calculate the mean of every numeric column with the map_dbl function. The code for that is given below:

library(readxl) # provides read_excel()
library(dplyr) # provides glimpse()
debt <- read_excel("_data/debt_in_trillions.xlsx")
glimpse(debt)
Rows: 74
Columns: 8
$ `Year and Quarter` <chr> "03:Q1", "03:Q2", "03:Q3", "03:Q4", "04:Q1", "04:Q2…
$ Mortgage           <dbl> 4.942, 5.080, 5.183, 5.660, 5.840, 5.967, 6.210, 6.…
$ `HE Revolving`     <dbl> 0.242, 0.260, 0.269, 0.302, 0.328, 0.367, 0.426, 0.…
$ `Auto Loan`        <dbl> 0.641, 0.622, 0.684, 0.704, 0.720, 0.743, 0.751, 0.…
$ `Credit Card`      <dbl> 0.688, 0.693, 0.693, 0.698, 0.695, 0.697, 0.706, 0.…
$ `Student Loan`     <dbl> 0.2407000, 0.2429000, 0.2488000, 0.2529000, 0.25980…
$ Other              <dbl> 0.4776, 0.4860, 0.4773, 0.4486, 0.4465, 0.4231, 0.4…
$ Total              <dbl> 7.2313, 7.3839, 7.5551, 8.0655, 8.2893, 8.4600, 8.8…
map_dbl(list(debt$Mortgage, debt$`HE Revolving`, debt$`Auto Loan`, debt$`Credit Card`, debt$`Student Loan`, debt$Other, debt$Total), mean)
[1]  8.2739865  0.5160662  0.9308703  0.7565176  0.9188575  0.3830770 11.7793751
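
Listing every column by hand works, but the numeric columns can also be picked out programmatically. The sketch below is one possible alternative, not the approach used above: it uses keep() to retain the numeric columns and then map_dbl() to average them. Because the Excel file is not available here, `debt_small` is a stand-in built from the first three rows of two columns shown in the glimpse() output; the names `debt_small`, `numeric_cols`, and `col_means` are hypothetical.

```r
library(purrr)

# Stand-in for `debt`: first three rows of three columns from the glimpse() output
debt_small <- data.frame(
  `Year and Quarter` = c("03:Q1", "03:Q2", "03:Q3"),
  Mortgage = c(4.942, 5.080, 5.183),
  `HE Revolving` = c(0.242, 0.260, 0.269),
  check.names = FALSE # keep the column names with spaces unchanged
)

# A data frame is a list of columns, so keep() retains the columns
# for which is.numeric() returns TRUE
numeric_cols <- keep(debt_small, is.numeric)

# map_dbl() returns one mean per numeric column, named by column
col_means <- map_dbl(numeric_cols, mean)
col_means
```

Applied to the full `debt` data frame, the same two steps would return a named vector, which makes it easier to see which mean belongs to which column.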