Challenge 10

challenge_10

purrr

Author

Tenzin Latoe

Published

July 20, 2023

Challenge Overview

The [purrr](https://purrr.tidyverse.org) package is a powerful tool for functional programming. It lets you apply a single function to every element of a list or vector, so a for loop can often be replaced with a single, more readable (and sometimes faster) function call.
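As a rough illustration of that claim, the sketch below computes the same set of means twice, once with a for loop and once with `map_dbl`. The list `x` and the names `loop_means` and `map_means` are made up for this example and are not part of the challenge.

```r
library(purrr)

# Hypothetical input: a named list of numeric vectors
x <- list(a = 1:5, b = c(2, 4, 6), c = 10)

# for-loop version: pre-allocate the output, then fill it one element at a time
loop_means <- numeric(length(x))
for (i in seq_along(x)) {
  loop_means[i] <- mean(x[[i]])
}
names(loop_means) <- names(x)

# purrr version: one call; map_dbl() always returns a double vector
map_means <- map_dbl(x, mean)

# Compare the two results
all.equal(loop_means, map_means)
```

The loop needs three pieces of bookkeeping (pre-allocation, indexing, naming) that `map_dbl` handles on its own.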

For example, we can draw a random sample of size n from each of 10 different normal distributions using a vector of 10 means.

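library(tidyverse) # attaches purrr, readr, dplyr, and the %>% pipe

n <- 100 # sample size
m <- seq(1, 10) # means
samps <- map(m, rnorm, n = n) # one sample of size n per mean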
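It may not be obvious why each element of `m` ends up as the mean rather than the sample size. Because `n` is supplied by name, each element of `m` falls through to the next unmatched argument of `rnorm`, which is `mean`. The sketch below spells this out with an explicit function; `samps_explicit` and `samps_short` are names invented here, and the seed is arbitrary.

```r
library(purrr)

n <- 100        # sample size
m <- seq(1, 10) # means

# Explicit version: each element of m is passed as `mean`
set.seed(42)
samps_explicit <- map(m, function(mu) rnorm(n = n, mean = mu))

# Shorthand version: n is matched by name, so each element of m
# is matched positionally to rnorm()'s next free argument, `mean`
set.seed(42)
samps_short <- map(m, rnorm, n = n)

# With the same seed, both calls should draw the same values
identical(samps_explicit, samps_short)
```

The explicit form is longer, but it removes any doubt about which argument receives the mapped value.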

We can then use map_dbl to verify that this worked correctly by computing the mean for each sample.

samps %>%
  map_dbl(mean)

 [1] 0.7930265 1.9886684 2.9215061 4.1315976 5.1183208 6.0243612 7.1483536
 [8] 7.9461430 8.9145070 9.8369880
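The same check can be extended to other summaries. The sketch below is one possible way to line up the target means against the sample means, standard deviations, and sizes; the data frame `check` and its column names are hypothetical, and the seed is arbitrary, so the values will not match the output printed above.

```r
library(purrr)

set.seed(2023) # arbitrary seed, used only to make this sketch reproducible
n <- 100
m <- seq(1, 10)
samps <- map(m, rnorm, n = n)

# One row per sample; each typed map_*() returns an atomic vector
check <- data.frame(
  target_mean = m,
  sample_mean = map_dbl(samps, mean),
  sample_sd   = map_dbl(samps, sd),     # rnorm() defaults to sd = 1
  sample_size = map_int(samps, length)  # should equal n for every sample
)
check
```

The typed variants (`map_dbl`, `map_int`) are used here because they fail loudly if a function returns something other than a single value of the expected type.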

purrr is tricky to learn (but beyond useful once you get a handle on it). Therefore, it’s imperative that you complete the purrr and map readings before attempting this challenge.

The challenge

Use purrr with a function to perform some data science task. The task is up to you. It could involve computing summary statistics, reading in multiple datasets, running a random process multiple times, or anything else you might need to do in your work as a data analyst. You might consider using purrr with a function you wrote for challenge 9.

railroad <- read_csv("_data/railroad_2012_clean_county.csv")

map_chr(railroad, class)

          state          county total_employees 
    "character"     "character"       "numeric"

The columns in the railroad data set have the following classes: state and county are character vectors, and total_employees is numeric.
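One caveat with this pattern: `map_chr` expects exactly one string per element, and `class` can return more than one (a date-time column, for instance, has two classes). The railroad data does not hit this case, but a minimal sketch of a workaround, using a made-up data frame `df`, might look like this:

```r
library(purrr)

# Hypothetical data frame with a date-time column
df <- data.frame(
  name  = c("a", "b"),
  count = c(1L, 2L),
  when  = as.POSIXct(c("2023-07-20", "2023-07-21"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# class(df$when) is c("POSIXct", "POSIXt"), so collapse each result
# into a single string before map_chr() collects them
col_classes <- map_chr(df, function(col) paste(class(col), collapse = "/"))
col_classes
```

Collapsing with `paste` keeps the result a plain named character vector, at the cost of merging multiple classes into one label.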

railroad_character <- railroad %>% 
  map_df(as.character)
glimpse(railroad_character)

Rows: 2,930
Columns: 3
$ state           <chr> "AE", "AK", "AK", "AK", "AK", "AK", "AK", "AL", "AL", …
$ county          <chr> "APO", "ANCHORAGE", "FAIRBANKS NORTH STAR", "JUNEAU", …
$ total_employees <chr> "2", "7", "2", "3", "2", "1", "88", "102", "143", "1",…
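Note that `map_df` is superseded as of purrr 1.0.0, although it still works. One alternative for converting every column, sketched here with dplyr's `across`, is shown below. The data frame `toy` is a stand-in built from the first four rows of the output above, since the CSV itself is not assumed to be available.

```r
library(dplyr)
library(purrr)

# Stand-in for the railroad data: first four rows as printed by glimpse()
toy <- data.frame(
  state = c("AE", "AK", "AK", "AK"),
  county = c("APO", "ANCHORAGE", "FAIRBANKS NORTH STAR", "JUNEAU"),
  total_employees = c(2, 7, 2, 3),
  stringsAsFactors = FALSE
)

# Convert every column to character while keeping the data frame shape
toy_chr <- toy %>%
  mutate(across(everything(), as.character))

# Confirm the conversion with the same map_chr() check used earlier
map_chr(toy_chr, class)
```

This keeps the conversion inside a dplyr pipeline, which may read more naturally when other `mutate` steps follow.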

Now I’ll use map to list all columns as characters.