Challenge 10
Challenge Overview
The [purrr](https://purrr.tidyverse.org) package is a powerful tool for functional programming. It lets you apply a single function across multiple objects, and it can replace for loops with a single, more readable (and often faster) function call.
For example, we can draw n random samples from each of 10 normal distributions using a vector of 10 means.
library(tidyverse) # loads purrr and the pipe
n <- 100 # sample size
m <- seq(1, 10) # means
samps <- map(m, rnorm, n = n) # one sample of size n per mean
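To make the comparison with a for loop concrete, here is a minimal sketch that draws the samples both ways. The names `samps_loop` and `samps_map` and the seed value are illustrative assumptions, not part of the original challenge.

```r
library(purrr)

n <- 100        # sample size
m <- seq(1, 10) # means

# for-loop version: pre-allocate a list, then fill it one element at a time
set.seed(1) # arbitrary seed so both versions see the same random stream
samps_loop <- vector("list", length(m))
for (i in seq_along(m)) {
  samps_loop[[i]] <- rnorm(n, mean = m[i])
}

# map() version: each element of m is passed to rnorm() as the mean,
# because n is already supplied by name
set.seed(1)
samps_map <- map(m, rnorm, n = n)

# compare the two results
all.equal(samps_loop, samps_map)
```

The loop needs explicit bookkeeping (pre-allocation and an index), while `map()` handles both and always returns a list the same length as its input.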
We can then use map_dbl to verify that this worked correctly by computing the mean of each sample.
samps %>% map_dbl(mean)
[1] 0.7930265 1.9886684 2.9215061 4.1315976 5.1183208 6.0243612 7.1483536
[8] 7.9461430 8.9145070 9.8369880
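The suffix on a `map` variant controls the type of the result, which is why `map_dbl` prints a plain numeric vector here. The sketch below contrasts `map()` with `map_dbl()` on a comparable set of samples; the seed and the object names `means_list`, `means_dbl`, and `sds` are assumptions made for illustration.

```r
library(purrr)

set.seed(42)                             # arbitrary seed, only for reproducibility
samps <- map(seq(1, 10), rnorm, n = 100) # mirrors the setup in the example

means_list <- map(samps, mean)     # list of 10 length-one numeric vectors
means_dbl  <- map_dbl(samps, mean) # numeric vector of length 10

# an anonymous function works too, e.g. the spread of each sample
sds <- map_dbl(samps, function(x) sd(x))

is.list(means_list)
is.double(means_dbl)
round(sds, 2)
```

`map_dbl()` raises an error if the function returns anything other than a single number per element, which makes it a useful safety check as well as a convenience.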
purrr is tricky to learn (but beyond useful once you get a handle on it). Therefore, it’s imperative that you complete the purrr and map readings before attempting this challenge.
The challenge
Use purrr with a function to perform some data science task. The task itself is up to you: it could involve computing summary statistics, reading in multiple datasets, running a random process multiple times, or anything else you might need to do in your work as a data analyst. You might consider using purrr with a function you wrote for challenge 9.
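As one possible starting point, the sketch below pairs purrr with a small user-written function to run a random process many times. The helper `prop_heads`, the number of repetitions, and the seed are all hypothetical choices for illustration, not part of the assignment.

```r
library(purrr)

# hypothetical helper: proportion of heads in `flips` fair coin tosses
prop_heads <- function(flips) {
  mean(sample(c(0, 1), size = flips, replace = TRUE))
}

set.seed(123) # arbitrary seed for reproducibility

# run the experiment 1,000 times with 50 flips each
results <- map_dbl(rep(50, 1000), prop_heads)

summary(results)
```

Swapping `prop_heads` for a function from challenge 9, and the vector of flip counts for a vector of file paths or column names, follows the same pattern.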
railroad <- read_csv("_data/railroad_2012_clean_county.csv")
map_chr(railroad, class)
state county total_employees
"character" "character" "numeric"
The columns of the railroad data set have the following classes: state and county are character vectors, and total_employees is numeric.
railroad_character <- railroad %>%
  map_df(as.character)
glimpse(railroad_character)
Rows: 2,930
Columns: 3
$ state <chr> "AE", "AK", "AK", "AK", "AK", "AK", "AK", "AL", "AL", …
$ county <chr> "APO", "ANCHORAGE", "FAIRBANKS NORTH STAR", "JUNEAU", …
$ total_employees <chr> "2", "7", "2", "3", "2", "1", "88", "102", "143", "1",…
I’ll now use map to convert all columns to character.
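For comparison, the same column-wise conversion can be sketched with dplyr's `across()`. As far as I know, recent purrr releases mark `map_df()` as superseded, so this may be a useful alternative. The tibble `railroad_demo` is a stand-in built from the first three rows shown in the output above, since the CSV file itself is not assumed to be available; the other object names are illustrative.

```r
library(purrr)
library(dplyr)

# stand-in for the railroad data: first three rows from the glimpse() output
railroad_demo <- tibble(
  state = c("AE", "AK", "AK"),
  county = c("APO", "ANCHORAGE", "FAIRBANKS NORTH STAR"),
  total_employees = c(2, 7, 2)
)

# purrr approach from the text: apply as.character to every column
via_map <- map_df(railroad_demo, as.character)

# dplyr alternative: across() applies the function column-wise inside mutate()
via_across <- railroad_demo %>%
  mutate(across(everything(), as.character))

# convert the count column back to numeric when arithmetic is needed
restored <- via_across %>%
  mutate(total_employees = as.numeric(total_employees))

map_chr(restored, class)
```

Keeping `total_employees` numeric is usually preferable for analysis; converting everything to character is mainly handy before writing to a text-based format.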