library(tidyverse) # loads purrr, dplyr, and readr
n <- 100 # sample size
m <- seq(1,10) # means
samps <- map(m,rnorm,n=n)
Challenge 10 Solution
Challenge Overview
The purrr package is a powerful tool for functional programming. It allows the user to apply a single function across multiple objects. It can replace for loops with a more readable (and often faster) simple function call.
For example, we can draw n random samples from each of 10 different distributions using a vector of 10 means.
We can then use map_dbl to verify that this worked correctly by computing the mean for each sample.
samps %>% map_dbl(mean)
[1] 1.052554 1.905349 3.034037 4.038838 5.151351 6.094199 7.032789 7.970365
[9] 8.977754 9.949969
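For comparison, here is a minimal sketch of the same check written as a for loop next to the map_dbl version. The seed value and the names loop_means and map_means are assumptions added for illustration; they are not part of the original example.

```r
library(purrr)

set.seed(42)       # arbitrary seed (an assumption) so the draws are reproducible
n <- 100           # sample size
m <- seq(1, 10)    # means
samps <- map(m, rnorm, n = n)  # list of 10 numeric vectors, one per mean

# for-loop version: pre-allocate the result, then fill one element per iteration
loop_means <- numeric(length(samps))
for (i in seq_along(samps)) {
  loop_means[i] <- mean(samps[[i]])
}

# purrr version: a single call that always returns a double vector
map_means <- map_dbl(samps, mean)

# both approaches should agree
all.equal(loop_means, map_means)
```

The loop needs an index, a pre-allocated vector, and explicit assignment, while map_dbl states the intent in one line and raises an error if any result is not a single number.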
purrr is tricky to learn (but beyond useful once you get a handle on it). Therefore, it’s imperative that you complete the purrr and map readings before attempting this challenge.
The challenge
Use purrr with a function to perform some data science task. The task itself is up to you. It could involve computing summary statistics, reading in multiple datasets, running a random process multiple times, or anything else you might need to do in your work as a data analyst. You might consider using purrr with a function you wrote for challenge 9.
As suggested, I am using map_chr() from purrr together with the function I wrote for challenge 9. It replaces several abbreviated codes (lists of values, or LOVs) in the market_segment and meal columns with labels that make the dataset more readable.
hotel_bookings <- read_csv("_data/hotel_bookings.csv")
head(hotel_bookings)
# A tibble: 6 × 32
hotel is_canceled lead_time arrival_date_year arrival_date_month
<chr> <dbl> <dbl> <dbl> <chr>
1 Resort Hotel 0 342 2015 July
2 Resort Hotel 0 737 2015 July
3 Resort Hotel 0 7 2015 July
4 Resort Hotel 0 13 2015 July
5 Resort Hotel 0 14 2015 July
6 Resort Hotel 0 14 2015 July
# ℹ 27 more variables: arrival_date_week_number <dbl>,
# arrival_date_day_of_month <dbl>, stays_in_weekend_nights <dbl>,
# stays_in_week_nights <dbl>, adults <dbl>, children <dbl>, babies <dbl>,
# meal <chr>, country <chr>, market_segment <chr>,
# distribution_channel <chr>, is_repeated_guest <dbl>,
# previous_cancellations <dbl>, previous_bookings_not_canceled <dbl>,
# reserved_room_type <chr>, assigned_room_type <chr>, …
unique(hotel_bookings$meal)
[1] "BB" "FB" "HB" "SC" "Undefined"
unique(hotel_bookings$market_segment)
[1] "Direct" "Corporate" "Online TA" "Offline TA/TO"
[5] "Complementary" "Groups" "Undefined" "Aviation"
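The two unique() calls above can also be folded into a single purrr call. The sketch below uses a small made-up tibble, toy_bookings, as a stand-in for hotel_bookings so that it runs on its own; that name, its values, and distinct_values are assumptions for illustration.

```r
library(dplyr)
library(purrr)

# toy stand-in for hotel_bookings (hypothetical values)
toy_bookings <- tibble(
  meal = c("BB", "FB", "BB", "SC"),
  market_segment = c("Direct", "Online TA", "Direct", "Groups")
)

# map() applies unique() to each selected column and returns a named list
distinct_values <- toy_bookings %>%
  select(meal, market_segment) %>%
  map(unique)

distinct_values
```

The result is a list with one element per column, which scales to any number of columns without repeating the unique() call.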
# Define the replacement mappings
meal_replacements <- c("BB" = "Bed and Breakfast",
                       "HB" = "Half Board",
                       "FB" = "Full Board",
                       "SC" = "Self Catering")
market_segment_replacements <- c("Online TA" = "Online Travel Agent",
                                 "Offline TA/TO" = "Offline Travel Agent/Tour Operator")
# Function to replace values
replace_values <- function(data, column_name, replacements) {
  data %>%
    mutate({{column_name}} := map_chr({{column_name}}, ~ ifelse(.x %in% names(replacements), replacements[.x], .x)))
}
# Replace meal values
hotel_bookings <- replace_values(hotel_bookings, meal, meal_replacements)
unique(hotel_bookings$meal)
[1] "Bed and Breakfast" "Full Board" "Half Board"
[4] "Self Catering" "Undefined"
# Replace market_segment values
hotel_bookings <- replace_values(hotel_bookings, market_segment, market_segment_replacements)
unique(hotel_bookings$market_segment)
[1] "Direct" "Corporate"
[3] "Online Travel Agent" "Offline Travel Agent/Tour Operator"
[5] "Complementary" "Groups"
[7] "Undefined" "Aviation"
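Because the replacement mappings are named vectors, the same recoding can also be done on a whole column at once instead of one value at a time. This is a sketch of an alternative, not the approach used above: recode_values is a hypothetical helper name, and toy_meals is a made-up vector standing in for a real column.

```r
# hypothetical helper: look up each value by name, keep the original when there is no match
recode_values <- function(x, replacements) {
  looked_up <- unname(replacements[x])    # NA wherever x has no matching name
  ifelse(is.na(looked_up), x, looked_up)  # fall back to the original value
}

meal_replacements <- c("BB" = "Bed and Breakfast",
                       "HB" = "Half Board",
                       "FB" = "Full Board",
                       "SC" = "Self Catering")

toy_meals <- c("BB", "SC", "Undefined", "HB")  # made-up stand-in for a column
recode_values(toy_meals, meal_replacements)
```

Inside a pipeline, a helper like this could be called as mutate(meal = recode_values(meal, meal_replacements)). On a dataset with many rows, a single vectorized lookup is likely to be faster than calling a function once per value with map_chr, though the map_chr version shows the purrr pattern more directly.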