Challenge 10 Solution

challenge_10
purrr
Author

Shreya Varma

Published

May 30, 2023

Challenge Overview

The purrr package is a powerful tool for functional programming. It lets the user apply a single function to every element of a list or vector, so a for loop can often be replaced with one function call that is shorter and easier to read.

For example, we can draw a random sample of size n from each of 10 normal distributions by mapping rnorm over a vector of 10 means.

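library(tidyverse) # loads purrr, dplyr, readr and the %>% pipe

n <- 100 # sample size
m <- seq(1, 10) # means
# n is passed by name, so each element of m is used as the mean
samps <- map(m, rnorm, n = n)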

We can then use map_dbl to verify that this worked correctly by computing the mean for each sample.

samps %>%
  map_dbl(mean)
 [1] 1.052554 1.905349 3.034037 4.038838 5.151351 6.094199 7.032789 7.970365
 [9] 8.977754 9.949969
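
For comparison, here is a minimal sketch of the same sampling step written as a for loop next to the map() call. The names samps_loop and samps_map are hypothetical, and set.seed() is added only so the two versions draw the same numbers; it is not part of the original example.

```r
library(purrr)

n <- 100        # sample size
m <- seq(1, 10) # means

# for-loop version: pre-allocate a list, then fill it one mean at a time
set.seed(42)
samps_loop <- vector("list", length(m))
for (i in seq_along(m)) {
  samps_loop[[i]] <- rnorm(n, mean = m[i])
}

# purrr version: the same computation as a single call
set.seed(42)
samps_map <- map(m, function(mu) rnorm(n, mean = mu))

# both are lists with one numeric vector of n draws per mean
map_int(samps_map, length)
```

The loop needs a pre-allocated container and an index variable, while map() handles both and always returns a list of the same length as its input.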

purrr is tricky to learn (but beyond useful once you get a handle on it). Therefore, it’s imperative that you complete the purrr and map readings before attempting this challenge.

The challenge

Use purrr with a function to perform some data science task. The choice of task is up to you. It could involve computing summary statistics, reading in multiple datasets, running a random process multiple times, or anything else you might need to do in your work as a data analyst. You might consider using purrr with a function you wrote for challenge 9.
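
As one illustration of the summary-statistics option, the sketch below maps over columns and over a list of functions. It uses the built-in mtcars dataset purely as a stand-in, and the names cols, col_means, stats and mpg_stats are hypothetical.

```r
library(purrr)

# mtcars ships with R; it is used here only as a stand-in dataset
cols <- c("mpg", "hp", "wt")

# map_dbl() returns a named numeric vector: one mean per column
col_means <- map_dbl(mtcars[cols], mean)

# mapping over a named list of functions gives several statistics for one column
stats <- list(mean = mean, median = median, sd = sd)
mpg_stats <- map_dbl(stats, function(f) f(mtcars$mpg))

col_means
mpg_stats
```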

As suggested, I am using map_chr() from purrr together with the function I wrote for challenge 9. It replaces the abbreviated codes (lists of values, or LOVs) in the market_segment and meal columns with descriptive labels that make the dataset more readable.

hotel_bookings <- read_csv("_data/hotel_bookings.csv")
head(hotel_bookings)
# A tibble: 6 × 32
  hotel        is_canceled lead_time arrival_date_year arrival_date_month
  <chr>              <dbl>     <dbl>             <dbl> <chr>             
1 Resort Hotel           0       342              2015 July              
2 Resort Hotel           0       737              2015 July              
3 Resort Hotel           0         7              2015 July              
4 Resort Hotel           0        13              2015 July              
5 Resort Hotel           0        14              2015 July              
6 Resort Hotel           0        14              2015 July              
# ℹ 27 more variables: arrival_date_week_number <dbl>,
#   arrival_date_day_of_month <dbl>, stays_in_weekend_nights <dbl>,
#   stays_in_week_nights <dbl>, adults <dbl>, children <dbl>, babies <dbl>,
#   meal <chr>, country <chr>, market_segment <chr>,
#   distribution_channel <chr>, is_repeated_guest <dbl>,
#   previous_cancellations <dbl>, previous_bookings_not_canceled <dbl>,
#   reserved_room_type <chr>, assigned_room_type <chr>, …
unique(hotel_bookings$meal)
[1] "BB"        "FB"        "HB"        "SC"        "Undefined"
unique(hotel_bookings$market_segment)
[1] "Direct"        "Corporate"     "Online TA"     "Offline TA/TO"
[5] "Complementary" "Groups"        "Undefined"     "Aviation"     
# Define the replacement mappings
meal_replacements <- c("BB" = "Bed and Breakfast",
                       "HB" = "Half Board",
                       "FB" = "Full Board",
                       "SC" = "Self Catering")

market_segment_replacements <- c("Online TA" = "Online Travel Agent",
                                 "Offline TA/TO" = "Offline Travel Agent/Tour Operator")

# Replace coded values in one column using a named lookup vector.
# Values with no entry in `replacements` are left unchanged.
replace_values <- function(data, column_name, replacements) {
  data %>%
    mutate({{ column_name }} := map_chr(
      {{ column_name }},
      ~ if (.x %in% names(replacements)) replacements[[.x]] else .x
    ))
}

# Replace meal values
hotel_bookings <- replace_values(hotel_bookings, meal, meal_replacements)
unique(hotel_bookings$meal)
[1] "Bed and Breakfast" "Full Board"        "Half Board"       
[4] "Self Catering"     "Undefined"        
# Replace market_segment values
hotel_bookings <- replace_values(hotel_bookings, market_segment, market_segment_replacements)
unique(hotel_bookings$market_segment)
[1] "Direct"                             "Corporate"                         
[3] "Online Travel Agent"                "Offline Travel Agent/Tour Operator"
[5] "Complementary"                      "Groups"                            
[7] "Undefined"                          "Aviation"
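
The map_chr() call above checks one value at a time, which is a good purrr exercise but does a lot of repeated work on a large column. A possible vectorised alternative, sketched here on a toy vector rather than the real hotel_bookings data, indexes the named lookup vector with the whole column and falls back to the original value where there is no match. The names meal and relabelled are hypothetical.

```r
library(dplyr)

# Toy stand-in for the meal column (not the real hotel_bookings data)
meal <- c("BB", "FB", "Undefined", "SC", "HB", NA)

meal_replacements <- c("BB" = "Bed and Breakfast",
                       "HB" = "Half Board",
                       "FB" = "Full Board",
                       "SC" = "Self Catering")

# Indexing the named vector by the whole column looks up every code at once;
# codes without a match come back as NA, and coalesce() fills those
# positions from the original values
relabelled <- coalesce(unname(meal_replacements[meal]), meal)

relabelled
```

Values that are missing in the input stay missing, because both arguments to coalesce() are NA at that position.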