Challenge 10

challenge_10

purrr

Author

Noah Dixon

Published

July 6, 2023

Read Data

Using the read.csv function, we can read the cereal.csv data into a data frame.

cereal <- read.csv("_data/cereal.csv")
cereal

Next, we will split the cereal data frame by cereal type. The split function returns a named list with one data frame per value of Type.

cereal_types <- split(cereal, cereal$Type)
cereal_types

$A
                  Cereal Sodium Sugar Type
1    Frosted Mini Wheats      0    11    A
2            Raisin Bran    340    18    A
3               All Bran     70     5    A
8     Crackling Oat Bran    150    16    A
9              Fiber One    100     0    A
12 Honey Bunches of Oats    180     7    A
16          Honey Smacks     50    15    A
17             Special K    220     4    A
18              Wheaties    180     4    A
19           Corn Flakes    200     3    A

$C
                  Cereal Sodium Sugar Type
4            Apple Jacks    140    14    C
5         Captain Crunch    200    12    C
6               Cheerios    180     1    C
7  Cinnamon Toast Crunch    210    10    C
10        Frosted Flakes    130    12    C
11           Froot Loops    140    14    C
13    Honey Nut Cheerios    190     9    C
14                  Life    160     6    C
15         Rice Krispies    290     3    C
20             Honeycomb    210    11    C
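To make the shape of that result concrete, here is a minimal sketch using four rows copied from the output above. The names toy and toy_types are made up for this illustration and are not part of the original challenge.

```r
# Four rows copied from the printed cereal data (illustrative subset only)
toy <- data.frame(
  Cereal = c("All Bran", "Cheerios", "Fiber One", "Life"),
  Sodium = c(70, 180, 100, 160),
  Sugar  = c(5, 1, 0, 6),
  Type   = c("A", "C", "A", "C"),
  stringsAsFactors = FALSE
)

# split() groups the rows by the values of toy$Type
toy_types <- split(toy, toy$Type)

names(toy_types)          # one list element per distinct Type
nrow(toy_types$A)         # number of rows where Type is "A"
toy_types[["C"]]$Cereal   # cereals that ended up in the "C" group
```

Each element can be pulled out by name with $ or [[ ]], which is what lets a mapping function visit the groups one at a time.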

Now, we will recreate my function from challenge 9 to calculate summary statistics for a variable. We will alter the function slightly to accept the column name as an argument along with the data frame.

statsFunction <- function(df, col_name) {
  # Pull the requested column out of the data frame as a vector
  column <- df[[col_name]]
  print("Summary Statistics:")
  # na.rm = TRUE is passed to every statistic so missing values are handled consistently
  print(paste0("Maximum: ", max(column, na.rm = TRUE)))
  print(paste0("Minimum: ", min(column, na.rm = TRUE)))
  print(paste0("Mean: ", mean(column, na.rm = TRUE)))
  print(paste0("Median: ", median(column, na.rm = TRUE)))
  print(paste0("Standard Deviation: ", sd(column, na.rm = TRUE)))
}
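Because statsFunction only prints, its numbers cannot easily be reused afterwards. As a possible alternative (not the function from Challenge 9), the sketch below returns the same statistics as a named numeric vector. The name stats_values and the type_a data frame are hypothetical; the Sugar values are copied from the Type A output above.

```r
# Hypothetical variant of statsFunction that returns values instead of printing
stats_values <- function(df, col_name) {
  column <- df[[col_name]]
  c(
    max    = max(column, na.rm = TRUE),
    min    = min(column, na.rm = TRUE),
    mean   = mean(column, na.rm = TRUE),
    median = median(column, na.rm = TRUE),
    sd     = sd(column, na.rm = TRUE)
  )
}

# Sugar values of the ten Type A cereals, copied from the split output
type_a <- data.frame(Sugar = c(11, 18, 5, 16, 0, 7, 15, 4, 4, 3))

a_stats <- stats_values(type_a, "Sugar")
a_stats
```

Individual values can then be read by name, for example a_stats[["mean"]].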

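Finally, we will use the map function from the purrr package to apply this function to the Sugar column of both data frames in the cereal_types list. Because statsFunction prints its results, the output appears in list order: type A first, then type C.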

library(purrr)
result <- map(cereal_types, ~statsFunction(.x, "Sugar"))

[1] "Summary Statistics:"
[1] "Maximum: 18"
[1] "Minimum: 0"
[1] "Mean: 8.3"
[1] "Median: 6"
[1] "Standard Deviation: 6.25477595299961"
[1] "Summary Statistics:"
[1] "Maximum: 14"
[1] "Minimum: 1"
[1] "Mean: 9.2"
[1] "Median: 10.5"
[1] "Standard Deviation: 4.49196814077947"
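The printed blocks above are not labelled by cereal type. One way to keep the group names attached is purrr's map_dbl, which returns a named numeric vector rather than a list. The sketch below is only an illustration: sugar_by_type is a hypothetical list holding the Sugar values copied from the split output, standing in for cereal_types.

```r
library(purrr)

# Sugar values per cereal type, copied from the split output above
sugar_by_type <- list(
  A = c(11, 18, 5, 16, 0, 7, 15, 4, 4, 3),
  C = c(14, 12, 1, 10, 12, 14, 9, 6, 3, 11)
)

# map_dbl() applies the function to each element and keeps the list names
mean_sugar   <- map_dbl(sugar_by_type, mean)
median_sugar <- map_dbl(sugar_by_type, median)

mean_sugar
median_sugar
```

The results carry the names A and C, so each statistic can be matched to its cereal type without relying on print order.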