Challenge 3

challenge_3

Author

Tyler Tewksbury

Published

August 25, 2022

Code

library(tidyverse)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Challenge Overview

Read in data

Code

eggs <- read_csv("_data/eggs_tidy.csv",
                 show_col_types = FALSE)

Briefly describe the data

The dataset has already been tidied, with 120 rows and 6 columns. It shows the price of eggs by size and quantity across different months and years.

Challenge: Describe the final dimensions

Code

nrow(eggs)

[1] 120

Code

ncol(eggs)

[1] 6

Code

nrow(eggs) * (ncol(eggs)-2)

[1] 480

The dataset has 120 rows and 6 columns. Two of those columns are grouping variables that will not be pivoted, so the calculation is nrow * (ncol - 2): each of the 120 rows contributes one long row for each of the 4 remaining columns. This gives 480, the number of rows expected after pivoting the dataset longer.
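The same arithmetic can be checked on a small example before pivoting the real data. The sketch below is only an illustration: the `wide` tibble, its column names, and its prices are made up, and a single `carton` names column is used to keep the check simple.

```r
library(tidyverse)

# Hypothetical stand-in for the wide egg data: 2 id columns, 4 price columns.
# Column names and prices are assumptions made for illustration only.
wide <- tibble(
  month = c("January", "February", "March"),
  year  = c(2004, 2004, 2004),
  large_half_dozen       = c(126, 128, 131),
  large_dozen            = c(230, 226, 225),
  extra_large_half_dozen = c(132, 134, 137),
  extra_large_dozen      = c(230, 230, 230)
)

n_id <- 2  # month and year stay as identifiers

# One long row per original row per pivoted column.
expected_rows <- nrow(wide) * (ncol(wide) - n_id)

# Identifier columns, plus one names column and one values column.
expected_cols <- n_id + 2

long <- wide %>%
  pivot_longer(cols = -c(month, year),
               names_to = "carton",
               values_to = "price")

# Both comparisons should be TRUE if the arithmetic holds.
nrow(long) == expected_rows
ncol(long) == expected_cols
```

Splitting the names into two columns, as in the pivot for this challenge, adds one more column to the expected count but leaves the row count unchanged.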

Challenge: Pivot the Chosen Data

Code

long_eggs <- eggs %>%
  pivot_longer(cols = contains("large"),
               names_to = c("size", "quantity"),
               names_sep = "_",
               values_to = "price")
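One caveat on the pivot above: `names_sep = "_"` splits at every underscore, so two entries in `names_to` fit cleanly only when each pivoted column name contains exactly one underscore. If the names contain more, tidyr may warn and discard the extra pieces, and with warnings switched off in the chunk options this would go unnoticed. The sketch below shows one possible alternative, `names_pattern`, on a made-up tibble. The column names here are an assumption and may not match the real file.

```r
library(tidyverse)

# Hypothetical column names with more than one underscore (an assumption).
wide <- tibble(
  month = "January",
  year  = 2004,
  large_half_dozen       = 126,
  large_dozen            = 230,
  extra_large_half_dozen = 132,
  extra_large_dozen      = 230
)

long <- wide %>%
  pivot_longer(cols = contains("large"),
               names_to = c("size", "quantity"),
               # Group 1 captures everything up to and including "large";
               # group 2 captures whatever follows the next underscore.
               names_pattern = "(.*large)_(.*)",
               values_to = "price")

long
```

Checking `unique(long$size)` and `unique(long$quantity)` after the pivot is a quick way to confirm the names were split as intended.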

The long dataset now has new cases, each showing the price for one size and quantity in a given month and year. There are 4 identifier/category variables (two more than in the previous dataset) and 1 value per row, which makes the dataset far easier to read and work with. Visualizations and other analyses can now be done without extra reshaping steps, because each row holds exactly one value.
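As a small illustration of why one value per row helps, a grouped summary needs no reshaping once the data is long. This is a sketch on made-up rows: the `long` tibble and its prices are hypothetical and stand in for `long_eggs`.

```r
library(tidyverse)

# Hypothetical long-format rows (prices are made up for illustration).
long <- tribble(
  ~month,     ~year, ~size,   ~quantity,    ~price,
  "January",  2004,  "large", "half_dozen", 126,
  "January",  2004,  "large", "dozen",      230,
  "February", 2004,  "large", "half_dozen", 128,
  "February", 2004,  "large", "dozen",      226
)

# Average price for each size and quantity combination.
avg_price <- long %>%
  group_by(size, quantity) %>%
  summarise(mean_price = mean(price), .groups = "drop")

avg_price
```

The same `group_by()` call works for any of the identifier columns, so a summary by year or by month only requires changing the grouping variables.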