Jack Sniezek
December 1, 2022
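Today's challenge is to read in a data set, identify what needs to be done to tidy it, anticipate the shape of the pivoted data, and pivot it into tidy format using pivot_longer.
# A tibble: 120 × 6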
   month      year large_halfdozen large_dozen xlarge_halfdozen xlarge_dozen
   <chr>     <dbl>           <dbl>       <dbl>            <dbl>        <dbl>
 1 January    2004            126         230              132          230 
 2 February   2004            128.        226.             134.         230 
 3 March      2004            131         225              137          230 
 4 April      2004            131         225              137          234.
 5 May        2004            131         225              137          236 
 6 June       2004            134.        231.             137          241 
 7 July       2004            134.        234.             137          241 
 8 August     2004            134.        234.             137          241 
 9 September  2004            130.        234.             136.         241 
10 October    2004            128.        234.             136.         241 
# … with 110 more rows
    month                year      large_halfdozen  large_dozen   
 Length:120         Min.   :2004   Min.   :126.0   Min.   :225.0  
 Class :character   1st Qu.:2006   1st Qu.:129.4   1st Qu.:233.5  
 Mode  :character   Median :2008   Median :174.5   Median :267.5  
                    Mean   :2008   Mean   :155.2   Mean   :254.2  
                    3rd Qu.:2011   3rd Qu.:174.5   3rd Qu.:268.0  
                    Max.   :2013   Max.   :178.0   Max.   :277.5  
 xlarge_halfdozen  xlarge_dozen  
 Min.   :132.0    Min.   :230.0  
 1st Qu.:135.8    1st Qu.:241.5  
 Median :185.5    Median :285.5  
 Mean   :164.2    Mean   :266.8  
 3rd Qu.:185.5    3rd Qu.:285.5  
 Max.   :188.1    Max.   :290.0  
After reading in the eggs dataset, I can see that it has 120 rows, one for each month from 2004 through 2013. There are 6 columns: the month and year, plus the average price for each of 4 combinations of egg size (large, extra large) and quantity (half dozen, dozen).
While reading the data in, I also renamed the price columns so that every name follows the pattern size_quantity with a single underscore (for example, xlarge_halfdozen), which keeps size and quantity separate and will help me pivot the data.
Right now the data consists of 6 columns: 4 contain price values and 2 (month and year) identify the observation. To make the data easier to work with, I want a single column of values (price) plus one column each for the size and quantity of eggs. So, my new data frame will contain the month, year, size, quantity, and price. I also anticipate 480 rows, since all the price values will go into one column (120 months × 4 price columns).
# A tibble: 480 × 5
   month     year size   quantity  price
   <chr>    <dbl> <chr>  <chr>     <dbl>
 1 January   2004 large  halfdozen  126 
 2 January   2004 large  dozen      230 
 3 January   2004 xlarge halfdozen  132 
 4 January   2004 xlarge dozen      230 
 5 February  2004 large  halfdozen  128.
 6 February  2004 large  dozen      226.
 7 February  2004 xlarge halfdozen  134.
 8 February  2004 xlarge dozen      230 
 9 March     2004 large  halfdozen  131 
10 March     2004 large  dozen      225 
# … with 470 more rows
The result matches my prediction: the pivoted data has 480 rows and 5 columns, and all the price values now sit in a single price column.
---
title: "Challenge 3"
author: "Jack Sniezek"
description: "Tidy Data: Pivoting"
date: "12/1/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_3
  - animal_weights
  - eggs
  - australian_marriage
  - usa_households
  - sce_labor
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Challenge Overview
Today's challenge is to:
1.  read in a data set, and describe the data set using both words and any supporting information (e.g., tables)
2.  identify what needs to be done to tidy the current data
3.  anticipate the shape of pivoted data
4.  pivot the data into tidy format using `pivot_longer`
## Read in data
-   eggs_tidy.csv ⭐⭐ 
```{r}
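# Read the eggs data and shorten the column names so that each price
# column is "<size>_<quantity>" with exactly one underscore
eggs <- read_csv("_data/eggs_tidy.csv") %>%
  rename(
    xlarge_halfdozen = extra_large_half_dozen,
    xlarge_dozen = extra_large_dozen,
    large_halfdozen = large_half_dozen
  )
eggs          # print the first rows
summary(eggs) # per-column summary statistics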
```
## Briefly describe the data
After reading in the eggs dataset, I can see that it has 120 rows, one for each month from 2004 through 2013. There are 6 columns: the month and year, plus the average price for each of 4 combinations of egg size (large, extra large) and quantity (half dozen, dozen).
While reading the data in, I also renamed the price columns so that every name follows the pattern `size_quantity` with a single underscore (for example, `xlarge_halfdozen`), which keeps size and quantity separate and will help me pivot the data.
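To see why the renaming helps, the sketch below counts the underscore-separated pieces in the column names as they appear before renaming. It uses a one-row toy tibble rather than the real file; `toy_raw`, `pieces`, and `toy_long` are illustrative names, not part of the original analysis. It also shows one possible alternative that skips the renaming step by using `names_pattern` instead of `names_sep`.

```{r}
library(tidyverse)

# Toy stand-in for the raw file: column names as they are before renaming
toy_raw <- tibble(
  month = "January",
  year = 2004,
  large_half_dozen = 126,
  large_dozen = 230,
  extra_large_half_dozen = 132,
  extra_large_dozen = 230
)

# Number of "_"-separated pieces in each price column name.
# The counts differ, so names_sep = "_" cannot split every name
# cleanly into exactly two parts (size, quantity).
pieces <- toy_raw %>%
  select(contains("large")) %>%
  names() %>%
  str_split("_") %>%
  lengths()
pieces

# Alternative (an assumption, not the approach used in this post):
# capture everything up to "large" as the size and the rest as the quantity
toy_long <- toy_raw %>%
  pivot_longer(
    cols = contains("large"),
    names_to = c("size", "quantity"),
    names_pattern = "(.*large)_(.*)",
    values_to = "price"
  )
toy_long
```

With this alternative the labels keep their original spelling (`extra_large`, `half_dozen`), whereas renaming first gives the shorter `xlarge` and `halfdozen`.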
## Anticipate the End Result
Right now the data consists of 6 columns: 4 contain price values and 2 (month and year) identify the observation. To make the data easier to work with, I want a single column of values (`price`) plus one column each for the size and quantity of eggs. So, my new data frame will contain the month, year, size, quantity, and price. I also anticipate 480 rows, since all the price values will go into one column (120 months × 4 price columns).
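The same arithmetic can be written in code before pivoting. The sketch below is a minimal illustration on a two-row toy tibble with the same column names as `eggs`; `toy_eggs`, `price_cols`, `expected_rows`, and `expected_cols` are hypothetical names introduced only for this example.

```{r}
library(tidyverse)

# Toy stand-in for `eggs`: same column names, only two months of data
toy_eggs <- tibble(
  month = c("January", "March"),
  year = c(2004, 2004),
  large_halfdozen = c(126, 131),
  large_dozen = c(230, 225),
  xlarge_halfdozen = c(132, 137),
  xlarge_dozen = c(230, 230)
)

# Names of the columns that will be pivoted (the four price columns)
price_cols <- toy_eggs %>%
  select(contains("large")) %>%
  names()

# Each original row becomes one row per pivoted column
expected_rows <- nrow(toy_eggs) * length(price_cols)

# The pivoted columns are dropped; size, quantity, and price are added
expected_cols <- ncol(toy_eggs) - length(price_cols) + 3

c(rows = expected_rows, cols = expected_cols)
```

For the toy data this gives 8 rows and 5 columns. Applied to the full 120-row `eggs` data, the same calculation gives 120 × 4 = 480 rows and 6 - 4 + 3 = 5 columns.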
## Pivot the Data
```{r}
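# Stack the four price columns into one, splitting each column name
# at "_" into its size and quantity parts
eggs_longer <- eggs %>%
  pivot_longer(
    cols = contains("large"),
    names_to = c("size", "quantity"),
    names_sep = "_",
    values_to = "price"
  )
eggs_longer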
```
The result matches my prediction: `eggs_longer` has 480 rows and 5 columns, and all the price values now sit in a single `price` column.
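One way to double-check a pivot like this is a round trip: pivot the long data back to wide and compare it with the starting table. The sketch below does this on a two-row toy tibble with the same column names as `eggs`; `toy_eggs`, `toy_longer`, and `toy_wider` are illustrative names, and the check is an optional extra rather than part of the original analysis.

```{r}
library(tidyverse)

# Toy stand-in for `eggs`: same column names, only two months of data
toy_eggs <- tibble(
  month = c("January", "March"),
  year = c(2004, 2004),
  large_halfdozen = c(126, 131),
  large_dozen = c(230, 225),
  xlarge_halfdozen = c(132, 137),
  xlarge_dozen = c(230, 230)
)

# Same pivot as above, applied to the toy data
toy_longer <- toy_eggs %>%
  pivot_longer(
    cols = contains("large"),
    names_to = c("size", "quantity"),
    names_sep = "_",
    values_to = "price"
  )

# Shape check: 2 rows x 4 price columns -> 8 rows, 5 columns
dim(toy_longer)

# Round trip: rebuild the wide table from the long one
toy_wider <- toy_longer %>%
  pivot_wider(
    names_from = c(size, quantity),
    values_from = price
  )

# Compare column names and values with the starting table
identical(names(toy_wider), names(toy_eggs))
all.equal(as.data.frame(toy_eggs), as.data.frame(toy_wider))
```

If both comparisons come back `TRUE`, no values were lost or mislabeled by the pivot.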