library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
Tenzin Latoe
July 7, 2023
Today’s challenge is to pivot the data into tidy format using pivot_longer.
Read in one (or more) of the following datasets, using the correct R package and command.
# A tibble: 6 × 6
month year large_half_dozen large_dozen extra_large_half_dozen
<chr> <dbl> <dbl> <dbl> <dbl>
1 January 2004 126 230 132
2 February 2004 128. 226. 134.
3 March 2004 131 225 137
4 April 2004 131 225 137
5 May 2004 131 225 137
6 June 2004 134. 231. 137
# ℹ 1 more variable: extra_large_dozen <dbl>
The Eggs data set records the price of four egg packages: large half dozen, large dozen, extra large half dozen, and extra large dozen. Prices are reported monthly from January 2004 through December 2013. I plan to pivot the data so that its structure is tidier for analysis.
Pivoting will condense the four egg-package columns into a single column, which results in a data set with fewer columns and more rows.
[1] 120 6
[1] 120
[1] 6
The data consists of 120 rows and 6 columns.
Now we will pivot the data, and compare our pivoted data dimensions to the dimensions calculated above as a “sanity” check.
Document your work here. What will a new “case” be once you have pivoted the data? How does it meet requirements for tidy data?
# A tibble: 480 × 4
month year package price
<chr> <dbl> <chr> <dbl>
1 January 2004 large_half_dozen 126
2 January 2004 large_dozen 230
3 January 2004 extra_large_half_dozen 132
4 January 2004 extra_large_dozen 230
5 February 2004 large_half_dozen 128.
6 February 2004 large_dozen 226.
7 February 2004 extra_large_half_dozen 134.
8 February 2004 extra_large_dozen 230
9 March 2004 large_half_dozen 131
10 March 2004 large_dozen 225
# ℹ 470 more rows
Running the code above made the data set longer by gathering the four egg-package columns into a single column named “package” and collecting all of their prices in a new column named “price”. Each case is now one package type in one month and year, together with its price, so every variable has its own column and every observation its own row.
---
title: "Challenge 3"
author: "Tenzin Latoe"
description: "Tidy Data: Pivoting"
date: "07/07/2023"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_3
  - Tenzin Latoe
  - eggs
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Challenge Overview
Today's challenge is to:
1. read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
2. identify what needs to be done to tidy the current data
3. anticipate the shape of pivoted data
4. pivot the data into tidy format using `pivot_longer`
## Read in data
Read in one (or more) of the following datasets, using the correct R package and command.
- eggs_tidy.csv ⭐⭐
```{r}
Eggs <- read_csv("_data/eggs_tidy.csv")
head(Eggs)
```
### Briefly describe the data
The Eggs data set records the price of four egg packages: large half dozen, large dozen, extra large half dozen, and extra large dozen. Prices are reported monthly from January 2004 through December 2013.
I plan to pivot the data so that its structure is tidier for analysis.
## Anticipate the End Result
Pivoting will condense the four egg-package columns into a single column, which results in a data set with fewer columns and more rows.
### Challenge: Describe the final dimensions
```{r}
# Dimensions (rows, columns)
dim(Eggs)
# Existing rows/cases
nrow(Eggs)
# Existing columns/variables
ncol(Eggs)
```
The data consists of 120 rows and 6 columns.
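The shape of the pivoted data can be anticipated from these dimensions before any pivoting is done. The chunk below is a minimal sketch, assuming `month` and `year` stay as identifier columns and the other four price columns are pivoted. The dimensions are hard-coded from what `dim(Eggs)` reports (120 rows, 6 columns), and the names `n_id_cols`, `expected_rows`, and `expected_cols` are illustrative, not part of the original analysis.

```{r}
# Dimensions reported by dim(Eggs): 120 rows and 6 columns
n_rows <- 120
n_cols <- 6

# Assumption: month and year are kept as identifier columns
n_id_cols <- 2

# Each original row yields one new row per pivoted column
expected_rows <- n_rows * (n_cols - n_id_cols)

# Identifier columns plus one names column and one values column
expected_cols <- n_id_cols + 2

c(expected_rows, expected_cols)
```

Under these assumptions the pivoted data should have 480 rows and 4 columns.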
## Pivot the Data
Now we will pivot the data, and compare our pivoted data dimensions to the dimensions calculated above as a "sanity" check.
### Challenge: Pivot the Chosen Data
Document your work here. What will a new "case" be once you have pivoted the data? How does it meet requirements for tidy data?
```{r}
# Pivot every column except month and year into package/price pairs
longer <- pivot_longer(Eggs,
                       cols = -c(month, year),
                       names_to = "package",
                       values_to = "price")
longer
```
Running the code above made the data set longer by gathering the four egg-package columns into a single column named "package" and collecting all of their prices in a new column named "price". Each case is now one package type in one month and year, together with its price, so every variable has its own column and every observation its own row.
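The "sanity" check on dimensions can also be done in code. The chunk below is a rough sketch on a two-row stand-in for `Eggs`, so that it runs without the CSV file. The object names `eggs_toy`, `longer_toy`, and `expected` are hypothetical, and the prices in the toy table are illustrative values, not the real data.

```{r}
library(tidyverse)

# Toy stand-in with the same column layout as Eggs (illustrative values)
eggs_toy <- tibble(
  month = c("January", "February"),
  year = c(2004, 2004),
  large_half_dozen = c(126, 128),
  large_dozen = c(230, 226),
  extra_large_half_dozen = c(132, 134),
  extra_large_dozen = c(230, 230)
)

# Same pivot as above: everything except month and year goes long
longer_toy <- pivot_longer(eggs_toy,
                           cols = -c(month, year),
                           names_to = "package",
                           values_to = "price")

# Anticipated shape: rows * pivoted columns, and 2 id columns + 2 new columns
expected <- c(nrow(eggs_toy) * (ncol(eggs_toy) - 2), 2 + 2)

# Stops with an error if the actual shape differs from the anticipated one
stopifnot(all(dim(longer_toy) == expected))
dim(longer_toy)
```

Replacing `eggs_toy` with `Eggs` and `longer_toy` with `longer` would apply the same check to the full data set.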