library(tidyverse)
library(lubridate)
knitr::opts_chunk$set(echo = TRUE)
Daniel Hannon
March 29, 2023
# A tibble: 600 × 4
   Product  Year Month     Price_Dollar
   <chr>   <dbl> <chr>            <dbl>
 1 Whole    2013 January           2.38
 2 Whole    2013 February          2.38
 3 Whole    2013 March             2.38
 4 Whole    2013 April             2.38
 5 Whole    2013 May               2.38
 6 Whole    2013 June              2.38
 7 Whole    2013 July              2.38
 8 Whole    2013 August            2.38
 9 Whole    2013 September         2.38
10 Whole    2013 October           2.38
# … with 590 more rows
Data Frame Summary
poultry_data
Dimensions: 600 x 4
Duplicates: 0
--------------------------------------------------------------------------------------------------------------
No   Variable       Stats / Values             Freqs (% of Valid)   Graph                 Valid      Missing
---- -------------- -------------------------- -------------------- --------------------- ---------- ---------
1    Product        1. B/S Breast              120 (20.0%)          IIII                  600        0
     [character]    2. Bone-in Breast          120 (20.0%)          IIII                  (100.0%)   (0.0%)
                    3. Thighs                  120 (20.0%)          IIII
                    4. Whole                   120 (20.0%)          IIII
                    5. Whole Legs              120 (20.0%)          IIII
2    Year           Mean (sd) : 2008.5 (2.9)   2004 : 60 (10.0%)    II                    600        0
     [numeric]      min < med < max:           2005 : 60 (10.0%)    II                    (100.0%)   (0.0%)
                    2004 < 2008.5 < 2013       2006 : 60 (10.0%)    II
                    IQR (CV) : 5 (0)           2007 : 60 (10.0%)    II
                                               2008 : 60 (10.0%)    II
                                               2009 : 60 (10.0%)    II
                                               2010 : 60 (10.0%)    II
                                               2011 : 60 (10.0%)    II
                                               2012 : 60 (10.0%)    II
                                               2013 : 60 (10.0%)    II
3    Month          1. April                   50 ( 8.3%)           I                     600        0
     [character]    2. August                  50 ( 8.3%)           I                     (100.0%)   (0.0%)
                    3. December                50 ( 8.3%)           I
                    4. February                50 ( 8.3%)           I
                    5. January                 50 ( 8.3%)           I
                    6. July                    50 ( 8.3%)           I
                    7. June                    50 ( 8.3%)           I
                    8. March                   50 ( 8.3%)           I
                    9. May                     50 ( 8.3%)           I
                    10. November               50 ( 8.3%)           I
                    [ 2 others ]               100 (16.7%)          III
4    Price_Dollar   Mean (sd) : 3.4 (1.7)      32 distinct values   [histogram]           593        7
     [numeric]      min < med < max:                                                      (98.8%)    (1.2%)
                    1.9 < 2.4 < 7
                    IQR (CV) : 1.8 (0.5)
--------------------------------------------------------------------------------------------------------------
# A tibble: 7 × 4
  Product         Year Month    Price_Dollar
  <chr>          <dbl> <chr>           <dbl>
1 Bone-in Breast  2004 January            NA
2 Bone-in Breast  2004 February           NA
3 Bone-in Breast  2004 March              NA
4 Bone-in Breast  2004 April              NA
5 Bone-in Breast  2004 May                NA
6 Bone-in Breast  2004 June               NA
7 Thighs          2004 January            NA
This data set records the monthly price, in dollars, of five poultry cuts (Boneless Skinless Breast, Bone-in Breast, Thighs, Whole Legs, and Whole) from January 2004 to December 2013. Seven prices are missing, all from 2004: Thighs for January, and Bone-in Breast for January through June.
The data is already in a tidy format: each row is a single observation of the price of one cut of meat in a specific month and year.
Right now the data has separate month and year columns, so we need to add a date column in order to sort the observations chronologically.
# A tibble: 6 × 5
  Product  Year Month    Price_Dollar Date
  <chr>   <dbl> <chr>           <dbl> <date>
1 Whole    2013 January          2.38 2013-01-01
2 Whole    2013 February         2.38 2013-02-01
3 Whole    2013 March            2.38 2013-03-01
4 Whole    2013 April            2.38 2013-04-01
5 Whole    2013 May              2.38 2013-05-01
6 Whole    2013 June             2.38 2013-06-01
Now we have a Date column, although every date has its day set to the first of the month. We don't know the actual day the data was collected, but because the convention is consistent throughout the data set, it won't affect the chronological ordering.
---
title: "Pivoting Poultry"
author: "Daniel Hannon"
description: "Mutated and described the Poultry dataset"
date: "03/29/2023"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_4
  - Daniel Hannon
  - poultry_tidy
---
```{r}
#| label: setup
#| warning: false
library(tidyverse)
library(lubridate)
knitr::opts_chunk$set(echo = TRUE)
```
## Read in the Data
```{r}
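# Read the poultry price workbook.
poultry_data <- readxl::read_excel("_data/poultry_tidy.xlsx")
poultry_data
# Summarise every column: type, distribution, and missing values.
summarytools::dfSummary(poultry_data)
# Isolate the rows that have no recorded price.
missing_data <- filter(poultry_data, is.na(Price_Dollar))
missing_data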
```
## Describe the Data
This data set records the monthly price, in dollars, of five poultry cuts (Boneless Skinless Breast, Bone-in Breast, Thighs, Whole Legs, and Whole) from January 2004 to December 2013. Seven prices are missing, all from 2004: Thighs for January, and Bone-in Breast for January through June.
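One way to confirm where the gaps fall is to count the missing prices per cut and year. The sketch below runs on a small made-up tibble, `toy_prices`, that only mimics the column layout of `poultry_data`; its rows and prices are illustrative assumptions, not values from the workbook.

```{r}
library(dplyr)

# Hypothetical miniature of poultry_data: same columns, made-up rows and prices.
toy_prices <- tibble::tibble(
  Product      = c("Thighs", "Thighs", "Whole", "Whole"),
  Year         = c(2004, 2004, 2004, 2004),
  Month        = c("January", "February", "January", "February"),
  Price_Dollar = c(NA, 2.0, 1.9, 1.9)
)

# Keep only rows without a price, then count them per cut and year.
missing_by_cut <- toy_prices %>%
  filter(is.na(Price_Dollar)) %>%
  count(Product, Year, name = "n_missing")

missing_by_cut
```

Replacing `toy_prices` with `poultry_data` should give one row per cut and year that has gaps, which is a more compact view than listing every missing row.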
## Tidy the Data
The data is already in a tidy format: each row is a single observation of the price of one cut of meat in a specific month and year.
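A quick way to check the one-row-per-observation claim is to count rows per Product, Year, and Month and look for any combination that appears more than once. This is a minimal sketch on a hypothetical `toy_prices` tibble with made-up prices; the same pipeline should apply to `poultry_data` unchanged.

```{r}
library(dplyr)

# Hypothetical miniature of poultry_data: same columns, made-up rows and prices.
toy_prices <- tibble::tibble(
  Product      = c("Thighs", "Thighs", "Whole", "Whole"),
  Year         = c(2004, 2004, 2004, 2004),
  Month        = c("January", "February", "January", "February"),
  Price_Dollar = c(2.0, 2.0, 1.9, 1.9)
)

# In a tidy table every Product/Year/Month key identifies exactly one row,
# so any key counted more than once signals a duplicate observation.
duplicated_keys <- toy_prices %>%
  count(Product, Year, Month) %>%
  filter(n > 1)

# TRUE when no key is repeated.
nrow(duplicated_keys) == 0
```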
## Mutate the Date
Right now the data has separate month and year columns, so we need to add a date column in order to sort the observations chronologically.
```{r}
# Combine Year and Month into a Date; ym() sets the day to the first of the month.
poultry_data <- poultry_data %>%
  mutate(Date = ym(paste(Year, Month)))
head(poultry_data)
```
Now we have a Date column, although every date has its day set to the first of the month. We don't know the actual day the data was collected, but because the convention is consistent throughout the data set, it won't affect the chronological ordering.
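To see why the Date column matters for ordering, compare a sort on the character Month column with a sort on Date. The sketch below uses a hypothetical three-row tibble, `toy_prices`, with made-up prices, and assumes an English locale so that `ym()` can parse the month names.

```{r}
library(dplyr)
library(lubridate)

# Hypothetical rows in the same layout as poultry_data; prices are made up.
toy_prices <- tibble::tibble(
  Product      = "Whole",
  Year         = c(2013, 2013, 2012),
  Month        = c("February", "April", "December"),
  Price_Dollar = c(2.0, 2.0, 2.0)
)

# Same transformation as above: paste Year and Month, then parse with ym().
toy_dated <- toy_prices %>%
  mutate(Date = ym(paste(Year, Month)))

# Sorting on the character Month column is alphabetical, not chronological.
by_month <- arrange(toy_dated, Month)

# Sorting on Date puts December 2012 before February and April 2013.
by_date <- arrange(toy_dated, Date)

by_date
```

Because every parsed date falls on the first of its month, ties within a month cannot occur for a single cut, so `arrange(Date)` gives an unambiguous chronological order per product.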