library(tidyverse)
library(summarytools)
library(readr)
library(readxl)
library(lubridate)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
Cam Needels
March 27, 2023
# A tibble: 600 × 4
   Product  Year Month     Price_Dollar
   <chr>   <dbl> <chr>            <dbl>
 1 Whole    2013 January           2.38
 2 Whole    2013 February          2.38
 3 Whole    2013 March             2.38
 4 Whole    2013 April             2.38
 5 Whole    2013 May               2.38
 6 Whole    2013 June              2.38
 7 Whole    2013 July              2.38
 8 Whole    2013 August            2.38
 9 Whole    2013 September         2.38
10 Whole    2013 October           2.38
# … with 590 more rows
[1] "Whole" "B/S Breast" "Bone-in Breast" "Whole Legs"
[5] "Thighs"
[1] 2013 2012 2011 2010 2009 2008 2007 2006 2005 2004
The data set records the price in dollars of five chicken products (Whole, B/S Breast, Bone-in Breast, Whole Legs, and Thighs) for each month from 2004 to 2013. It has 600 rows and four columns: Product, Year, Month, and Price_Dollar.
The data is already tidy: each row holds the price of one product in one month, so I don't need to reshape it. However, I do need to check which values the Product column contains.
Product and Month are character columns, not numeric, so we recode them as numbers to make the data easier to analyze. Month is recoded to its month number (1 to 12), which together with Year can be used to build a date column. Product is recoded to a numeric ID for each chicken type. I then run dfSummary() on the result; the output follows the code.
# Convert month names to month numbers (1-12)
eggs_mutate <- eggs_data %>%
  mutate(Month_num = recode(Month,
                            "January" = 1, "February" = 2, "March" = 3,
                            "April" = 4, "May" = 5, "June" = 6,
                            "July" = 7, "August" = 8, "September" = 9,
                            "October" = 10, "November" = 11, "December" = 12))
# Assign a numeric ID to each chicken product
eggs_mutate <- eggs_mutate %>%
  mutate(Chicken_ID = recode(Product,
                             "B/S Breast" = 1,
                             "Bone-in Breast" = 2,
                             "Thighs" = 3,
                             "Whole" = 4,
                             "Whole Legs" = 5))
eggs_mutate
# A tibble: 600 × 6
   Product  Year Month     Price_Dollar Month_num Chicken_ID
   <chr>   <dbl> <chr>            <dbl>     <dbl>      <dbl>
 1 Whole    2013 January           2.38         1          4
 2 Whole    2013 February          2.38         2          4
 3 Whole    2013 March             2.38         3          4
 4 Whole    2013 April             2.38         4          4
 5 Whole    2013 May               2.38         5          4
 6 Whole    2013 June              2.38         6          4
 7 Whole    2013 July              2.38         7          4
 8 Whole    2013 August            2.38         8          4
 9 Whole    2013 September         2.38         9          4
10 Whole    2013 October           2.38        10          4
# … with 590 more rows
Data Frame Summary
eggs_mutate
Dimensions: 600 x 6
Duplicates: 0

----------------------------------------------------------------------------------------------------
No   Variable       Stats / Values             Freqs (% of Valid)   Graph       Valid      Missing
---- -------------- -------------------------- -------------------- ----------- ---------- ---------
1    Product        1. B/S Breast              120 (20.0%)          IIII        600        0
     [character]    2. Bone-in Breast          120 (20.0%)          IIII        (100.0%)   (0.0%)
                    3. Thighs                  120 (20.0%)          IIII
                    4. Whole                   120 (20.0%)          IIII
                    5. Whole Legs              120 (20.0%)          IIII

2    Year           Mean (sd) : 2008.5 (2.9)   2004 : 60 (10.0%)    II          600        0
     [numeric]      min < med < max:           2005 : 60 (10.0%)    II          (100.0%)   (0.0%)
                    2004 < 2008.5 < 2013       2006 : 60 (10.0%)    II
                    IQR (CV) : 5 (0)           2007 : 60 (10.0%)    II
                                               2008 : 60 (10.0%)    II
                                               2009 : 60 (10.0%)    II
                                               2010 : 60 (10.0%)    II
                                               2011 : 60 (10.0%)    II
                                               2012 : 60 (10.0%)    II
                                               2013 : 60 (10.0%)    II

3    Month          1. April                   50 ( 8.3%)           I           600        0
     [character]    2. August                  50 ( 8.3%)           I           (100.0%)   (0.0%)
                    3. December                50 ( 8.3%)           I
                    4. February                50 ( 8.3%)           I
                    5. January                 50 ( 8.3%)           I
                    6. July                    50 ( 8.3%)           I
                    7. June                    50 ( 8.3%)           I
                    8. March                   50 ( 8.3%)           I
                    9. May                     50 ( 8.3%)           I
                    10. November               50 ( 8.3%)           I
                    [ 2 others ]               100 (16.7%)          III

4    Price_Dollar   Mean (sd) : 3.4 (1.7)      32 distinct values   [histogram] 593        7
     [numeric]      min < med < max:                                            (98.8%)    (1.2%)
                    1.9 < 2.4 < 7
                    IQR (CV) : 1.8 (0.5)

5    Month_num      Mean (sd) : 6.5 (3.5)      12 distinct values   [histogram] 600        0
     [numeric]      min < med < max:                                            (100.0%)   (0.0%)
                    1 < 6.5 < 12
                    IQR (CV) : 5.5 (0.5)

6    Chicken_ID     Mean (sd) : 3 (1.4)        1 : 120 (20.0%)      IIII        600        0
     [numeric]      min < med < max:           2 : 120 (20.0%)      IIII        (100.0%)   (0.0%)
                    1 < 3 < 5                  3 : 120 (20.0%)      IIII
                    IQR (CV) : 2 (0.5)         4 : 120 (20.0%)      IIII
                                               5 : 120 (20.0%)      IIII
----------------------------------------------------------------------------------------------------
---
title: "Challenge 4 submission"
author: "Cam Needels"
description: "More data wrangling: pivoting"
date: "03/27/2023"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_4
  - eggs
  - poultry
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
library(summarytools)
library(readr)
library(readxl)
library(lubridate)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
```{r}
# Read in the data set and display it
eggs_data <- read_csv("B:/Needels/Documents/DACCS 601/DACSS_601_New/posts/_data/poultry_tidy.csv")
eggs_data
```
```{r}
# Find the unique values of Product
categories <- unique(eggs_data$Product)
categories
# Find which years the data set covers
# (the original `<--` parsed as `<- -` and negated every year)
yearcat <- unique(eggs_data$Year)
yearcat
```
### Briefly describe the data
The data set records the price in dollars of five chicken products (Whole, B/S Breast, Bone-in Breast, Whole Legs, and Thighs) for each month from 2004 to 2013. It has 600 rows and four columns: Product, Year, Month, and Price_Dollar.
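
A description like this can be checked with a short summary. The sketch below is only an illustration: `toy` is a hypothetical four-row stand-in for `eggs_data`, and its prices are invented, not taken from the real file.

```r
library(dplyr)

# Hypothetical stand-in for eggs_data; the prices are invented for illustration
toy <- tibble(
  Product      = c("Whole", "Whole", "Thighs", "Thighs"),
  Year         = c(2004, 2013, 2004, 2013),
  Month        = c("January", "January", "January", "January"),
  Price_Dollar = c(2.0, 2.4, 1.9, 2.2)
)

# One-row overview: the span of years, the number of products, and the row count
overview <- toy %>%
  summarise(
    first_year = min(Year),
    last_year  = max(Year),
    n_products = n_distinct(Product),
    n_rows     = n()
  )
overview
```

Running the same `summarise()` call on `eggs_data` should report the year range, product count, and row count of the real data.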
## Tidy Data (as needed)
The data is already tidy: each row holds the price of one product in one month, so I don't need to reshape it. However, I do need to check which values the Product column contains.
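
One way to back up the "already tidy" claim is to confirm that no Product/Year/Month combination appears more than once. This is a minimal sketch on a hypothetical `toy` tibble with invented prices; the same pipeline could be pointed at `eggs_data`.

```r
library(dplyr)

# Hypothetical stand-in for eggs_data; the prices are invented for illustration
toy <- tibble(
  Product      = c("Whole", "Whole", "Thighs", "Thighs"),
  Year         = c(2013, 2013, 2013, 2013),
  Month        = c("January", "February", "January", "February"),
  Price_Dollar = c(2.38, 2.38, 2.00, 2.10)
)

# Count rows per Product/Year/Month combination and keep any that repeat;
# when every observation has its own row, nothing is left after the filter
dupes <- toy %>%
  count(Product, Year, Month) %>%
  filter(n > 1)

nrow(dupes)
```

An empty `dupes` table means each product has exactly one price per month, which is the structure described above.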
## Identify variables that need to be mutated
Product and Month are character columns, not numeric, so we recode them as numbers to make the data easier to analyze. Month is recoded to its month number (1 to 12), which together with Year can be used to build a date column. Product is recoded to a numeric ID for each chicken type. I then run `dfSummary()` on the result; the output follows the code.
```{r}
# Convert month names to month numbers (1-12)
eggs_mutate <- eggs_data %>%
  mutate(Month_num = recode(Month,
                            "January" = 1, "February" = 2, "March" = 3,
                            "April" = 4, "May" = 5, "June" = 6,
                            "July" = 7, "August" = 8, "September" = 9,
                            "October" = 10, "November" = 11, "December" = 12))
# Assign a numeric ID to each chicken product
eggs_mutate <- eggs_mutate %>%
  mutate(Chicken_ID = recode(Product,
                             "B/S Breast" = 1,
                             "Bone-in Breast" = 2,
                             "Thighs" = 3,
                             "Whole" = 4,
                             "Whole Legs" = 5))
eggs_mutate
```
```
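
The chunk above stops at a month number and does not build the date column mentioned earlier. The sketch below shows one possible way to do that, assuming a day of 1 is acceptable for monthly data. `toy` and the `Date` column name are hypothetical; `match()` against base R's `month.name` is used here as a shorter alternative to the long `recode()` call, not as the method used in this post.

```r
library(dplyr)
library(lubridate)

# Hypothetical two-row stand-in for the Year and Month columns of eggs_data
toy <- tibble(
  Year  = c(2004, 2013),
  Month = c("January", "October")
)

toy_dated <- toy %>%
  mutate(
    # month.name is a base R constant holding the twelve English month names,
    # so match() returns each month's position in the calendar
    Month_num = match(Month, month.name),
    # make_date() builds a Date from numeric year, month, and day;
    # the day is fixed to 1 because the data are monthly (an assumption)
    Date = make_date(year = Year, month = Month_num, day = 1)
  )
toy_dated
```

A single `Date` column sorts chronologically, which is convenient for plotting prices over time.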
```{r}
dfSummary(eggs_mutate)
```
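
The summary reports 7 missing values in `Price_Dollar`. A possible next step is to find which rows they are. The sketch below uses a hypothetical `toy` tibble whose values, including which product has the missing prices, are invented; it says nothing about where the gaps fall in the real data.

```r
library(dplyr)

# Hypothetical stand-in for eggs_mutate; values and NA positions are invented
toy <- tibble(
  Product      = c("Whole", "Whole", "Thighs", "Thighs"),
  Year         = c(2004, 2004, 2004, 2004),
  Month        = c("January", "February", "January", "February"),
  Price_Dollar = c(2.0, 2.1, NA, NA)
)

# Rows where the price is missing
missing_rows <- toy %>%
  filter(is.na(Price_Dollar))

# Number of missing prices per product
missing_by_product <- toy %>%
  group_by(Product) %>%
  summarise(n_missing = sum(is.na(Price_Dollar)), .groups = "drop")

missing_rows
missing_by_product
```

Applied to `eggs_mutate`, the first table would list the 7 rows without a price and the second would show how they are spread across products.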