---
title: "Challenge 4 Instructions"
author: "Noah Dixon"
description: "More data wrangling: pivoting"
date: "6/13/2023"
format:
  html:
    df-print: paged
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_4
  - poultry-tidy
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
library(readxl)
library(lubridate)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Read in Data
Using the `read_excel()` function from the readxl package, we can read the data into a data frame.
```{r}
#| label: read-data
poultry_data_raw <- read_excel("_data/poultry_tidy.xlsx")
poultry_data_raw
```
## Describe the Data
The data set records the monthly price, in US dollars, of several poultry products (whole chicken, breast, bone-in breast, thigh, etc.) from 2004 to 2013. Each case gives the price of one product in one month of that period.
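A quick way to check a description like this is to count the rows per product and look at the range of years. The sketch below runs on a small hypothetical tibble, `toy`, that borrows the column names `Product`, `Year`, `Month`, and `Price_Dollar` from the real data; the product labels and prices are made up for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for poultry_data_raw; values are illustrative only
toy <- tribble(
  ~Product, ~Year, ~Month,     ~Price_Dollar,
  "Whole",  2004,  "January",  2.00,
  "Whole",  2004,  "February", 2.10,
  "Thighs", 2004,  "January",  1.90,
  "Thighs", 2005,  "January",  2.05
)

# Number of monthly observations recorded for each product
product_counts <- toy %>% count(Product)

# First and last year covered by the data
year_range <- range(toy$Year)

product_counts
year_range
```

Running the same two summaries on `poultry_data_raw` should show one row per product and the first and last year in the file.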
## Tidy Data
This data is already tidy. Each column represents a variable and each row represents a case.
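One way to back up the claim that each row is a single case is to confirm that no product, year, and month combination appears more than once. This is a minimal sketch on a hypothetical tibble, `toy`, with made-up values; the same pipeline could be applied to `poultry_data_raw`.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for poultry_data_raw; values are illustrative only
toy <- tribble(
  ~Product, ~Year, ~Month,     ~Price_Dollar,
  "Whole",  2004,  "January",  2.00,
  "Whole",  2004,  "February", 2.10,
  "Thighs", 2004,  "January",  1.90,
  "Thighs", 2004,  "February", 1.95
)

# A combination that occurs more than once would mean a case is
# duplicated or split across several rows
duplicates <- toy %>%
  count(Product, Year, Month) %>%
  filter(n > 1)

# Zero rows here means every row is a distinct product-month case
nrow(duplicates)
```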
## Identify Variables that Need Mutation
Two sets of variables require mutation in this data set. The Month and Year columns can be combined and coded as dates, and the Price_Dollar column can be rounded to two decimal places because it represents a price in USD.
## Mutate Variables
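First, we will combine the Month and Year columns into a single Date column of date objects. Month names are converted to month numbers with `match(Month, month.name)`, each date is set to the first day of its month, and the resulting string is parsed with `dmy()` from the lubridate package. We will also drop the Month and Year columns, since the Date column now carries the same information.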
```{r}
#| label: mutate-date
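poultry_data <- poultry_data_raw %>%
  # Build a "day-month-year" string, using the first day of each month
  mutate(Date = paste("1", match(Month, month.name), Year, sep = "-")) %>%
  # Parse the string into a Date object
  mutate(Date = dmy(Date)) %>%
  # Keep only the columns needed from here on
  select(Product, Date, Price_Dollar)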
poultry_data
```
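As an alternative sketch, lubridate's `make_date()` can build the date from its numeric parts directly, which skips the intermediate string. The example assumes full English month names, as in `month.name`, and uses a hypothetical tibble, `toy`, with made-up values.

```r
library(dplyr)
library(tibble)
library(lubridate)

# Hypothetical stand-in for poultry_data_raw; values are illustrative only
toy <- tribble(
  ~Product, ~Year, ~Month,     ~Price_Dollar,
  "Whole",  2004,  "January",  2.00,
  "Whole",  2004,  "February", 2.10,
  "Thighs", 2005,  "March",    1.90
)

toy_dated <- toy %>%
  # match() turns a month name into its position (1-12) in month.name;
  # the day is fixed to the first of the month
  mutate(Date = make_date(Year, match(Month, month.name), 1)) %>%
  select(Product, Date, Price_Dollar)

# Month names that do not match month.name (abbreviations, typos)
# would produce NA dates, so it is worth counting them
sum(is.na(toy_dated$Date))
```

Both approaches yield a Date column; the choice is mostly a matter of taste.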
The data frame now contains a Date column of date objects, so the time variables are coded as dates. Next, we will round the Price_Dollar column to two decimal places.
```{r}
#| label: mutate-price
poultry_data <- poultry_data %>%
  mutate(Price_Dollar = round(Price_Dollar, 2))
poultry_data
```
We can see that the Price_Dollar column is rounded to two decimal places. The data has now been properly mutated and is more usable for analysis and visualization.
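The rounding can also be checked programmatically instead of by eye. The sketch below uses a hypothetical tibble, `toy`, with made-up prices. It includes one missing value only to show how `NA` is handled, since the source does not say whether the real data has missing prices.

```r
library(dplyr)
library(tibble)

# Hypothetical prices; values are illustrative only
toy <- tibble(
  Product = c("Whole", "Whole", "Thighs"),
  Price_Dollar = c(2.0137, 2.3849, NA)
)

toy_rounded <- toy %>%
  mutate(Price_Dollar = round(Price_Dollar, 2))

# TRUE when rounding again changes nothing, i.e. every non-missing
# price already has at most two decimal places
already_rounded <- all(
  toy_rounded$Price_Dollar == round(toy_rounded$Price_Dollar, 2),
  na.rm = TRUE
)

# round() leaves missing values as NA, so count them separately
missing_prices <- sum(is.na(toy_rounded$Price_Dollar))
```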