Challenge 4 Instructions

challenge_4
poultry-tidy
More data wrangling: pivoting
Author

Noah Dixon

Published

June 13, 2023

library(tidyverse)
library(readxl)
library(lubridate)

knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)

Read in Data

The read_excel function from the readxl package reads the spreadsheet into a data frame.

poultry_data_raw <- read_excel("_data/poultry_tidy.xlsx")
poultry_data_raw

Describe the Data

The data show the prices of different poultry products (whole chicken, breast, bone-in breast, thigh, etc.) over the years 2004-2013. Each case gives the price of one product in one month of that period.

Tidy Data

This data is already tidy. Each column represents a variable and each row represents a case.
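One way to back up this claim is to confirm that no product-month combination appears more than once. The sketch below is only an illustration: it uses a small hypothetical tibble (toy, with made-up prices) that assumes the same column names as the spreadsheet (Product, Year, Month, Price_Dollar) rather than the real file.

```r
library(dplyr)

# Hypothetical toy rows that mimic the assumed column layout;
# the prices are invented for illustration only
toy <- tibble::tribble(
  ~Product, ~Year, ~Month,     ~Price_Dollar,
  "Whole",  2013,  "January",  2.385,
  "Whole",  2013,  "February", 2.385,
  "Thighs", 2013,  "January",  2.163,
  "Thighs", 2013,  "February", 2.163
)

# Count rows per Product/Year/Month; any n > 1 would be a duplicated case
duplicates <- toy %>%
  count(Product, Year, Month) %>%
  filter(n > 1)

# Zero rows here means each combination appears exactly once
nrow(duplicates)
```

Running the same count on poultry_data_raw and getting zero rows would support the statement that each row is a single case.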

Identify Variables that Need Mutation

Several variables in this data set need mutation. The Month and Year columns can be combined and coded as dates, and the Price_Dollar column can be rounded to two decimal places because it represents a price in USD.
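A quick look at the column classes can help decide which variables need recoding. This is a minimal sketch on a hypothetical one-row tibble (toy) that assumes the spreadsheet's column names; the classes in the real file may differ.

```r
# Hypothetical row with the assumed column names (values are made up)
toy <- tibble::tribble(
  ~Product, ~Year, ~Month,    ~Price_Dollar,
  "Whole",  2013,  "January", 2.385
)

# First class of each column: character months and numeric years
# are the ones we would want to combine into a Date
col_classes <- vapply(toy, function(col) class(col)[1], character(1))
col_classes
```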

Mutate Variables

First, we create a new column that combines the Month and Year columns into a single column of date objects, using the lubridate package. Because the data are monthly, each date is set to the first day of its month. We then drop the Month and Year columns, since the new Date column carries the same information.

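poultry_data <- poultry_data_raw %>%
  # Build a "day-month-year" string, using the 1st of each month as the day
  mutate(Date = paste("1", match(Month, month.name), Year, sep = "-")) %>%
  # Parse the string into a Date object
  mutate(Date = dmy(Date)) %>%
  # Keep only the columns needed going forward
  select(Product, Date, Price_Dollar)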
poultry_data
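As an aside, the same result can be reached without building an intermediate string. The sketch below is an alternative, not the approach used in this post: it passes the year and month number straight to lubridate's make_date. It runs on a hypothetical tibble (toy, with made-up prices) that assumes the same column names.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy rows with the assumed column names (prices are made up)
toy <- tibble::tribble(
  ~Product, ~Year, ~Month,     ~Price_Dollar,
  "Whole",  2004,  "January",  1.9,
  "Whole",  2004,  "December", 2.0
)

toy_dates <- toy %>%
  # match() turns full English month names into 1-12;
  # names not found in month.name become NA
  mutate(Date = make_date(year = Year,
                          month = match(Month, month.name),
                          day = 1)) %>%
  select(Product, Date, Price_Dollar)

# Number of rows whose month name failed to match
sum(is.na(toy_dates$Date))
```

Either way, counting NA dates afterwards is a cheap check that every month name was spelled out in full.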

The data frame now has a Date column holding date objects, so the time variables are coded as dates. Next, we round the Price_Dollar column to two decimal places.

poultry_data <- poultry_data %>%
  mutate(Price_Dollar = round(Price_Dollar, 2))
poultry_data

The Price_Dollar column is now rounded to two decimal places. With dates and prices recoded, the data are easier to use for analysis and visualization.
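To confirm the rounding programmatically rather than by eye, we can test whether every price is a whole number of cents. This is a minimal sketch on a hypothetical vector (prices); the values are made up and are not taken from the poultry data.

```r
# Hypothetical prices, not values from the poultry data
prices <- c(2.385, 1.9755, 7.0375)
rounded <- round(prices, 2)

# TRUE when every value is, within floating-point tolerance,
# a whole number of cents
is_whole_cents <- all(abs(rounded * 100 - round(rounded * 100)) < 1e-8)
is_whole_cents
```

Note that round() changes the stored values. If the extra precision might matter later, an alternative is to keep the original column and format it to two decimals only when printing or plotting.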