More data wrangling: pivoting
Author

Danny Holt

Published

June 12, 2023

Code
library(tidyverse)
library(readxl)
library(lubridate, warn.conflicts = FALSE)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Challenge Overview

Today’s challenge is to:

  1. identify variables that need to be mutated
  2. mutate variables and sanity check all mutations

Read in data

First, we’ll read in poultry_tidy.xlsx.

Code
  poultry <- read_excel("_data/poultry_tidy.xlsx")
  poultry
ABCDEFGHIJ0123456789
Product
<chr>
Year
<dbl>
Month
<chr>
Price_Dollar
<dbl>
Whole2013January2.38500
Whole2013February2.38500
Whole2013March2.38500
Whole2013April2.38500
Whole2013May2.38500
Whole2013June2.38500
Whole2013July2.38500
Whole2013August2.38500
Whole2013September2.38500
Whole2013October2.38500

Briefly describe the data

The data tells prices for different poultry cuts in months from 2004 to 2013. The prices appear to be per unit of weight (unit not specified).

Tidy Data (as needed)

The data is already tidy. Each variable forms a column and each observation forms a row. All values have their own cells.

Identify variables that need to be mutated

We can use mutate() and lubridate to create one properly coded date variable.

First, let’s combine Month and Year into one Date variable

Code
poultry <- poultry %>%
  # combine month and year
  mutate(date_prelim=str_c(`Month`," ",`Year`))

Next, we’ll use lubridate to create a properly coded Date variable.

Code
# create month year stamp
my_st<-stamp("02-2019", orders = "my")

poultry <- poultry %>%
  # recode string to date
  mutate(Date=my_st(my(date_prelim))) %>%
  # remove unnecessary variables
  select(Product,Date,Price_Dollar)
poultry
ABCDEFGHIJ0123456789
Product
<chr>
Date
<chr>
Price_Dollar
<dbl>
Whole01-20132.38500
Whole02-20132.38500
Whole03-20132.38500
Whole04-20132.38500
Whole05-20132.38500
Whole06-20132.38500
Whole07-20132.38500
Whole08-20132.38500
Whole09-20132.38500
Whole10-20132.38500