Challenge 3 Solution

challenge_3
animal_weights
eggs
australian_marriage
usa_households
sce_labor
Susannah Reed Poland
Tidy Data: Pivoting
Author

Susannah Reed Poland

Published

June 8, 2023

library(tidyverse)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Challenge Overview

Today’s challenge is to:

  1. read in a data set and describe it using both words and any supporting information (e.g., tables)
  2. identify what needs to be done to tidy the current data
  3. anticipate the shape of pivoted data
  4. pivot the data into tidy format using pivot_longer

Read in data

Read in the following dataset using the correct R package and command.

  • animal_weights.csv ⭐

I have renamed this dataset “animalweight” for reference.

library(tidyverse)
animalweight<-read_csv("_data/animal_weight.csv")
animalweight

Briefly describe the data

From inspection, the “animalweight” dataframe contains the average weight of 13 kinds of animals in 9 global regions defined by the IPCC (Intergovernmental Panel on Climate Change). The animals appear to be livestock. Three of the animals are subdivided into two groups each, so there are 16 animal-type columns (plus one column for the region) across 9 rows. Each value is the average weight in kilograms.

These data are currently in a wide format: the 16 animal-type columns and 9 rows hold a total of 144 unique cases. To tidy the dataframe, we will pivot it so that each row represents a single case described by 3 variables: region, animal type, and average animal weight in kilograms.
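To make the wide-to-long idea concrete, here is a minimal sketch on a toy table. The region names, animal columns, and weights below are made up for illustration and are not taken from the real dataset; only the `IPCC Area` column name mirrors the original.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: 2 regions x 3 animal columns (all values invented)
toy_wide <- tibble(
  `IPCC Area` = c("Region A", "Region B"),
  Cattle = c(500, 300),
  Sheep = c(50, 30),
  Goats = c(40, 25)
)

# Every column except the region column becomes a (name, value) pair
toy_long <- pivot_longer(
  toy_wide,
  cols = !`IPCC Area`,
  names_to = "Animal_Type",
  values_to = "Weight_KG"
)

dim(toy_wide) # 2 rows, 4 columns
dim(toy_long) # 6 rows (2 regions x 3 animals), 3 columns
```

The same arithmetic applies to the real data: the number of rows times the number of non-region columns gives the number of rows after pivoting.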

Challenge: Describe the final dimensions

Because there are 144 unique cases (16 animal types × 9 regions), we expect the pivoted data to have 144 rows (check out my in-line R code!) and 3 columns, one for each of the 3 variables.

# existing rows
nrow(animalweight)
[1] 9
# existing columns
ncol(animalweight)
[1] 17
#expected rows=cases 
nrow(animalweight)*(ncol(animalweight)-1)
[1] 144

Pivot the Data

animalweightlonger <- pivot_longer(animalweight, cols = !`IPCC Area`, names_to = "Animal_Type", values_to = "Weight_KG")
animalweightlonger

In this longer table, each row is a single case: a unique combination of IPCC region and animal type, together with its average weight. The dataframe is tidy because each variable has its own column and each case has its own row.
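One way to confirm that a pivot produced one row per case is to check the row count and look for duplicated region/animal pairs. The sketch below does this on invented toy data; `check_pivot` is a hypothetical helper written for this illustration, not part of tidyr or the original post.

```r
library(tidyr)
library(tibble)

# Hypothetical helper: TRUE when `long` has the expected number of rows and
# no repeated combination of the id column and the names column.
check_pivot <- function(wide, long, id_col, names_col) {
  expected_rows <- nrow(wide) * (ncol(wide) - 1)
  no_duplicates <- !any(duplicated(long[, c(id_col, names_col)]))
  nrow(long) == expected_rows && no_duplicates
}

# Invented example data (not the real animal weights)
demo_wide <- tibble(
  `IPCC Area` = c("Region A", "Region B", "Region C"),
  Horses = c(400, 350, 380),
  Swine = c(30, 28, 50)
)

demo_long <- pivot_longer(
  demo_wide,
  cols = !`IPCC Area`,
  names_to = "Animal_Type",
  values_to = "Weight_KG"
)

check_pivot(demo_wide, demo_long, "IPCC Area", "Animal_Type")
```

Assuming the objects from this post are in the session, the same call with `animalweight` and `animalweightlonger` should return `TRUE` if the pivot yielded the expected 144 unique cases.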