challenge_3
animal_weight
Tidy Data: Pivoting
Author

Ananya Pujary

Published

August 17, 2022

Code
library(tidyverse)

library(skimr)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Read in data

I’ll be reading in the ‘animal_weight’ dataset.

Code
animal_weight<-read_csv("_data/animal_weight.csv",
                        show_col_types = FALSE)

Briefly describe the data

Code
dim(animal_weight)
[1]  9 17
Code
colnames(animal_weight) 
 [1] "IPCC Area"          "Cattle - dairy"     "Cattle - non-dairy"
 [4] "Buffaloes"          "Swine - market"     "Swine - breeding"  
 [7] "Chicken - Broilers" "Chicken - Layers"   "Ducks"             
[10] "Turkeys"            "Sheep"              "Goats"             
[13] "Horses"             "Asses"              "Mules"             
[16] "Camels"             "Llamas"            

The data chosen seems to describe the average weights of different animals (dairy and non-dairy cattle, chickens, ducks, etc.) across global regions (Africa, Latin America, Middle East, etc.). It has 9 rows and 17 columns, of which the \(n=9\) rows indicate the region name and \(k=16\) columns the type of animal.

Code
skim(animal_weight)
Data summary
Name animal_weight
Number of rows 9
Number of columns 17
_______________________
Column type frequency:
character 1
numeric 16
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
IPCC Area 0 1 4 19 0 9 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
Cattle - dairy 0 1 425.44 140.39 275.0 275.0 400.0 550.0 604.0 ▇▅▁▂▇
Cattle - non-dairy 0 1 298.00 116.26 110.0 173.0 330.0 391.0 420.0 ▂▃▁▃▇
Buffaloes 0 1 370.56 28.33 295.0 380.0 380.0 380.0 380.0 ▁▁▁▁▇
Swine - market 0 1 39.22 10.79 28.0 28.0 45.0 50.0 50.0 ▇▁▁▂▇
Swine - breeding 0 1 116.44 84.19 28.0 28.0 180.0 180.0 198.0 ▆▁▁▁▇
Chicken - Broilers 0 1 0.90 0.00 0.9 0.9 0.9 0.9 0.9 ▁▁▇▁▁
Chicken - Layers 0 1 1.80 0.00 1.8 1.8 1.8 1.8 1.8 ▁▁▇▁▁
Ducks 0 1 2.70 0.00 2.7 2.7 2.7 2.7 2.7 ▁▁▇▁▁
Turkeys 0 1 6.80 0.00 6.8 6.8 6.8 6.8 6.8 ▁▁▇▁▁
Sheep 0 1 39.39 10.80 28.0 28.0 48.5 48.5 48.5 ▆▁▁▁▇
Goats 0 1 34.72 4.48 30.0 30.0 38.5 38.5 38.5 ▆▁▁▁▇
Horses 0 1 315.22 73.26 238.0 238.0 377.0 377.0 377.0 ▆▁▁▁▇
Asses 0 1 130.00 0.00 130.0 130.0 130.0 130.0 130.0 ▁▁▇▁▁
Mules 0 1 130.00 0.00 130.0 130.0 130.0 130.0 130.0 ▁▁▇▁▁
Camels 0 1 217.00 0.00 217.0 217.0 217.0 217.0 217.0 ▁▁▇▁▁
Llamas 0 1 217.00 0.00 217.0 217.0 217.0 217.0 217.0 ▁▁▇▁▁

There are no missing values in this dataset. Overall, dairy cattle seem to have the highest average weight (425.44) and broiler chickens have the lowest (0.9).

I plan to pivot it because it seems that the selected animals are recurring categories in all of the regions. \(k-3\) variables will be pivoted and put in a new column.

Hence, the pivoted dataset would have 144 rows and 3 columns (‘IPCC Area’,‘Animal Type’, ‘Weight’).

Challenge: Describe the final dimensions

Code
# existing rows/cases
nrow(animal_weight)
[1] 9
Code
# existing columns/cases
ncol(animal_weight)
[1] 17
Code
#expected rows/cases
nrow(animal_weight) * (ncol(animal_weight)-1)
[1] 144
Code
# expected columns 
1 + 2
[1] 3

There are 9 existing rows and 17 existing columns. The expected rows are 144 and expected columns are 3.

Challenge: Pivot the Chosen Data

Code
animal_weight_pivoted <- pivot_longer(animal_weight,
                         col = c('Cattle - dairy', 'Cattle - non-dairy', 'Buffaloes', 'Swine - market', 'Swine - breeding', 'Chicken - Broilers', 'Chicken - Layers', 'Ducks', 'Turkeys', 'Sheep', 'Goats', 'Horses', 'Asses', 'Mules', 'Camels', 'Llamas'), names_to = 'Animal Type', values_to = 'Weight')
animal_weight_pivoted
# A tibble: 144 × 3
   `IPCC Area`         `Animal Type`      Weight
   <chr>               <chr>               <dbl>
 1 Indian Subcontinent Cattle - dairy      275  
 2 Indian Subcontinent Cattle - non-dairy  110  
 3 Indian Subcontinent Buffaloes           295  
 4 Indian Subcontinent Swine - market       28  
 5 Indian Subcontinent Swine - breeding     28  
 6 Indian Subcontinent Chicken - Broilers    0.9
 7 Indian Subcontinent Chicken - Layers      1.8
 8 Indian Subcontinent Ducks                 2.7
 9 Indian Subcontinent Turkeys               6.8
10 Indian Subcontinent Sheep                28  
# … with 134 more rows
# ℹ Use `print(n = ...)` to see more rows
Code
dim(animal_weight_pivoted)
[1] 144   3

The dimensions of the pivoted data, as predicted, are 144 rows and 3 columns. The new case created is ‘Animal Type’. Overall, pivoting made the data easier to understand since we can now find the weight of a certain animal from a particular region.