Challenge 3

challenge_2

Matt Eckstein

animal_weight.csv

Author

Matt Eckstein

Published

March 17, 2023

Code

library(tidyverse)

knitr::opts_chunk$set(echo = TRUE)

Read in data

Code

animals <- read.csv("_data/animal_weight.csv")

head(animals)

            IPCC.Area Cattle...dairy Cattle...non.dairy Buffaloes
1 Indian Subcontinent            275                110       295
2      Eastern Europe            550                391       380
3              Africa            275                173       380
4             Oceania            500                330       380
5      Western Europe            600                420       380
6       Latin America            400                305       380
  Swine...market Swine...breeding Chicken...Broilers Chicken...Layers Ducks
1             28               28                0.9              1.8   2.7
2             50              180                0.9              1.8   2.7
3             28               28                0.9              1.8   2.7
4             45              180                0.9              1.8   2.7
5             50              198                0.9              1.8   2.7
6             28               28                0.9              1.8   2.7
  Turkeys Sheep Goats Horses Asses Mules Camels Llamas
1     6.8  28.0  30.0    238   130   130    217    217
2     6.8  48.5  38.5    377   130   130    217    217
3     6.8  28.0  30.0    238   130   130    217    217
4     6.8  48.5  38.5    377   130   130    217    217
5     6.8  48.5  38.5    377   130   130    217    217
6     6.8  28.0  30.0    238   130   130    217    217

Code

summarize(animals)

data frame with 0 columns and 1 row
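The empty result above comes from calling summarize() with no summary expressions, so it has nothing to compute. As a minimal sketch of what could be used instead, the block below builds a small stand-in data frame (the name `toy` and the choice of columns are assumptions for illustration; the values are copied from the head() output) and compares the empty call with base R's summary() and with summarize() given explicit expressions.

```r
library(dplyr)

# Toy stand-in for `animals`: three rows and two weight columns
# taken from the head() output above (hypothetical subset)
toy <- data.frame(
  IPCC.Area = c("Indian Subcontinent", "Eastern Europe", "Africa"),
  Cattle...dairy = c(275, 550, 275),
  Buffaloes = c(295, 380, 380)
)

# With no summary expressions, summarize() returns one row and no columns
empty <- summarize(toy)
dim(empty)

# Base R summary() reports per-column statistics
summary(toy)

# summarize() becomes informative once it is given expressions
dairy_stats <- summarize(
  toy,
  mean_dairy = mean(Cattle...dairy),
  max_dairy = max(Cattle...dairy)
)
dairy_stats
```

On the full data, summary(animals) would give the same kind of per-column overview for all 17 columns.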

Briefly describe the data and Anticipate the End Result

This dataset gives the average weights of common types of livestock across regions of the world. With 17 columns it is somewhat hard to read; it would be more legible pivoted into just 3 columns (region, livestock type, and weight), with each region-livestock pairing as a case.
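To make the target shape concrete, here is a small sketch on a two-region, two-animal subset of the values shown in the head() output. The names `wide` and `long` are placeholders for illustration, not objects from the analysis itself.

```r
library(tidyr)

# Hypothetical two-row, three-column slice of the wide data
wide <- data.frame(
  IPCC.Area = c("Indian Subcontinent", "Eastern Europe"),
  Cattle...dairy = c(275, 550),
  Buffaloes = c(295, 380)
)

# Pivot every column except the region identifier into name/value pairs
long <- pivot_longer(
  wide,
  cols = -IPCC.Area,
  names_to = "type",
  values_to = "weights"
)

# Each row is now one region-livestock pairing with its weight
long
```

The 2 x 3 slice becomes a 4 x 3 table: one row per region-livestock pair.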

Find current and future data dimensions

Code

nrow(animals)

[1] 9

Code

ncol(animals)

[1] 17

Of the 17 columns, 16 hold weights for individual animal types; the remaining column, IPCC.Area, names the region for each observation.

There are 9 observations (n) of 17 variables (k). I need 1 variable, IPCC.Area, to identify a case, so pivoting the other columns will produce n * (k - number of variables used to identify a case) rows: 9 * (17 - 1) = 144. So, we expect the result of our pivoting to have 144 rows and 3 columns (area, animal type, and weight).

Code

nrow(animals) * (ncol(animals)-1)

[1] 144
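The same arithmetic can be wrapped in a small helper that also predicts the column count. This is only a sketch: `expected_long_dims` is a hypothetical function name, and it assumes every non-identifier column is pivoted into a single name column and a single value column.

```r
# Hypothetical helper: expected shape after pivoting all
# non-identifier columns into one name column and one value column
expected_long_dims <- function(n_rows, n_cols, n_id_cols = 1) {
  c(
    rows = n_rows * (n_cols - n_id_cols),
    cols = n_id_cols + 2
  )
}

# 9 regions, 17 columns, 1 identifier column (IPCC.Area)
dims <- expected_long_dims(9, 17)
dims
```

With the data loaded, the equivalent call would be expected_long_dims(nrow(animals), ncol(animals)).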

Pivot the Data

Code

animals2 <- pivot_longer(animals, `Cattle...dairy`:`Llamas`, names_to = "type", values_to = "weights")
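The range `Cattle...dairy`:`Llamas` depends on the column order in the file. As a hedged alternative, the sketch below selects "everything except IPCC.Area" and checks that both selections give the same result on a toy data frame (`toy`, `by_range`, and `by_exclusion` are illustrative names, with values copied from the head() output).

```r
library(tidyr)

# Hypothetical slice of the wide data with three animal columns
toy <- data.frame(
  IPCC.Area = c("Indian Subcontinent", "Eastern Europe", "Africa"),
  Cattle...dairy = c(275, 550, 275),
  Cattle...non.dairy = c(110, 391, 173),
  Buffaloes = c(295, 380, 380)
)

# Selection by column range, as in the call above
by_range <- pivot_longer(
  toy,
  cols = Cattle...dairy:Buffaloes,
  names_to = "type",
  values_to = "weights"
)

# Selection by excluding the identifier column
by_exclusion <- pivot_longer(
  toy,
  cols = !IPCC.Area,
  names_to = "type",
  values_to = "weights"
)

# Compare the two results
all.equal(by_range, by_exclusion)
```

Excluding the identifier column keeps working if animal columns are reordered or added, which is the main reason to prefer it.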

Describe the final dimensions

Code

nrow(animals2)

[1] 144

Code

ncol(animals2)

[1] 3

The final table does in fact have 144 rows and 3 columns, matching the dimensions computed before pivoting.

New cases and what makes the new data tidy

Now that the data is pivoted, a case is a pairing of an IPCC area and an animal type. This data is tidy because every variable (IPCC area, animal type, and weight) is a column, and every observation (an area-type pairing) is a row.
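One way to check the "one row per area-type pairing" claim is to count the pairs and confirm that none repeats, then pivot back to the wide shape. The sketch below does this on a hand-built long table (`long`, `pair_counts`, and `wide_again` are illustrative names); the same calls could be pointed at animals2.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format slice: one row per area-type pairing
long <- data.frame(
  IPCC.Area = c("Indian Subcontinent", "Indian Subcontinent",
                "Eastern Europe", "Eastern Europe"),
  type = c("Cattle...dairy", "Buffaloes", "Cattle...dairy", "Buffaloes"),
  weights = c(275, 295, 550, 380)
)

# Each area-type pairing should appear exactly once
pair_counts <- count(long, IPCC.Area, type)
all(pair_counts$n == 1)

# Pivoting back recovers one row per area and one column per type
wide_again <- pivot_wider(long, names_from = type, values_from = weights)
wide_again
```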