library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
Siddharth Goel
January 27, 2023
# A tibble: 9 × 17
IPCC A…¹ Cattl…² Cattl…³ Buffa…⁴ Swine…⁵ Swine…⁶ Chick…⁷ Chick…⁸ Ducks Turkeys
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Indian … 275 110 295 28 28 0.9 1.8 2.7 6.8
2 Eastern… 550 391 380 50 180 0.9 1.8 2.7 6.8
3 Africa 275 173 380 28 28 0.9 1.8 2.7 6.8
4 Oceania 500 330 380 45 180 0.9 1.8 2.7 6.8
5 Western… 600 420 380 50 198 0.9 1.8 2.7 6.8
6 Latin A… 400 305 380 28 28 0.9 1.8 2.7 6.8
7 Asia 350 391 380 50 180 0.9 1.8 2.7 6.8
8 Middle … 275 173 380 28 28 0.9 1.8 2.7 6.8
9 Norther… 604 389 380 46 198 0.9 1.8 2.7 6.8
# … with 7 more variables: Sheep <dbl>, Goats <dbl>, Horses <dbl>, Asses <dbl>,
# Mules <dbl>, Camels <dbl>, Llamas <dbl>, and abbreviated variable names
# ¹`IPCC Area`, ²`Cattle - dairy`, ³`Cattle - non-dairy`, ⁴Buffaloes,
# ⁵`Swine - market`, ⁶`Swine - breeding`, ⁷`Chicken - Broilers`,
# ⁸`Chicken - Layers`
cols(
`IPCC Area` = col_character(),
`Cattle - dairy` = col_double(),
`Cattle - non-dairy` = col_double(),
Buffaloes = col_double(),
`Swine - market` = col_double(),
`Swine - breeding` = col_double(),
`Chicken - Broilers` = col_double(),
`Chicken - Layers` = col_double(),
Ducks = col_double(),
Turkeys = col_double(),
Sheep = col_double(),
Goats = col_double(),
Horses = col_double(),
Asses = col_double(),
Mules = col_double(),
Camels = col_double(),
Llamas = col_double()
)
The input dataset gives the weights of 16 kinds of livestock in each of 9 IPCC areas, with one row per area and one column per animal. I plan to use pivot_longer to move the animal names from the column headers into rows, keeping IPCC Area as the identifier column, so that it is easy to compute statistics by animal or by area.
[1] 144
[1] 3
The dataset currently has 9 rows and 17 columns. One column identifies the area and the other 16 hold animal weights, so pivoting should give 9 × 16 = 144 rows and 3 columns (IPCC Area, Animal, and Weight).
Now we will pivot the data, and compare our pivoted data dimensions to the dimensions calculated above as a “sanity” check.
# A tibble: 6 × 3
`IPCC Area` Animal Weight
<chr> <chr> <dbl>
1 Indian Subcontinent Cattle - dairy 275
2 Indian Subcontinent Cattle - non-dairy 110
3 Indian Subcontinent Buffaloes 295
4 Indian Subcontinent Swine - market 28
5 Indian Subcontinent Swine - breeding 28
6 Indian Subcontinent Chicken - Broilers 0.9
Yes, once it is pivoted long, our resulting data are \(144 \times 3\), exactly what we expected!
---
title: "Challenge 3"
author: "Siddharth Goel"
description: "Tidy Data: Pivoting"
date: "01/27/2023"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_3
  - animal_weights.csv
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Read in data
```{r}
animal_weights_df <- read_csv("_data/animal_weight.csv")
animal_weights_df
spec(animal_weights_df)
```
### Briefly describe the data
The input dataset gives the weights of 16 kinds of livestock in each of 9 IPCC areas, with one row per area and one column per animal. I plan to use `pivot_longer` to move the animal names from the column headers into rows, keeping `IPCC Area` as the identifier column, so that it is easy to compute statistics by animal or by area.
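To make the plan concrete, here is a minimal sketch on a three-area, three-animal excerpt of the table printed above. The object names `toy_wide`, `toy_long`, and `toy_summary` are hypothetical and only serve this illustration; the values are copied from the printed rows.

```r
library(tidyr)
library(dplyr)

# Hypothetical excerpt: three areas and three animal columns from the wide table
toy_wide <- tibble::tibble(
  `IPCC Area` = c("Indian Subcontinent", "Africa", "Oceania"),
  `Cattle - dairy` = c(275, 275, 500),
  `Cattle - non-dairy` = c(110, 173, 330),
  Buffaloes = c(295, 380, 380)
)

# Move the animal names out of the column headers and into rows,
# keeping `IPCC Area` as the identifier column
toy_long <- pivot_longer(
  toy_wide,
  cols = -`IPCC Area`,
  names_to = "Animal",
  values_to = "Weight"
)

# With one row per area-animal pair, a per-animal statistic is a single group_by
toy_summary <- toy_long %>%
  group_by(Animal) %>%
  summarise(mean_weight = mean(Weight), .groups = "drop")
```

In this long form, grouping by `IPCC Area` instead of `Animal` would give the per-area statistics with the same two lines.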
### Challenge: Describe the final dimensions
```{r}
# Existing dimensions of the wide data
existing_rows <- nrow(animal_weights_df) # 9
existing_cols <- ncol(animal_weights_df) # 17
# Expected rows: one per area-animal pair
existing_rows * (existing_cols - 1) # -1 for the `IPCC Area` identifier column
# Expected columns
3 # IPCC Area, Animal, Weight
```
The dataset currently has 9 rows and 17 columns.
One column identifies the area and the other 16 hold animal weights, so pivoting should give 9 × 16 = 144 rows and 3 columns (`IPCC Area`, `Animal`, and `Weight`).
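The same arithmetic can be written as a small function. This is only a sketch: `expected_long_dims` is a hypothetical helper, and it assumes every non-identifier column is pivoted into a single name column and a single value column.

```r
# Hypothetical helper: expected shape after pivoting all non-identifier
# columns into one name column and one value column
expected_long_dims <- function(n_rows, n_cols, n_id_cols = 1) {
  value_cols <- n_cols - n_id_cols
  c(rows = n_rows * value_cols, cols = n_id_cols + 2)
}

# 9 areas, 17 columns, 1 identifier column (`IPCC Area`)
expected_long_dims(9, 17)
```

For this dataset the call evaluates to 144 rows and 3 columns, matching the calculation above.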
## Pivot the Data
Now we will pivot the data, and compare our pivoted data dimensions to the dimensions calculated above as a "sanity" check.
```{r}
# Pivot every column except the `IPCC Area` identifier
pivoted_df <- pivot_longer(
  animal_weights_df,
  cols = -`IPCC Area`,
  names_to = "Animal",
  values_to = "Weight"
)
head(pivoted_df)
```
Yes, once it is pivoted long, our resulting data are $144 \times 3$, exactly what we expected!
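Because `head()` prints only the first six rows, the comparison can also be made explicit in code. The sketch below uses a hypothetical helper, `check_pivot_dims`, and a two-area toy table (values taken from the `Ducks` and `Turkeys` columns printed earlier); it assumes a single identifier column by default.

```r
library(tidyr)

# Hypothetical helper: stop with an error if the long table's shape differs
# from what the wide table implies
check_pivot_dims <- function(wide, long, n_id_cols = 1) {
  expected <- c(nrow(wide) * (ncol(wide) - n_id_cols), n_id_cols + 2)
  stopifnot(all(dim(long) == expected))
  invisible(expected)
}

# Toy wide table: 2 areas, 2 animal columns
toy_wide <- tibble::tibble(
  `IPCC Area` = c("Africa", "Oceania"),
  Ducks = c(2.7, 2.7),
  Turkeys = c(6.8, 6.8)
)

toy_long <- pivot_longer(
  toy_wide,
  cols = -`IPCC Area`,
  names_to = "Animal",
  values_to = "Weight"
)

# Passes silently and returns the expected dimensions invisibly
checked <- check_pivot_dims(toy_wide, toy_long)
```

On the real data, `check_pivot_dims(animal_weights_df, pivoted_df)` or simply `dim(pivoted_df)` would serve the same purpose.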