library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
Lindsay Jones
August 17, 2022
Today’s challenge is to pivot a data set into tidy format using pivot_longer.
Read in one (or more) of the following datasets, using the correct R package and command.
# A tibble: 9 × 17
IPCC A…¹ Cattl…² Cattl…³ Buffa…⁴ Swine…⁵ Swine…⁶ Chick…⁷ Chick…⁸ Ducks Turkeys
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Indian … 275 110 295 28 28 0.9 1.8 2.7 6.8
2 Eastern… 550 391 380 50 180 0.9 1.8 2.7 6.8
3 Africa 275 173 380 28 28 0.9 1.8 2.7 6.8
4 Oceania 500 330 380 45 180 0.9 1.8 2.7 6.8
5 Western… 600 420 380 50 198 0.9 1.8 2.7 6.8
6 Latin A… 400 305 380 28 28 0.9 1.8 2.7 6.8
7 Asia 350 391 380 50 180 0.9 1.8 2.7 6.8
8 Middle … 275 173 380 28 28 0.9 1.8 2.7 6.8
9 Norther… 604 389 380 46 198 0.9 1.8 2.7 6.8
# … with 7 more variables: Sheep <dbl>, Goats <dbl>, Horses <dbl>, Asses <dbl>,
# Mules <dbl>, Camels <dbl>, Llamas <dbl>, and abbreviated variable names
# ¹`IPCC Area`, ²`Cattle - dairy`, ³`Cattle - non-dairy`, ⁴Buffaloes,
# ⁵`Swine - market`, ⁶`Swine - breeding`, ⁷`Chicken - Broilers`,
# ⁸`Chicken - Layers`
# ℹ Use `colnames()` to see all variable names
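This data set contains the average weight of 16 different groups of animals in 9 regions of the world; the 17th column, IPCC Area, identifies the region. The data set is not tidy because the animal groups are spread across the column headers, so the two variables of interest (animal species and average weight) have no columns of their own. Using pivot_longer will fix that by producing one row per region and species.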
# A tibble: 144 × 3
`IPCC Area` Species Weight
<chr> <chr> <dbl>
1 Indian Subcontinent Cattle - dairy 275
2 Indian Subcontinent Cattle - non-dairy 110
3 Indian Subcontinent Buffaloes 295
4 Indian Subcontinent Swine - market 28
5 Indian Subcontinent Swine - breeding 28
6 Indian Subcontinent Chicken - Broilers 0.9
7 Indian Subcontinent Chicken - Layers 1.8
8 Indian Subcontinent Ducks 2.7
9 Indian Subcontinent Turkeys 6.8
10 Indian Subcontinent Sheep 28
# … with 134 more rows
# ℹ Use `print(n = ...)` to see more rows
This format makes it more difficult to scan the data for a single region, but a few different functions can solve that. For example, to examine both types of cattle in every region:
# A tibble: 18 × 3
`IPCC Area` Species Weight
<chr> <chr> <dbl>
1 Indian Subcontinent Cattle - dairy 275
2 Indian Subcontinent Cattle - non-dairy 110
3 Eastern Europe Cattle - dairy 550
4 Eastern Europe Cattle - non-dairy 391
5 Africa Cattle - dairy 275
6 Africa Cattle - non-dairy 173
7 Oceania Cattle - dairy 500
8 Oceania Cattle - non-dairy 330
9 Western Europe Cattle - dairy 600
10 Western Europe Cattle - non-dairy 420
11 Latin America Cattle - dairy 400
12 Latin America Cattle - non-dairy 305
13 Asia Cattle - dairy 350
14 Asia Cattle - non-dairy 391
15 Middle east Cattle - dairy 275
16 Middle east Cattle - non-dairy 173
17 Northern America Cattle - dairy 604
18 Northern America Cattle - non-dairy 389
---
title: "Challenge 3"
author: "Lindsay Jones"
description: "Tidy Data: Pivoting"
date: "08/17/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_3
  - animal_weight
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Challenge Overview
Today's challenge is to:
1. read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
2. identify what needs to be done to tidy the current data
3. anticipate the shape of pivoted data
4. pivot the data into tidy format using `pivot_longer`
## Read in data
Read in one (or more) of the following datasets, using the correct R package and command.
- animal_weights.csv ⭐
- eggs_tidy.csv ⭐⭐ or organicpoultry.xls ⭐⭐⭐
- australian_marriage\*.xlsx ⭐⭐⭐
- USA Households\*.xlsx ⭐⭐⭐⭐
- sce_labor_chart_data_public.csv 🌟🌟🌟🌟🌟
```{r}
animal_weight <- read_csv("_data/animal_weight.csv",
                          show_col_types = FALSE)
print(animal_weight)
```
### Briefly describe the data
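This data set contains the average weight of 16 different groups of animals in 9 regions of the world; the 17th column, `IPCC Area`, identifies the region. The data set is not tidy because the animal groups are spread across the column headers, so the two variables of interest (animal species and average weight) have no columns of their own. Using `pivot_longer` will fix that by producing one row per region and species.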
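Before pivoting, it helps to anticipate the shape of the result. The sketch below works through the arithmetic on a small toy table; the `toy` tibble, its region names, and its values are made up for illustration and are not taken from the animal weight file.

```r
library(tibble)
library(tidyr)

# Hypothetical toy data: 2 regions x 3 animal columns (values are invented)
toy <- tibble(
  `IPCC Area` = c("Region A", "Region B"),
  Cattle = c(500, 300),
  Sheep = c(45, 28),
  Goats = c(40, 30)
)

n_id <- 1  # identifier columns that are kept as-is

# Each original row yields one new row per pivoted column
expected_rows <- nrow(toy) * (ncol(toy) - n_id)

# Identifier column(s) plus the new names_to and values_to columns
expected_cols <- n_id + 2

toy_long <- pivot_longer(toy,
                         cols = -`IPCC Area`,
                         names_to = "Species",
                         values_to = "Weight")

# Compare the anticipated shape with the actual one
c(expected_rows, expected_cols)
dim(toy_long)
```

Applied to the table printed above, which has 9 rows and 17 columns including one identifier column, the same arithmetic gives 9 × (17 − 1) = 144 rows and 3 columns.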
### Challenge: Pivot the Chosen Data
```{r}
# Gather every animal column into Species/Weight pairs; `IPCC Area` stays as the identifier
aw_pivot <- pivot_longer(animal_weight,
                         cols = `Cattle - dairy`:Llamas,
                         names_to = "Species",
                         values_to = "Weight")
print(aw_pivot)
```
This format makes it more difficult to scan the data for a single region, but a few different functions can solve that. For example, to examine both types of cattle in every region:
```{r}
# Keep only the rows whose Species label contains "Cattle"
aw_pivot %>%
  filter(grepl('Cattle', Species))
```
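Two other ways of getting a per-region view back out of long data are sketched below, as one possible reading of "a few different functions". The `toy_long` tibble and its values are hypothetical stand-ins so the chunk runs on its own; the same verbs should apply to `aw_pivot`.

```{r}
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical long-format data (values are invented for illustration)
toy_long <- tibble(
  `IPCC Area` = rep(c("Region A", "Region B"), each = 3),
  Species = rep(c("Cattle - dairy", "Cattle - non-dairy", "Sheep"), times = 2),
  Weight = c(500, 300, 45, 400, 250, 28)
)

# Option 1: keep every species row for a single region
region_a <- toy_long %>%
  filter(`IPCC Area` == "Region A")

# Option 2: spread the species back into columns, one row per region
toy_wide <- toy_long %>%
  pivot_wider(names_from = Species, values_from = Weight)

region_a
toy_wide
```

Filtering keeps the tidy layout and suits further analysis, while `pivot_wider` is mainly useful for reading values side by side.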