Matt Eckstein
March 17, 2023
IPCC.Area Cattle...dairy Cattle...non.dairy Buffaloes
1 Indian Subcontinent 275 110 295
2 Eastern Europe 550 391 380
3 Africa 275 173 380
4 Oceania 500 330 380
5 Western Europe 600 420 380
6 Latin America 400 305 380
Swine...market Swine...breeding Chicken...Broilers Chicken...Layers Ducks
1 28 28 0.9 1.8 2.7
2 50 180 0.9 1.8 2.7
3 28 28 0.9 1.8 2.7
4 45 180 0.9 1.8 2.7
5 50 198 0.9 1.8 2.7
6 28 28 0.9 1.8 2.7
Turkeys Sheep Goats Horses Asses Mules Camels Llamas
1 6.8 28.0 30.0 238 130 130 217 217
2 6.8 48.5 38.5 377 130 130 217 217
3 6.8 28.0 30.0 238 130 130 217 217
4 6.8 48.5 38.5 377 130 130 217 217
5 6.8 48.5 38.5 377 130 130 217 217
6 6.8 28.0 30.0 238 130 130 217 217
---
title: "Challenge 3"
author: "Matt Eckstein"
description: "Challenge 3 - Matt Eckstein - Animal Weight"
date: "03/17/2023"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
- challenge_3
- Matt Eckstein
- animal_weight.csv
---
```{r}
#| label: setup
#| warning: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE)
```
## Read in data
```{r}
animals <- read.csv("_data/animal_weight.csv")
head(animals)
# summary() gives per-column statistics; summarize() with no arguments returns an empty data frame
summary(animals)
```
## Briefly describe the data and anticipate the end result
This data gives the average weight of common types of livestock in each IPCC region of the world. With 17 columns, the table is somewhat hard to read. It would be more legible pivoted down to 3 columns (region, livestock type, and weight), so that each region-livestock type pair is a case.
## Find current and future data dimensions
```{r}
nrow(animals)
ncol(animals)
```
There are 17 columns: 16 hold the weights of individual animal types, and the remaining one, `IPCC.Area`, names the region each row describes.
There are 9 rows (n) and 17 columns (k). One column is needed to identify a case, so the pivoted result will have n * (k - number of columns used to identify a case) rows: 9 * (17 - 1) = 144. We therefore expect the pivoted table to have 144 rows.
```{r}
nrow(animals) * (ncol(animals)-1)
```
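The same arithmetic can be written as a small helper so that the expected shape is stated once and reused. This is only a sketch: `expected_long_dims()` and its argument names are hypothetical and not part of the original analysis, and it assumes every non-identifier column is pivoted into a single names column and a single values column.

```{r}
# Hypothetical helper: expected shape after pivoting every non-id column
# into one names column and one values column.
expected_long_dims <- function(n_rows, n_cols, n_id = 1) {
  c(
    rows = n_rows * (n_cols - n_id),  # one row per (case id, pivoted column) pair
    cols = n_id + 2                   # id columns + names column + values column
  )
}

# Dimensions reported above: 9 rows, 17 columns, 1 identifying column
expected_long_dims(9, 17, 1)
```

With the dimensions reported above, this gives 144 rows and 3 columns.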
## Pivot the Data
```{r}
animals2 <- pivot_longer(animals, `Cattle...dairy`:`Llamas`, names_to = "type", values_to = "weights")
```
## Describe the final dimensions
```{r}
nrow(animals2)
ncol(animals2)
```
The final table does in fact have 144 rows and 3 columns.
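The comparison between expected and actual dimensions can also be made explicit with an assertion rather than by reading the printed numbers. The sketch below does this on a small stand-in table so that it runs on its own: `toy` is a hypothetical two-row, three-animal subset whose values are copied from the first two rows printed by `head(animals)`, and `toy_long` is its pivoted form.

```{r}
library(tidyr)

# Hypothetical toy subset; values copied from the first two rows of head(animals)
toy <- data.frame(
  IPCC.Area          = c("Indian Subcontinent", "Eastern Europe"),
  Cattle...dairy     = c(275, 550),
  Cattle...non.dairy = c(110, 391),
  Buffaloes          = c(295, 380)
)

# Expected shape: rows = n * (k - 1), columns = id + names + values
expected_rows <- nrow(toy) * (ncol(toy) - 1)
expected_cols <- 3

# Pivot every column except the identifier
toy_long <- pivot_longer(toy, -IPCC.Area, names_to = "type", values_to = "weights")

# Stop with an error if the pivot did not produce the anticipated shape
stopifnot(nrow(toy_long) == expected_rows, ncol(toy_long) == expected_cols)
dim(toy_long)
```

The same `stopifnot()` pattern should apply to `animals` and `animals2`, with `nrow(animals) * (ncol(animals) - 1)` as the expected row count.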
## New cases and what makes the new data tidy
Now that the data is pivoted, a case is a pairing of an IPCC area and an animal type. This data is tidy because every variable (IPCC area, animal type, and weight) is a column, and every observation (an area-type pairing) is a row.
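One way to confirm that a case really is an area-type pairing is to check that no pairing occurs more than once. The sketch below illustrates the idea on a hypothetical toy table (`toy`, with values copied from the first two rows printed by `head(animals)`) rather than on the full data; `pair_counts` and `one_case` are illustrative names only.

```{r}
library(tidyr)
library(dplyr)

# Hypothetical toy subset; values copied from the first two rows of head(animals)
toy <- data.frame(
  IPCC.Area          = c("Indian Subcontinent", "Eastern Europe"),
  Cattle...dairy     = c(275, 550),
  Cattle...non.dairy = c(110, 391),
  Buffaloes          = c(295, 380)
)
toy_long <- pivot_longer(toy, -IPCC.Area, names_to = "type", values_to = "weights")

# If a case is one area-type pairing, each pairing should appear exactly once
pair_counts <- count(toy_long, IPCC.Area, type)
stopifnot(all(pair_counts$n == 1))

# A single case can then be looked up by its two identifying values
one_case <- filter(toy_long, IPCC.Area == "Eastern Europe", type == "Buffaloes")
one_case
```

In the wide layout the same lookup requires knowing which column holds the animal of interest; in the long layout it is an ordinary row filter.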