library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
Priyanka Perumalla
May 15, 2023
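Today’s challenge is to:
1. read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
2. identify what needs to be done to tidy the current data
3. anticipate the shape of pivoted data
4. pivot the data into tidy format using pivot_longer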
Read in one (or more) of the following datasets, using the correct R package and command.
# A tibble: 9 × 17
`IPCC Area` `Cattle - dairy` `Cattle - non-dairy` Buffaloes `Swine - market`
<chr> <dbl> <dbl> <dbl> <dbl>
1 Indian Subco… 275 110 295 28
2 Eastern Euro… 550 391 380 50
3 Africa 275 173 380 28
4 Oceania 500 330 380 45
5 Western Euro… 600 420 380 50
6 Latin America 400 305 380 28
7 Asia 350 391 380 50
8 Middle east 275 173 380 28
9 Northern Ame… 604 389 380 46
# ℹ 12 more variables: `Swine - breeding` <dbl>, `Chicken - Broilers` <dbl>,
# `Chicken - Layers` <dbl>, Ducks <dbl>, Turkeys <dbl>, Sheep <dbl>,
# Goats <dbl>, Horses <dbl>, Asses <dbl>, Mules <dbl>, Camels <dbl>,
# Llamas <dbl>
# A tibble: 6 × 17
`IPCC Area` `Cattle - dairy` `Cattle - non-dairy` Buffaloes `Swine - market`
<chr> <dbl> <dbl> <dbl> <dbl>
1 Indian Subco… 275 110 295 28
2 Eastern Euro… 550 391 380 50
3 Africa 275 173 380 28
4 Oceania 500 330 380 45
5 Western Euro… 600 420 380 50
6 Latin America 400 305 380 28
# ℹ 12 more variables: `Swine - breeding` <dbl>, `Chicken - Broilers` <dbl>,
# `Chicken - Layers` <dbl>, Ducks <dbl>, Turkeys <dbl>, Sheep <dbl>,
# Goats <dbl>, Horses <dbl>, Asses <dbl>, Mules <dbl>, Camels <dbl>,
# Llamas <dbl>
Describe the data, and be sure to comment on why you are planning to pivot it to make it “tidy”
IPCC Area Cattle - dairy Cattle - non-dairy Buffaloes
Length:9 Min. :275.0 Min. :110 Min. :295.0
Class :character 1st Qu.:275.0 1st Qu.:173 1st Qu.:380.0
Mode :character Median :400.0 Median :330 Median :380.0
Mean :425.4 Mean :298 Mean :370.6
3rd Qu.:550.0 3rd Qu.:391 3rd Qu.:380.0
Max. :604.0 Max. :420 Max. :380.0
Swine - market Swine - breeding Chicken - Broilers Chicken - Layers
Min. :28.00 Min. : 28.0 Min. :0.9 Min. :1.8
1st Qu.:28.00 1st Qu.: 28.0 1st Qu.:0.9 1st Qu.:1.8
Median :45.00 Median :180.0 Median :0.9 Median :1.8
Mean :39.22 Mean :116.4 Mean :0.9 Mean :1.8
3rd Qu.:50.00 3rd Qu.:180.0 3rd Qu.:0.9 3rd Qu.:1.8
Max. :50.00 Max. :198.0 Max. :0.9 Max. :1.8
Ducks Turkeys Sheep Goats Horses
Min. :2.7 Min. :6.8 Min. :28.00 Min. :30.00 Min. :238.0
1st Qu.:2.7 1st Qu.:6.8 1st Qu.:28.00 1st Qu.:30.00 1st Qu.:238.0
Median :2.7 Median :6.8 Median :48.50 Median :38.50 Median :377.0
Mean :2.7 Mean :6.8 Mean :39.39 Mean :34.72 Mean :315.2
3rd Qu.:2.7 3rd Qu.:6.8 3rd Qu.:48.50 3rd Qu.:38.50 3rd Qu.:377.0
Max. :2.7 Max. :6.8 Max. :48.50 Max. :38.50 Max. :377.0
Asses Mules Camels Llamas
Min. :130 Min. :130 Min. :217 Min. :217
1st Qu.:130 1st Qu.:130 1st Qu.:217 1st Qu.:217
Median :130 Median :130 Median :217 Median :217
Mean :130 Mean :130 Mean :217 Mean :217
3rd Qu.:130 3rd Qu.:130 3rd Qu.:217 3rd Qu.:217
Max. :130 Max. :130 Max. :217 Max. :217
The data set has information on animal weights by geographical area. Each row is one IPCC area, and each of the remaining 16 columns is a livestock category (for example Cattle - dairy, Buffaloes, or Llamas), so the category names are stored as column headers rather than as values of a variable. In this wide orientation the table shows how weights differ across areas for each category, but it is awkward to summarise weight by livestock type. I plan to pivot the table so that each row holds one area, one livestock category, and one weight, which makes it easier to examine how livestock weights vary by area. The initial data set has 9 rows and 17 columns.
The first step in pivoting the data is to try to come up with a concrete vision of what the end product should look like - that way you will know whether or not your pivoting was successful.
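I anticipate the pivoted data to be longer and narrower, because the 16 livestock columns collapse into one column of category names and one column of weights. Each of the 9 areas contributes 16 rows, so 9 x 16 = 144 rows, and the columns are IPCC Area, Livestock, and Weight. The anticipated dimensions are 144 x 3.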
Now we will pivot the data, and compare our pivoted data dimensions to the dimensions calculated above as a “sanity” check.
# A tibble: 144 × 3
`IPCC Area` Livestock Weight
<chr> <chr> <dbl>
1 Indian Subcontinent Cattle - dairy 275
2 Indian Subcontinent Cattle - non-dairy 110
3 Indian Subcontinent Buffaloes 295
4 Indian Subcontinent Swine - market 28
5 Indian Subcontinent Swine - breeding 28
6 Indian Subcontinent Chicken - Broilers 0.9
7 Indian Subcontinent Chicken - Layers 1.8
8 Indian Subcontinent Ducks 2.7
9 Indian Subcontinent Turkeys 6.8
10 Indian Subcontinent Sheep 28
# ℹ 134 more rows
Dimensions of pivoted tibble after restructuring the data
Yes, once it is pivoted long, our resulting data are \(144 \times 3\) - exactly what we expected!
Document your work here.
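The final dimensions of the pivoted data are 144 x 3. This is expected: pivot_longer() turns each of the 16 livestock columns into one row per area (9 x 16 = 144 rows) and leaves three columns, IPCC Area, Livestock, and Weight.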
---
title: "Challenge 3"
author: "Priyanka Perumalla"
description: "Tidy Data: Pivoting"
date: "05/15/2023"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
- challenge_3
- Priyanka Perumalla
- animal_weights
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Challenge Overview
Today's challenge is to:
1. read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
2. identify what needs to be done to tidy the current data
3. anticipate the shape of pivoted data
4. pivot the data into tidy format using `pivot_longer`
## Read in data
Read in one (or more) of the following datasets, using the correct R package and command.
- animal_weights.csv ⭐
- eggs_tidy.csv ⭐⭐ or organiceggpoultry.xls ⭐⭐⭐
- australian_marriage\*.xls ⭐⭐⭐
- USA Households\*.xlsx ⭐⭐⭐⭐
- sce_labor_chart_data_public.xlsx 🌟🌟🌟🌟🌟
```{r}
# show_col_types is an argument of read_csv(), not print()
animal_weights_data <- read_csv("_data/animal_weight.csv", show_col_types = FALSE)
print(animal_weights_data)
```
```{r}
head(animal_weights_data)
```
### Briefly describe the data
Describe the data, and be sure to comment on why you are planning to pivot it to make it "tidy"
```{r}
nrow(animal_weights_data)
```
```{r}
ncol(animal_weights_data)
```
```{r}
summary(animal_weights_data)
```
The data set has information on animal weights by geographical area. Each row is one IPCC area, and each of the remaining 16 columns is a livestock category (for example `Cattle - dairy`, `Buffaloes`, or `Llamas`), so the category names are stored as column headers rather than as values of a variable. In this wide orientation the table shows how weights differ across areas for each category, but it is awkward to summarise weight by livestock type. I plan to pivot the table so that each row holds one area, one livestock category, and one weight, which makes it easier to examine how livestock weights vary by area. The initial data set has 9 rows and 17 columns.
## Anticipate the End Result
The first step in pivoting the data is to try to come up with a concrete vision of what the end product *should* look like - that way you will know whether or not your pivoting was successful.
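I anticipate the pivoted data to be longer and narrower, because the 16 livestock columns collapse into one column of category names and one column of weights. Each of the 9 areas contributes 16 rows, so 9 x 16 = 144 rows, and the columns are `IPCC Area`, `Livestock`, and `Weight`. The anticipated dimensions are 144 x 3.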
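
The anticipated shape can also be computed rather than worked out by hand. The sketch below is only an illustration: `toy_weights` is a hypothetical three-area, two-category subset typed in from the printed table, so the numbers it returns describe the toy, not the full data. Applying the same arithmetic to the full 9 x 17 table gives 9 * 16 = 144 rows and 3 columns.

```{r}
library(tibble)

# Hypothetical toy subset: values copied from the table printed earlier
toy_weights <- tribble(
  ~`IPCC Area`,          ~`Cattle - dairy`, ~Buffaloes,
  "Indian Subcontinent", 275,               295,
  "Africa",              275,               380,
  "Oceania",             500,               380
)

n_id_cols <- 1  # only `IPCC Area` identifies a row

# Every non-identifier column contributes one row per area
expected_rows <- nrow(toy_weights) * (ncol(toy_weights) - n_id_cols)

# Identifier column(s) plus one name column and one value column
expected_cols <- n_id_cols + 2

c(rows = expected_rows, cols = expected_cols)
```

Keeping the number of identifier columns in a variable makes the calculation reusable if more columns are later kept out of the pivot.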
## Pivot the Data
Now we will pivot the data, and compare our pivoted data dimensions to the dimensions calculated above as a "sanity" check.
```{r}
# Gather every livestock column into name/value pairs, keeping `IPCC Area` as the identifier
animal_data_pivoted <- pivot_longer(animal_weights_data,
  cols = -`IPCC Area`,
  names_to = "Livestock",
  values_to = "Weight")
print(animal_data_pivoted)
```
Dimensions of the pivoted tibble after restructuring the data:
```{r}
dim(animal_data_pivoted)
```
Yes, once it is pivoted long, our resulting data are $144 \times 3$ - exactly what we expected!
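
The sanity check can be made automatic instead of being read off the printed output. This is a minimal sketch on a hypothetical toy subset (`toy_weights`, typed in from the printed table), not on the full data. It pivots the toy the same way and stops with an error if the result does not have the anticipated shape.

```{r}
library(tibble)
library(tidyr)

# Hypothetical toy subset: values copied from the table printed earlier
toy_weights <- tribble(
  ~`IPCC Area`,          ~`Cattle - dairy`, ~Buffaloes,
  "Indian Subcontinent", 275,               295,
  "Africa",              275,               380,
  "Oceania",             500,               380
)

toy_long <- pivot_longer(
  toy_weights,
  cols = -`IPCC Area`,
  names_to = "Livestock",
  values_to = "Weight"
)

# Anticipated shape: one row per area and livestock column, three columns
expected_dim <- c(nrow(toy_weights) * (ncol(toy_weights) - 1), 3)

# stopifnot() raises an error if the pivoted shape differs from the expectation
stopifnot(all(dim(toy_long) == expected_dim))

dim(toy_long)
```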
### Challenge: Describe the final dimensions
Document your work here.
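The final dimensions of the pivoted data are 144 x 3. This is expected: `pivot_longer()` turns each of the 16 livestock columns into one row per area (9 x 16 = 144 rows) and leaves three columns, `IPCC Area`, `Livestock`, and `Weight`.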
```{r}
dim(animal_data_pivoted)
```
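
Once the data are long, the variation in weight by livestock type mentioned in the description can be summarised with ordinary grouped operations. The sketch below is one possible follow-up, again on a hypothetical toy subset (`toy_long`) typed in from the printed table. The summary column names `min_weight` and `max_weight` are illustrative choices.

```{r}
library(tibble)
library(tidyr)
library(dplyr)

# Hypothetical toy subset, pivoted the same way as the full data
toy_long <- tribble(
  ~`IPCC Area`,          ~`Cattle - dairy`, ~Buffaloes,
  "Indian Subcontinent", 275,               295,
  "Africa",              275,               380,
  "Oceania",             500,               380
) %>%
  pivot_longer(
    cols = -`IPCC Area`,
    names_to = "Livestock",
    values_to = "Weight"
  )

# Range of weights across areas for each livestock category
toy_summary <- toy_long %>%
  group_by(Livestock) %>%
  summarise(
    min_weight = min(Weight),
    max_weight = max(Weight),
    .groups = "drop"
  )

toy_summary
```

In the wide layout the same summary would need a separate expression per livestock column. In the long layout a single `group_by()` covers every category.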