Abby Balint
September 27, 2022
I read in the “animal_weight” data set and stored it as “weights” for easier coding. Below that, I ran a summary to get a high-level overview of the data, although it is barely needed here since there are only 9 rows.
# A tibble: 9 × 17
  IPCC A…¹ Cattl…² Cattl…³ Buffa…⁴ Swine…⁵ Swine…⁶ Chick…⁷ Chick…⁸ Ducks Turkeys
  <chr>      <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl> <dbl>   <dbl>
1 Indian …     275     110     295      28      28     0.9     1.8   2.7     6.8
2 Eastern…     550     391     380      50     180     0.9     1.8   2.7     6.8
3 Africa       275     173     380      28      28     0.9     1.8   2.7     6.8
4 Oceania      500     330     380      45     180     0.9     1.8   2.7     6.8
5 Western…     600     420     380      50     198     0.9     1.8   2.7     6.8
6 Latin A…     400     305     380      28      28     0.9     1.8   2.7     6.8
7 Asia         350     391     380      50     180     0.9     1.8   2.7     6.8
8 Middle …     275     173     380      28      28     0.9     1.8   2.7     6.8
9 Norther…     604     389     380      46     198     0.9     1.8   2.7     6.8
# … with 7 more variables: Sheep <dbl>, Goats <dbl>, Horses <dbl>, Asses <dbl>,
#   Mules <dbl>, Camels <dbl>, Llamas <dbl>, and abbreviated variable names
#   ¹`IPCC Area`, ²`Cattle - dairy`, ³`Cattle - non-dairy`, ⁴Buffaloes,
#   ⁵`Swine - market`, ⁶`Swine - breeding`, ⁷`Chicken - Broilers`,
#   ⁸`Chicken - Layers`
  IPCC Area         Cattle - dairy  Cattle - non-dairy   Buffaloes    
 Length:9           Min.   :275.0   Min.   :110        Min.   :295.0  
 Class :character   1st Qu.:275.0   1st Qu.:173        1st Qu.:380.0  
 Mode  :character   Median :400.0   Median :330        Median :380.0  
                    Mean   :425.4   Mean   :298        Mean   :370.6  
                    3rd Qu.:550.0   3rd Qu.:391        3rd Qu.:380.0  
                    Max.   :604.0   Max.   :420        Max.   :380.0  
 Swine - market  Swine - breeding Chicken - Broilers Chicken - Layers
 Min.   :28.00   Min.   : 28.0    Min.   :0.9        Min.   :1.8     
 1st Qu.:28.00   1st Qu.: 28.0    1st Qu.:0.9        1st Qu.:1.8     
 Median :45.00   Median :180.0    Median :0.9        Median :1.8     
 Mean   :39.22   Mean   :116.4    Mean   :0.9        Mean   :1.8     
 3rd Qu.:50.00   3rd Qu.:180.0    3rd Qu.:0.9        3rd Qu.:1.8     
 Max.   :50.00   Max.   :198.0    Max.   :0.9        Max.   :1.8     
     Ducks        Turkeys        Sheep           Goats           Horses     
 Min.   :2.7   Min.   :6.8   Min.   :28.00   Min.   :30.00   Min.   :238.0  
 1st Qu.:2.7   1st Qu.:6.8   1st Qu.:28.00   1st Qu.:30.00   1st Qu.:238.0  
 Median :2.7   Median :6.8   Median :48.50   Median :38.50   Median :377.0  
 Mean   :2.7   Mean   :6.8   Mean   :39.39   Mean   :34.72   Mean   :315.2  
 3rd Qu.:2.7   3rd Qu.:6.8   3rd Qu.:48.50   3rd Qu.:38.50   3rd Qu.:377.0  
 Max.   :2.7   Max.   :6.8   Max.   :48.50   Max.   :38.50   Max.   :377.0  
     Asses         Mules         Camels        Llamas   
 Min.   :130   Min.   :130   Min.   :217   Min.   :217  
 1st Qu.:130   1st Qu.:130   1st Qu.:217   1st Qu.:217  
 Median :130   Median :130   Median :217   Median :217  
 Mean   :130   Mean   :130   Mean   :217   Mean   :217  
 3rd Qu.:130   3rd Qu.:130   3rd Qu.:217   3rd Qu.:217  
 Max.   :130   Max.   :130   Max.   :217   Max.   :217  
This dataset contains 9 rows and 17 variables: one region variable (IPCC Area) and 16 variables that each hold the weight of one animal type in that region. Pivoting helps here because, in the current wide format, each animal is its own variable, so we cannot filter by animal. Once the data is pivoted longer, we can filter by animal type, by region of the world, or both, and easily compute things like average weights by animal.
To find the final dimensions below, I used the same formula as the example, applied to the animal weights data. The original data set has 9 rows and 17 variables. Only one of the original variables (IPCC Area) will remain a variable. The 16 variables I am pivoting will turn into two variables: animal (names) and weights (values). The pivoted data will have 144 rows, because each of the 9 original rows is repeated once for each of the 16 pivoted variables (9 × 16 = 144). I should end up with 3 columns: my one original variable and my 2 new variables.
[1] 9
[1] 17
[1] 144
[1] 2
144 rows as expected :)
# A tibble: 144 × 3
   `IPCC Area`         animal             weights
   <chr>               <chr>                <dbl>
 1 Indian Subcontinent Cattle - dairy       275  
 2 Indian Subcontinent Cattle - non-dairy   110  
 3 Indian Subcontinent Buffaloes            295  
 4 Indian Subcontinent Swine - market        28  
 5 Indian Subcontinent Swine - breeding      28  
 6 Indian Subcontinent Chicken - Broilers     0.9
 7 Indian Subcontinent Chicken - Layers       1.8
 8 Indian Subcontinent Ducks                  2.7
 9 Indian Subcontinent Turkeys                6.8
10 Indian Subcontinent Sheep                 28  
# … with 134 more rows
The final tibble has three columns and 144 rows, as predicted.
---
title: "Challenge 3 Abby Balint"
author: "Abby Balint"
description: "Tidy Data: Pivoting"
date: "09/27/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_3
  - animal_weights
  - abby_balint
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Read in data
I read in the "animal_weight" data set and stored it as "weights" for easier coding. Below that, I ran a summary to get a high-level overview of the data, although it is barely needed here since there are only 9 rows.
```{r}
# Read the file once, store it as `weights`, then print it
weights <- read_csv("_data/animal_weight.csv")
weights
```
```{r}
summary(weights)
```
### Briefly describe the data
This dataset contains 9 rows and 17 variables: one region variable (IPCC Area) and 16 variables that each hold the weight of one animal type in that region. Pivoting helps here because, in the current wide format, each animal is its own variable, so we cannot filter by animal. Once the data is pivoted longer, we can filter by animal type, by region of the world, or both, and easily compute things like average weights by animal.
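To make the point about filtering concrete, here is a minimal sketch on a hand-typed subset of the data (three regions and three animals, with values copied from the printed tibble). The names `weights_small`, `long_small`, `dairy`, and `avg_by_animal` are made up for this illustration and are not used elsewhere in the post.

```{r}
library(tidyverse)

# Hand-typed subset of the data, for illustration only
weights_small <- tribble(
  ~`IPCC Area`, ~`Cattle - dairy`, ~Buffaloes, ~`Swine - market`,
  "Africa",     275,               380,        28,
  "Oceania",    500,               380,        45,
  "Asia",       350,               380,        50
)

# Long format: one row per region-animal pair
long_small <- pivot_longer(weights_small,
                           cols = -`IPCC Area`,
                           names_to = "animal",
                           values_to = "weight")

# Filtering by animal is now an ordinary row filter
dairy <- filter(long_small, animal == "Cattle - dairy")
dairy

# Average weight per animal across the three regions
avg_by_animal <- long_small %>%
  group_by(animal) %>%
  summarise(mean_weight = mean(weight))
avg_by_animal
```

In the wide format, the same filter would not be possible, because "Cattle - dairy" is a column name rather than a value in a column.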
### Challenge: Describe the final dimensions
To find the final dimensions below, I used the same formula as the example, applied to the animal weights data. The original data set has 9 rows and 17 variables. Only one of the original variables (IPCC Area) will remain a variable. The 16 variables I am pivoting will turn into two variables: animal (names) and weights (values). The pivoted data will have 144 rows, because each of the 9 original rows is repeated once for each of the 16 pivoted variables (9 × 16 = 144). I should end up with 3 columns: my one original variable and my 2 new variables.
```{r}
#existing rows/cases
nrow(weights)
#existing columns/variables
ncol(weights)
#expected rows/cases
nrow(weights) * (ncol(weights)-1)
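# new columns created by the pivot: 1 for names + 1 for values
# (together with the 1 id column that is kept, the result has 3 columns)
1+1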
```
144 rows as expected :)
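The same arithmetic can be wrapped in a small function so it can be reused for other data sets. This is only a sketch: `expected_long_dims()` is a hypothetical helper written for this post, not a tidyr function, and it assumes all non-id columns are pivoted into a single names column and a single values column.

```{r}
# Hypothetical helper: predicted dimensions after pivot_longer(),
# assuming every non-id column is pivoted into one names/values pair
expected_long_dims <- function(n_rows, n_cols, n_id_cols = 1) {
  n_pivoted <- n_cols - n_id_cols
  c(rows = n_rows * n_pivoted, cols = n_id_cols + 2)
}

# 9 rows, 17 columns, 1 id column (IPCC Area)
expected_long_dims(9, 17)
```

For this data set it returns 144 rows and 3 columns, matching the calculation above.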
### Challenge: Pivot the Chosen Data
```{r}
# Pivot the 16 animal columns into name/value pairs; `IPCC Area` stays as the id column
pivot_longer(weights,
             cols = `Cattle - dairy`:Llamas,
             names_to = "animal",
             values_to = "weights")
```
The final tibble has three columns and 144 rows, as predicted.
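Rather than comparing the printed dimensions by eye, the prediction can also be checked in code. The sketch below uses a stand-in tibble with the same shape as the real data (9 rows, 1 id column, 16 value columns) so that it runs on its own. The placeholder names `stand_in`, `long`, `Region 1`, and `animal_1` are assumptions for illustration, and the zeros are not real weights.

```{r}
library(tidyverse)

# Stand-in with the same shape as the real data; all values are placeholders
stand_in <- bind_cols(
  tibble(`IPCC Area` = paste("Region", 1:9)),
  as_tibble(matrix(0, nrow = 9, ncol = 16,
                   dimnames = list(NULL, paste0("animal_", 1:16))))
)

long <- pivot_longer(stand_in,
                     cols = -`IPCC Area`,
                     names_to = "animal",
                     values_to = "weights")

# Stop with an error if the pivoted dimensions do not match the prediction
stopifnot(
  nrow(long) == nrow(stand_in) * (ncol(stand_in) - 1),
  ncol(long) == 3
)
dim(long)
```

With the real data, the same `stopifnot()` check could be run on the saved result of the pivot.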