library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
Tejaswini_Ketineni
August 21, 2022
The data set used for this challenge is `animal_weight`. Its columns are:
[1] "IPCC Area" "Cattle - dairy" "Cattle - non-dairy"
[4] "Buffaloes" "Swine - market" "Swine - breeding"
[7] "Chicken - Broilers" "Chicken - Layers" "Ducks"
[10] "Turkeys" "Sheep" "Goats"
[13] "Horses" "Asses" "Mules"
[16] "Camels" "Llamas"
It has 9 rows
It has 17 columns
The dimensions of the table are 9 × 17.
IPCC Area Cattle - dairy Cattle - non-dairy Buffaloes
Length:9 Min. :275.0 Min. :110 Min. :295.0
Class :character 1st Qu.:275.0 1st Qu.:173 1st Qu.:380.0
Mode :character Median :400.0 Median :330 Median :380.0
Mean :425.4 Mean :298 Mean :370.6
3rd Qu.:550.0 3rd Qu.:391 3rd Qu.:380.0
Max. :604.0 Max. :420 Max. :380.0
Swine - market Swine - breeding Chicken - Broilers Chicken - Layers
Min. :28.00 Min. : 28.0 Min. :0.9 Min. :1.8
1st Qu.:28.00 1st Qu.: 28.0 1st Qu.:0.9 1st Qu.:1.8
Median :45.00 Median :180.0 Median :0.9 Median :1.8
Mean :39.22 Mean :116.4 Mean :0.9 Mean :1.8
3rd Qu.:50.00 3rd Qu.:180.0 3rd Qu.:0.9 3rd Qu.:1.8
Max. :50.00 Max. :198.0 Max. :0.9 Max. :1.8
Ducks Turkeys Sheep Goats Horses
Min. :2.7 Min. :6.8 Min. :28.00 Min. :30.00 Min. :238.0
1st Qu.:2.7 1st Qu.:6.8 1st Qu.:28.00 1st Qu.:30.00 1st Qu.:238.0
Median :2.7 Median :6.8 Median :48.50 Median :38.50 Median :377.0
Mean :2.7 Mean :6.8 Mean :39.39 Mean :34.72 Mean :315.2
3rd Qu.:2.7 3rd Qu.:6.8 3rd Qu.:48.50 3rd Qu.:38.50 3rd Qu.:377.0
Max. :2.7 Max. :6.8 Max. :48.50 Max. :38.50 Max. :377.0
Asses Mules Camels Llamas
Min. :130 Min. :130 Min. :217 Min. :217
1st Qu.:130 1st Qu.:130 1st Qu.:217 1st Qu.:217
Median :130 Median :130 Median :217 Median :217
Mean :130 Mean :130 Mean :217 Mean :217
3rd Qu.:130 3rd Qu.:130 3rd Qu.:217 3rd Qu.:217
Max. :130 Max. :130 Max. :217 Max. :217
Looking at the summary of the data set, we see that there are no missing values.
# A tibble: 6 × 17
IPCC A…¹ Cattl…² Cattl…³ Buffa…⁴ Swine…⁵ Swine…⁶ Chick…⁷ Chick…⁸ Ducks Turkeys
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Indian … 275 110 295 28 28 0.9 1.8 2.7 6.8
2 Eastern… 550 391 380 50 180 0.9 1.8 2.7 6.8
3 Africa 275 173 380 28 28 0.9 1.8 2.7 6.8
4 Oceania 500 330 380 45 180 0.9 1.8 2.7 6.8
5 Western… 600 420 380 50 198 0.9 1.8 2.7 6.8
6 Latin A… 400 305 380 28 28 0.9 1.8 2.7 6.8
# … with 7 more variables: Sheep <dbl>, Goats <dbl>, Horses <dbl>, Asses <dbl>,
# Mules <dbl>, Camels <dbl>, Llamas <dbl>, and abbreviated variable names
# ¹`IPCC Area`, ²`Cattle - dairy`, ³`Cattle - non-dairy`, ⁴Buffaloes,
# ⁵`Swine - market`, ⁶`Swine - breeding`, ⁷`Chicken - Broilers`,
# ⁸`Chicken - Layers`
When we inspect the data using `head()`, we see that each animal type has its own column, so the weights for a single region are spread across 16 columns. Pivoting to a longer format collects these weights into one column and avoids repeating a weight column per animal type. The animal-type column names move into a new column, `Animal Type`, so the pivoted data has 3 columns: `IPCC Area`, `Animal Type`, and `Weight`.
Having already computed the number of rows and columns, we can now compute the expected number of rows in the pivoted data: each of the 9 rows contributes one row for each of the 16 animal-type columns (17 columns minus the `IPCC Area` identifier).
As computed above, the pivoted data should have 9 × 16 = 144 rows and 3 columns.
Now we pivot the data:
# Pivot every animal-type column into name/value pairs; `IPCC Area` stays as the identifier
df <- pivot_longer(
  animal_weight,
  cols = c('Cattle - dairy', 'Cattle - non-dairy', 'Buffaloes', 'Swine - market',
           'Swine - breeding', 'Chicken - Broilers', 'Chicken - Layers', 'Ducks',
           'Turkeys', 'Sheep', 'Goats', 'Horses', 'Asses', 'Mules', 'Camels', 'Llamas'),
  names_to = 'Animal Type',
  values_to = 'Weight'
)
df
# A tibble: 144 × 3
`IPCC Area` `Animal Type` Weight
<chr> <chr> <dbl>
1 Indian Subcontinent Cattle - dairy 275
2 Indian Subcontinent Cattle - non-dairy 110
3 Indian Subcontinent Buffaloes 295
4 Indian Subcontinent Swine - market 28
5 Indian Subcontinent Swine - breeding 28
6 Indian Subcontinent Chicken - Broilers 0.9
7 Indian Subcontinent Chicken - Layers 1.8
8 Indian Subcontinent Ducks 2.7
9 Indian Subcontinent Turkeys 6.8
10 Indian Subcontinent Sheep 28
# … with 134 more rows
We compute the number of rows and columns of the pivoted data and summarize it:
IPCC Area Animal Type Weight
Length:144 Length:144 Min. : 0.9
Class :character Class :character 1st Qu.: 22.7
Mode :character Mode :character Median :130.0
Mean :146.6
3rd Qu.:217.0
Max. :604.0
Running `summary()` confirms that the pivoted data has three columns and no missing values, so it matches the shape we anticipated.
---
title: "Challenge 3 Instructions"
author: "Tejaswini_Ketineni"
description: "Reading in data and creating a post"
date: "08/21/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_1
  - railroads
  - faostat
  - wildbirds
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Challenge Overview
## Read in data
The data set used for this challenge is `animal_weight`.
```{r}
library(readr)
animal_weight <- read_csv("_data/animal_weight.csv")
```
### Briefly describe the data
```{r}
colnames(animal_weight)
```
```{r}
nrow(animal_weight)
```
It has 9 rows
```{r}
ncol(animal_weight)
```
It has 17 columns
```{r}
dim(animal_weight)
```
The dimensions of the table are 9 × 17.
```{r}
summary(animal_weight)
```
Looking at the summary of the data set, we see that there are no missing values.
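`summary()` reports missing values only indirectly: an `NA's` row appears for numeric columns that contain them, and character columns show no missing-value information at all. A more explicit check counts missing values per column. The sketch below uses a small hypothetical data frame, `toy`, standing in for `animal_weight` (its values are illustrative only); the same calls should work on the real table.

```{r}
# Hypothetical stand-in for animal_weight; values are illustrative only
toy <- data.frame(
  area   = c("A", "B", "C"),
  cattle = c(275, 550, NA),   # one deliberately missing value
  sheep  = c(28, 48.5, 28)
)

# Count missing values in each column
na_counts <- colSums(is.na(toy))
na_counts

# TRUE only if no cell in the whole table is missing
no_missing <- !anyNA(toy)
no_missing
```

For the real data, `colSums(is.na(animal_weight))` returning all zeros would back up the claim directly.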
```{r}
head(animal_weight)
```
When we inspect the data using `head()`, we see that each animal type has its own column, so the weights for a single region are spread across 16 columns. Pivoting to a longer format collects these weights into one column and avoids repeating a weight column per animal type. The animal-type column names move into a new column, `Animal Type`, so the pivoted data has 3 columns: `IPCC Area`, `Animal Type`, and `Weight`.
## Anticipate the End Result
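Having already computed the number of rows and columns, we can now compute the expected number of rows in the pivoted data: each of the 9 rows contributes one row for each of the 16 animal-type columns (17 columns minus the `IPCC Area` identifier).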
```{r}
nrow(animal_weight)*(ncol(animal_weight)-1)
```
As computed above, the pivoted data should have 9 × 16 = 144 rows and 3 columns.
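The same arithmetic generalizes to any wide table: every original row yields one long-format row per pivoted column, and the long table keeps the identifier columns plus one names column and one values column. A minimal sketch, where `expected_long_dim()` is a hypothetical helper (not part of tidyr) that assumes a single `names_to` and a single `values_to` column:

```{r}
# Hypothetical helper: expected dimensions after pivoting to long format.
#   n_rows: rows in the wide table
#   n_cols: columns in the wide table
#   n_id:   identifier columns that are NOT pivoted
expected_long_dim <- function(n_rows, n_cols, n_id = 1) {
  c(rows = n_rows * (n_cols - n_id), cols = n_id + 2)
}

# The wide table here has 9 rows and 17 columns, with 1 identifier column
expected <- expected_long_dim(9, 17)
expected
```

With the dimensions reported earlier, this gives 144 rows and 3 columns.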
## Pivot the Data
Now we pivot the data:
```{r}
# Pivot every animal-type column into name/value pairs; `IPCC Area` stays as the identifier
df <- pivot_longer(
  animal_weight,
  cols = c('Cattle - dairy', 'Cattle - non-dairy', 'Buffaloes', 'Swine - market',
           'Swine - breeding', 'Chicken - Broilers', 'Chicken - Layers', 'Ducks',
           'Turkeys', 'Sheep', 'Goats', 'Horses', 'Asses', 'Mules', 'Camels', 'Llamas'),
  names_to = 'Animal Type',
  values_to = 'Weight'
)
df
```
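Listing all 16 columns by name works, but it is easy to mistype or omit one. Because `cols` accepts tidyselect syntax, an alternative is to select everything except the identifier column. The sketch below illustrates this on a small hypothetical tibble, `toy_wide`; its column names echo the real data, but the table and its values are assumptions made for illustration.

```{r}
library(tidyr)
library(tibble)

# Hypothetical wide table: one identifier column, two animal-type columns
toy_wide <- tibble(
  `IPCC Area` = c("Region A", "Region B"),
  Sheep = c(28, 48.5),
  Goats = c(30, 38.5)
)

# Pivot every column except the identifier
toy_long <- pivot_longer(
  toy_wide,
  cols = -`IPCC Area`,
  names_to = "Animal Type",
  values_to = "Weight"
)
toy_long
```

Applied to the real table, `cols = -'IPCC Area'` should select the same 16 columns as the explicit list.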
### Cross-checking that the pivoted data meets expectations
We compute the number of rows and columns of the pivoted data and summarize it:
```{r}
nrow(df)
```
```{r}
ncol(df)
```
```{r}
dim(df)
```
```{r}
summary(df)
```
Running `summary()` confirms that the pivoted data has three columns and no missing values, so it matches the shape we anticipated.
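The checks above rely on reading the printed output. They can also be written as assertions that stop the render if the expectation fails. A minimal sketch using base R's `stopifnot()` on a hypothetical long-format table, `long` (2 regions × 3 animal types, with illustrative values):

```{r}
# Hypothetical long-format table: 2 regions x 3 animal types
long <- expand.grid(
  area   = c("Region A", "Region B"),
  animal = c("Sheep", "Goats", "Ducks"),
  stringsAsFactors = FALSE
)
long$weight <- c(28, 48.5, 30, 38.5, 2.7, 2.7)

# Expected shape: (regions * animal types) rows, 3 columns
expected_dim <- c(2 * 3, 3)

# stopifnot() raises an error if any condition is not TRUE
stopifnot(
  identical(dim(long), as.integer(expected_dim)),
  !anyNA(long)
)
checks_passed <- TRUE
```

For the pivoted data in this post, the equivalent check would be `stopifnot(nrow(df) == 144, ncol(df) == 3, !anyNA(df))`.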