library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
Akhilesh Kumar Meghwal
August 22, 2022
Today’s challenge is to:
Variable | Stats / Values | Freqs (% of Valid) | Graph | Missing
---|---|---|---|---
IPCC Area [character] | | | | 0 (0.0%)
Cattle - dairy [numeric] | | | | 0 (0.0%)
Cattle - non-dairy [numeric] | | | | 0 (0.0%)
Buffaloes [numeric] | | | | 0 (0.0%)
Swine - market [numeric] | | | | 0 (0.0%)
Swine - breeding [numeric] | | | | 0 (0.0%)
Chicken - Broilers [numeric] | 1 distinct value | | | 0 (0.0%)
Chicken - Layers [numeric] | 1 distinct value | | | 0 (0.0%)
Ducks [numeric] | 1 distinct value | | | 0 (0.0%)
Turkeys [numeric] | 1 distinct value | | | 0 (0.0%)
Sheep [numeric] | | | | 0 (0.0%)
Goats [numeric] | | | | 0 (0.0%)
Horses [numeric] | | | | 0 (0.0%)
Asses [numeric] | 1 distinct value | | | 0 (0.0%)
Mules [numeric] | 1 distinct value | | | 0 (0.0%)
Camels [numeric] | 1 distinct value | | | 0 (0.0%)
Llamas [numeric] | 1 distinct value | | | 0 (0.0%)
Generated by summarytools 1.0.1 (R version 4.2.1)
2022-09-04
t.data.frame.lapply.animal_weight..class...
IPCC.Area character
Cattle...dairy numeric
Cattle...non.dairy numeric
Buffaloes numeric
Swine...market numeric
Swine...breeding numeric
Chicken...Broilers numeric
Chicken...Layers numeric
Ducks numeric
Turkeys numeric
Sheep numeric
Goats numeric
Horses numeric
Asses numeric
Mules numeric
Camels numeric
Llamas numeric
IPCC Area: Length 9, Class character, Mode character

Variable | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max.
---|---|---|---|---|---|---
Cattle - dairy | 275.0 | 275.0 | 400.0 | 425.4 | 550.0 | 604.0
Cattle - non-dairy | 110 | 173 | 330 | 298 | 391 | 420
Buffaloes | 295.0 | 380.0 | 380.0 | 370.6 | 380.0 | 380.0
Swine - market | 28.00 | 28.00 | 45.00 | 39.22 | 50.00 | 50.00
Swine - breeding | 28.0 | 28.0 | 180.0 | 116.4 | 180.0 | 198.0
Chicken - Broilers | 0.9 | 0.9 | 0.9 | 0.9 | 0.9 | 0.9
Chicken - Layers | 1.8 | 1.8 | 1.8 | 1.8 | 1.8 | 1.8
Ducks | 2.7 | 2.7 | 2.7 | 2.7 | 2.7 | 2.7
Turkeys | 6.8 | 6.8 | 6.8 | 6.8 | 6.8 | 6.8
Sheep | 28.00 | 28.00 | 48.50 | 39.39 | 48.50 | 48.50
Goats | 30.00 | 30.00 | 38.50 | 34.72 | 38.50 | 38.50
Horses | 238.0 | 238.0 | 377.0 | 315.2 | 377.0 | 377.0
Asses | 130 | 130 | 130 | 130 | 130 | 130
Mules | 130 | 130 | 130 | 130 | 130 | 130
Camels | 217 | 217 | 217 | 217 | 217 | 217
Llamas | 217 | 217 | 217 | 217 | 217 | 217
# A tibble: 144 × 3
`IPCC Area` animal_name weight
<chr> <chr> <dbl>
1 Indian Subcontinent Cattle - dairy 275
2 Indian Subcontinent Cattle - non-dairy 110
3 Indian Subcontinent Buffaloes 295
4 Indian Subcontinent Swine - market 28
5 Indian Subcontinent Swine - breeding 28
6 Indian Subcontinent Chicken - Broilers 0.9
7 Indian Subcontinent Chicken - Layers 1.8
8 Indian Subcontinent Ducks 2.7
9 Indian Subcontinent Turkeys 6.8
10 Indian Subcontinent Sheep 28
# … with 134 more rows
# ℹ Use `print(n = ...)` to see more rows
# A tibble: 144 × 3
`IPCC Area` animal_name weight
<fct> <fct> <dbl>
1 Indian Subcontinent Cattle - dairy 275
2 Indian Subcontinent Cattle - non-dairy 110
3 Indian Subcontinent Buffaloes 295
4 Indian Subcontinent Swine - market 28
5 Indian Subcontinent Swine - breeding 28
6 Indian Subcontinent Chicken - Broilers 0.9
7 Indian Subcontinent Chicken - Layers 1.8
8 Indian Subcontinent Ducks 2.7
9 Indian Subcontinent Turkeys 6.8
10 Indian Subcontinent Sheep 28
# … with 134 more rows
# ℹ Use `print(n = ...)` to see more rows
---
title: "Challenge 4 Akhilesh"
author: "Akhilesh Kumar Meghwal"
description: "More data wrangling: pivoting"
date: "08/22/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_4
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Challenge Overview
Today's challenge is to:
1) read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
2) tidy data (as needed, including sanity checks)
3) identify variables that need to be mutated
4) mutate variables and sanity check all mutations
## Read in data
```{r}
animal_weight <- read_csv("_data/animal_weight.csv",
                          show_col_types = FALSE)
```
### Briefly describe the data
```{r}
print(summarytools::dfSummary(animal_weight,
                              varnumbers = FALSE,
                              plain.ascii = FALSE,
                              style = "grid",
                              graph.magnif = 0.50,
                              valid.col = FALSE),
      method = 'render',
      table.classes = 'table-condensed')
```
##### Column names of the dataframe
```{r}
colnames(animal_weight)
```
##### Column classes of the dataframe
```{r}
col_classes = data.frame(t(data.frame(lapply(animal_weight,class))))
col_classes
```
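The `data.frame(t(...))` approach passes the column names through `data.frame()`'s name checking, which is why `IPCC Area` is displayed as `IPCC.Area` and `Cattle - dairy` as `Cattle...dairy`. One possible alternative that keeps the original names is sketched below. The table `toy_wide` and its values are made up for illustration and are not the challenge data.

```r
library(tibble)

# Hypothetical stand-in for animal_weight (values are illustrative only)
toy_wide <- tibble(
  `IPCC Area` = c("Region A", "Region B"),
  `Cattle - dairy` = c(275, 604)
)

# One row per column: the original column name and its first class.
# vapply() guarantees a character result of length 1 per column.
col_classes_tbl <- tibble(
  column = names(toy_wide),
  class = vapply(toy_wide, function(x) class(x)[1], character(1),
                 USE.NAMES = FALSE)
)

col_classes_tbl
```

Because the names are stored as values in a `column` variable rather than as row names, spaces and hyphens are left untouched.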
##### Summary, Dataframe
```{r}
summary(animal_weight)
```
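The summary shows several columns (for example Ducks, Turkeys and Asses) whose minimum and maximum are identical, meaning the same weight is recorded for every region. A minimal sketch of how such constant columns could be listed programmatically is given below. `toy_wide` is a made-up table used only so the example runs on its own.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for animal_weight (values are illustrative only)
toy_wide <- tibble(
  `IPCC Area` = c("Region A", "Region B", "Region C"),
  Sheep = c(28, 48.5, 48.5),
  Ducks = c(2.7, 2.7, 2.7)
)

# Count the distinct values in every numeric column
distinct_counts <- toy_wide %>%
  summarise(across(where(is.numeric), n_distinct))

# Keep the names of columns that hold a single distinct value
constant_cols <- names(distinct_counts)[unlist(distinct_counts) == 1]
constant_cols
```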
## Tidy Data (as needed)
##### Is your data already tidy, or is there work to be done? Be sure to anticipate your end result to provide a sanity check, and document your work here.
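##### Tidy the data using pivot_longer, so that each row represents a single observation and each column a single variable. The original data has 9 rows and 16 animal columns, so the pivoted data should have 9 × 16 = 144 rows and 3 columns.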
```{r}
# Pivot the 16 animal columns (columns 2 to 17) into name/value pairs
animal_weight_pivot <- pivot_longer(animal_weight,
                                    cols = names(animal_weight)[2:17],
                                    names_to = 'animal_name',
                                    values_to = 'weight')
animal_weight_pivot
```
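The expected shape of the pivoted data can also be computed rather than counted by hand: the number of rows in the wide table multiplied by the number of pivoted columns, with three columns in the result. The sketch below illustrates the check on a small made-up table. `toy_wide`, `toy_long` and the values are assumptions for illustration, not the challenge data.

```r
library(tidyr)
library(tibble)

# Hypothetical wide table: one id column plus one column per animal
toy_wide <- tibble(
  `IPCC Area` = c("Region A", "Region B", "Region C"),
  Sheep = c(28, 48.5, 48.5),
  Goats = c(30, 38.5, 38.5)
)

# Anticipated row count: rows in the wide table x pivoted columns
expected_rows <- nrow(toy_wide) * (ncol(toy_wide) - 1)

# Pivot every column except the id column
toy_long <- pivot_longer(
  toy_wide,
  cols = -`IPCC Area`,
  names_to = "animal_name",
  values_to = "weight"
)

# Stop with an error if the pivot did not produce the anticipated shape
stopifnot(
  nrow(toy_long) == expected_rows,
  ncol(toy_long) == 3
)
```

Applied to the real data, the same two lines inside `stopifnot()` would compare `animal_weight_pivot` against `nrow(animal_weight) * (ncol(animal_weight) - 1)`.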
## Identify variables that need to be mutated
##### col_classes below gives the class of each column in the dataframe animal_weight_pivot
##### 'IPCC Area' and 'animal_name' are of class character and are converted to factors using mutate_at
```{r}
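col_classes = data.frame(t(data.frame(lapply(animal_weight_pivot,class))))
col_classes
# Assign the result back so the factor conversion persists for the sanity check
animal_weight_pivot <- animal_weight_pivot %>%
  mutate_at(c('IPCC Area', 'animal_name'), factor)
animal_weight_pivot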
```
##### Recompute col_classes as a sanity check
```{r}
col_classes = data.frame(t(data.frame(lapply(animal_weight_pivot,class))))
col_classes
```
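Beyond inspecting the classes by eye, the conversion can be verified with assertions: the two label columns should be factors, the labels themselves should be unchanged, and `weight` should remain numeric. The sketch below shows one way to do this on made-up data. `toy_long` and `toy_factored` are illustrative names, and it uses `across()`, which dplyr documents as the replacement for the superseded `mutate_at()`.

```r
library(dplyr)
library(tibble)

# Hypothetical long table standing in for animal_weight_pivot
toy_long <- tibble(
  `IPCC Area` = c("Region A", "Region A", "Region B", "Region B"),
  animal_name = c("Sheep", "Goats", "Sheep", "Goats"),
  weight = c(28, 30, 48.5, 38.5)
)

# Convert both label columns to factors and keep the result
toy_factored <- toy_long %>%
  mutate(across(c(`IPCC Area`, animal_name), factor))

# Sanity checks: classes changed as intended, values did not
stopifnot(
  is.factor(toy_factored$`IPCC Area`),
  is.factor(toy_factored$animal_name),
  is.numeric(toy_factored$weight),
  identical(as.character(toy_factored$animal_name), toy_long$animal_name),
  nlevels(toy_factored$animal_name) == n_distinct(toy_long$animal_name)
)
```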