library(tidyverse)
library(readr)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
Kim Darkenwald
August 17, 2022
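Today’s challenge is to:
1. read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
2. identify what needs to be done to tidy the current data
3. anticipate the shape of pivoted data
4. pivot the data into tidy format using pivot_longer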
Read in one (or more) of the following datasets, using the correct R package and command.
[1] 9 17
As indicated in our data, for the most part animals share similar weights across regions of the globe. However, buffalo, cattle, and swine show distinct differences in weight: these animals appear to be much heavier in the Northern American and European regions, while in the Middle East, Africa, and the Indian Subcontinent they weigh considerably less.
I’m not sure why or how I would pivot this.
# A tibble: 6 × 5
country year trade outgoing incoming
<chr> <dbl> <chr> <dbl> <dbl>
1 Mexico 1980 NAFTA 1243. -134.
2 USA 1990 NAFTA 1271. 666.
3 France 1980 EU 695. 1086.
4 Mexico 1990 NAFTA 1135. 1424.
5 USA 1980 NAFTA 1021. 1545.
6 France 1990 EU 1042. 900.
[1] 6
[1] 5
[1] 12
[1] 5
Our simple example has \(n = 6\) rows and \(k - 3 = 2\) variables being pivoted, so we expect the new dataframe to have \(n \times 2 = 12\) rows and \(3 + 2 = 5\) columns.
There are 9 rows and 17 columns, so n = 9 and k = 17. I do not see how I would pivot this. Is it because some of the columns have the same number of animals, so you would eliminate them?
# A tibble: 36 × 1
animal
<chr>
1 Cattle_Dairy
2 Cattle_Nondairy
3 Swine_Market
4 Swine_Breeding
5 Cattle_Dairy
6 Cattle_Nondairy
7 Swine_Market
8 Swine_Breeding
9 Cattle_Dairy
10 Cattle_Nondairy
# … with 26 more rows
# ℹ Use `print(n = ...)` to see more rows
[1] 36
[1] 1
Error in nrow(fd): object 'fd' not found
Error in `chr_as_locations()`:
! Can't subset columns that don't exist.
✖ Column `outgoing` doesn't exist.
# A tibble: 36 × 1
animal
<chr>
1 Cattle_Dairy
2 Cattle_Nondairy
3 Swine_Market
4 Swine_Breeding
5 Cattle_Dairy
6 Cattle_Nondairy
7 Swine_Market
8 Swine_Breeding
9 Cattle_Dairy
10 Cattle_Nondairy
# … with 26 more rows
# ℹ Use `print(n = ...)` to see more rows
Yes, once it is pivoted long, our resulting data are \(12 \times 5\), exactly what we expected!
Document your work here. What will a new “case” be once you have pivoted the data? How does it meet requirements for tidy data?
Any additional comments?
---
title: "Challenge 3 Instructions"
author: "Kim Darkenwald"
description: "Tidy Data: Pivoting"
date: "08/17/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_3
---
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
library(readr)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Challenge Overview
Today's challenge is to:
1. read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
2. identify what needs to be done to tidy the current data
3. anticipate the shape of pivoted data
4. pivot the data into tidy format using `pivot_longer`
## Read in data
Read in one (or more) of the following datasets, using the correct R package and command.
- animal_weights.csv ⭐
- eggs_tidy.csv ⭐⭐ or organicpoultry.xls ⭐⭐⭐
- australian_marriage\*.xlsx ⭐⭐⭐
- USA Households\*.xlsx ⭐⭐⭐⭐
- sce_labor_chart_data_public.csv 🌟🌟🌟🌟🌟
```{r}
animal_weight <- read_csv("_data/animal_weight.csv",
                          show_col_types = FALSE)
view(animal_weight)
dim(animal_weight)
```
### Briefly describe the data
As indicated in our data, for the most part animals share similar weights across regions of the globe. However, buffalo, cattle, and swine show distinct differences in weight: these animals appear to be much heavier in the Northern American and European regions, while in the Middle East, Africa, and the Indian Subcontinent they weigh considerably less.
I'm not sure why or how I would pivot this.
### Example: find current and future data dimensions
```{r}
#| tbl-cap: Example
df <- tibble(country = rep(c("Mexico", "USA", "France"), 2),
             year = rep(c(1980, 1990), 3),
             trade = rep(c("NAFTA", "NAFTA", "EU"), 2),
             outgoing = rnorm(6, mean = 1000, sd = 500),
             incoming = rlogis(6, location = 1000,
                               scale = 400))
df
# existing rows/cases
nrow(df)
# existing columns/variables
ncol(df)
# expected rows/cases: each row repeats once per pivoted column
nrow(df) * (ncol(df) - 3)
# expected columns: 3 identifier columns + 1 names column + 1 values column
3 + 2
```
Our simple example has $n = 6$ rows and $k - 3 = 2$ variables being pivoted, so we expect the new dataframe to have $n \times 2 = 12$ rows and $3 + 2 = 5$ columns.
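The same arithmetic can be wrapped in a small helper and checked against an actual pivot. This is only a sketch: `expected_long_dims()` and its `n_id` argument (the number of identifier columns left unpivoted) are hypothetical names introduced here, and the trade values are fixed placeholder numbers rather than the random draws used in the example.

```{r}
library(tibble)
library(tidyr)

# Hypothetical helper: expected shape after pivot_longer(), assuming
# n_id identifier columns stay put and every other column is pivoted
# into one names column and one values column
expected_long_dims <- function(data, n_id) {
  c(rows = nrow(data) * (ncol(data) - n_id), cols = n_id + 2)
}

# Same layout as the example, with fixed placeholder values
example <- tibble(country  = rep(c("Mexico", "USA", "France"), 2),
                  year     = rep(c(1980, 1990), 3),
                  trade    = rep(c("NAFTA", "NAFTA", "EU"), 2),
                  outgoing = c(1000, 1100, 900, 1200, 950, 1050),
                  incoming = c(800, 1300, 1000, 1400, 1500, 900))

expected <- expected_long_dims(example, n_id = 3)
pivoted  <- pivot_longer(example, cols = c(outgoing, incoming),
                         names_to = "direction", values_to = "value")

expected                       # rows = 12, cols = 5
all(dim(pivoted) == expected)  # TRUE
```

Comparing the predicted and actual dimensions before moving on is a cheap way to catch a pivot that picked up the wrong columns.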
### Challenge: Describe the final dimensions
There are 9 rows and 17 columns, so n = 9 and k = 17. I do not see how I would pivot this. Is it because some of the columns have the same number of animals, so you would eliminate them?
```{r}
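# use a separate name so the example `df` is not overwritten
animals <- tibble(animal = rep(c("Cattle_Dairy", "Cattle_Nondairy", "Swine_Market",
                                 "Swine_Breeding"), 9))
animals
nrow(animals)
ncol(animals)
# formula copied from the example; it assumes 3 identifier columns,
# which this one-column tibble does not have
nrow(animals) * (ncol(animals) - 3)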
```
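One way to anticipate the final shape is sketched here, under an assumption the text above does not confirm: that exactly one of the 17 columns identifies the region and the other 16 each hold weights for a different animal type. If that holds, the 16 weight columns are the ones to pivot, and none are eliminated.

```{r}
n    <- 9    # rows reported by dim(animal_weight)
k    <- 17   # columns reported by dim(animal_weight)
n_id <- 1    # ASSUMPTION: a single identifier (region) column

expected_rows <- n * (k - n_id)  # one row per region-animal pair
expected_cols <- n_id + 2        # identifier + names column + values column
c(rows = expected_rows, cols = expected_cols)  # rows = 144, cols = 3
```

If the data turn out to have more than one identifier column, only `n_id` needs to change.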
### Example
```{r}
#| tbl-cap: Pivoted Example
df <- pivot_longer(df, cols = c(outgoing, incoming),
                   names_to = "trade_direction",
                   values_to = "trade_value")
df
```
Yes, once it is pivoted long, our resulting data are $12 \times 5$, exactly what we expected!
### Challenge: Pivot the Chosen Data
Document your work here. What will a new "case" be once you have pivoted the data? How does it meet requirements for tidy data?
```{r}
```
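A possible approach is sketched below on placeholder data rather than the real file. It assumes one identifier column plus 16 weight columns; the names `region`, `animal_1` to `animal_16`, `animal_type`, and `weight` are hypothetical, as are the values. Under that assumption, a case after pivoting would be one region-animal combination, which fits the tidy-data requirements that each variable has its own column and each observation its own row.

```{r}
library(tibble)
library(tidyr)

# Placeholder data with the same 9 x 17 shape as animal_weight.
# ASSUMPTION: one identifier column (here called `region`) plus 16 weight
# columns; the names and values are made up for illustration.
weights  <- matrix(seq_len(9 * 16), nrow = 9,
                   dimnames = list(NULL, paste0("animal_", 1:16)))
stand_in <- add_column(as_tibble(weights),
                       region = paste0("region_", 1:9), .before = 1)

# Pivot every column except the identifier
long <- pivot_longer(stand_in, cols = -region,
                     names_to = "animal_type", values_to = "weight")

dim(long)  # 144 3
# Each region-animal pair should appear exactly once
nrow(unique(long[c("region", "animal_type")])) == nrow(long)  # TRUE
```

With the real data, the same call would use the actual identifier column's name in place of `region`.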
Any additional comments?