DACSS 601: Data Science Fundamentals - FALL 2022
  • Fall 2022 Posts
  • Contributors
  • DACSS

Challenge 4

  • Course information
    • Overview
    • Instructional Team
    • Course Schedule
  • Weekly materials
    • Fall 2022 posts
    • final posts

On this page

  • Challenge Overview
  • Read in data
    • Briefly describe the data
  • Tidy Data (as needed)
  • Identify variables that need to be mutated

Challenge 4

  • Show All Code
  • Hide All Code

  • View Source
challenge_4
abc_poll
eggs
fed_rates
hotel_bookings
debt
Author

Lai Wei

Published

November 9, 2022

Code
library(tidyverse)
library(readxl)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Challenge Overview

Today’s challenge is to:

  1. read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
  2. tidy data (as needed, including sanity checks)
  3. identify variables that need to be mutated
  4. mutate variables and sanity check all mutations

Read in data

Read in one (or more) of the following datasets, using the correct R package and command.

  • abc_poll.csv ⭐
  • poultry_tidy.xlsx or organiceggpoultry.xls⭐⭐
  • FedFundsRate.csv⭐⭐⭐
  • hotel_bookings.csv⭐⭐⭐⭐
  • debt_in_trillions.xlsx ⭐⭐⭐⭐⭐
Code
Fed <- read_csv("D:/Umass Amherst/DACSS 601/601_Fall_2022/posts/_data/eggs_tidy.csv")
Error: 'D:/Umass Amherst/DACSS 601/601_Fall_2022/posts/_data/eggs_tidy.csv' does not exist.
Code
Fed
Error in eval(expr, envir, enclos): object 'Fed' not found

Briefly describe the data

Tidy Data (as needed)

Using colnames() function to see each colnames’ catagory.

Code
Fed %>% 
  colnames()
Error in is.data.frame(x): object 'Fed' not found

They look good enough to be understood. I will leave them like that.

Identify variables that need to be mutated

Using mutate() function to create a new colname showing the total amount of dozen in each month.

Code
Fed_total <- mutate(Fed, total = large_half_dozen + large_dozen + extra_large_half_dozen + extra_large_dozen) 
Error in mutate(Fed, total = large_half_dozen + large_dozen + extra_large_half_dozen + : object 'Fed' not found
Code
Fed_total
Error in eval(expr, envir, enclos): object 'Fed_total' not found

Using summarise function to calculate the total number of each year.

Code
group_by(Fed_total,year) %>% 
  summarise(Total = sum(total))
Error in group_by(Fed_total, year): object 'Fed_total' not found
Source Code
---
title: "Challenge 4"
author: "Lai Wei"
desription: "More data wrangling: pivoting"
date: "11/09/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_4
  - abc_poll
  - eggs
  - fed_rates
  - hotel_bookings
  - debt
---

```{r}
#| label: setup
#| warning: false
#| message: false

library(tidyverse)
library(readxl)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```

## Challenge Overview

Today's challenge is to:

1)  read in a data set, and describe the data set using both words and any supporting information (e.g., tables, etc)
2)  tidy data (as needed, including sanity checks)
3)  identify variables that need to be mutated
4)  mutate variables and sanity check all mutations

## Read in data

Read in one (or more) of the following datasets, using the correct R package and command.

-   abc_poll.csv ⭐
-   poultry_tidy.xlsx or organiceggpoultry.xls⭐⭐
-   FedFundsRate.csv⭐⭐⭐
-   hotel_bookings.csv⭐⭐⭐⭐
-   debt_in_trillions.xlsx ⭐⭐⭐⭐⭐

```{r}
Fed <- read_csv("D:/Umass Amherst/DACSS 601/601_Fall_2022/posts/_data/eggs_tidy.csv")
Fed
```

### Briefly describe the data

## Tidy Data (as needed)

Using colnames() function to see each colnames' catagory. 
```{r}
Fed %>% 
  colnames()
```
They look good enough to be understood. I will leave them like that. 


## Identify variables that need to be mutated

Using mutate() function to create a new colname showing the total amount of dozen in each month. 

```{r}
Fed_total <- mutate(Fed, total = large_half_dozen + large_dozen + extra_large_half_dozen + extra_large_dozen) 
Fed_total
```
Using summarise function to calculate the total number of each year. 
```{r}
group_by(Fed_total,year) %>% 
  summarise(Total = sum(total))
```