DACSS 601: Data Science Fundamentals - FALL 2022

Challenge 1

challenge_1
railroads
faostat
wildbirds
Bird Data Summary
Author

Karla Barrett-Dexter

Published

September 25, 2022

Code
library(tidyverse)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Challenge Overview

Today’s challenge is to

  1. read in a dataset, and

  2. describe the dataset using both words and any supporting information (e.g., tables)

Read in the Data

Code
birds <- read_csv("_data/birds.csv")
birds
# A tibble: 30,977 × 14
   Domain Cod…¹ Domain Area …² Area  Eleme…³ Element Item …⁴ Item  Year …⁵  Year
   <chr>        <chr>    <dbl> <chr>   <dbl> <chr>     <dbl> <chr>   <dbl> <dbl>
 1 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1961  1961
 2 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1962  1962
 3 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1963  1963
 4 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1964  1964
 5 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1965  1965
 6 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1966  1966
 7 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1967  1967
 8 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1968  1968
 9 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1969  1969
10 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1970  1970
# … with 30,967 more rows, 4 more variables: Unit <chr>, Value <dbl>,
#   Flag <chr>, `Flag Description` <chr>, and abbreviated variable names
#   ¹​`Domain Code`, ²​`Area Code`, ³​`Element Code`, ⁴​`Item Code`, ⁵​`Year Code`
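
The printed tibble truncates column names and hides four columns. A minimal sketch of how every column could be inspected at once, assuming dplyr is available; `toy_birds` and its values are made-up stand-ins for a few columns of birds.csv, not real data:

```r
library(dplyr)

# Hypothetical toy rows that mimic a few columns of birds.csv (values are invented)
toy_birds <- tibble(
  Area    = c("Armenia", "Armenia", "Afghanistan"),
  Element = c("Stocks", "Stocks", "Stocks"),
  Item    = c("Chickens", "Turkeys", "Chickens"),
  Year    = c(1992, 1992, 1961),
  Unit    = c("1000 Head", "1000 Head", "1000 Head"),
  Value   = c(100, 10, 200)
)

# glimpse() prints one line per column with its type, so nothing is cut off
glimpse(toy_birds)

# names() returns the full, unabbreviated column names
column_names <- names(toy_birds)
column_names
```

On the real data, the same two calls would be applied to `birds` instead of `toy_birds`.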

Describe the data

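The data appear to show stocks of live birds by country and year, with values that seem to be reported in units of 1,000 head, rather than prices: the Element column reads "Stocks" and the Domain column begins with "Live".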
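
One way to check what the Value column measures is to list the distinct Element and Unit combinations. A minimal sketch, assuming dplyr; `toy_birds` is hypothetical toy data and its Unit values are invented for illustration, so the real file may differ:

```r
library(dplyr)

# Hypothetical toy rows (invented values, not taken from birds.csv)
toy_birds <- tibble(
  Element = c("Stocks", "Stocks", "Stocks"),
  Unit    = c("1000 Head", "1000 Head", "1000 Head"),
  Value   = c(100, 10, 200)
)

# distinct() keeps one row per unique Element/Unit pair,
# which shows what is being measured and in which unit
measure_units <- toy_birds %>% distinct(Element, Unit)
measure_units
```

If the real data returned a single row here, every Value would share one meaning and one unit.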

Basic information about the data is as follows: there are 5 categories of birds and 248 countries (distinct values of the Area column) in the data set, and the data covers 58 years, 1961-2018.

Code
birds %>% count(Item) # Count the rows (records) in each bird category; n is a row count, not a number of birds
# A tibble: 5 × 2
  Item                       n
  <chr>                  <int>
1 Chickens               13074
2 Ducks                   6909
3 Geese and guinea fowls  4136
4 Pigeons, other birds    1165
5 Turkeys                 5693
Code
Countries <- birds %>% count(Area) # One row per distinct Area, with its number of records
count(Countries) # Number of rows in Countries, i.e. the number of distinct areas in the data
# A tibble: 1 × 1
      n
  <int>
1   248
Code
Years <- birds %>% count(Year) # One row per distinct Year; count() returns them in ascending order
pull(Years, Year) # Extract the Year column as a vector (first() on a data frame behaves differently across dplyr versions)
 [1] 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975
[16] 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990
[31] 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005
[46] 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
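
The three counts above could also be computed in a single pass. A minimal sketch, assuming dplyr; `toy_birds` is made-up toy data and `overview` is a hypothetical name, so the numbers are not those of the real file:

```r
library(dplyr)

# Hypothetical toy rows (invented, not taken from birds.csv)
toy_birds <- tibble(
  Area = c("Armenia", "Armenia", "Afghanistan", "Afghanistan"),
  Item = c("Chickens", "Turkeys", "Chickens", "Chickens"),
  Year = c(1992, 1992, 1961, 1962)
)

# summarise() collapses the table to one row of summary statistics;
# n_distinct() counts unique values in a column
overview <- toy_birds %>%
  summarise(
    n_items    = n_distinct(Item),
    n_areas    = n_distinct(Area),
    first_year = min(Year),
    last_year  = max(Year),
    n_years    = n_distinct(Year)
  )
overview
```

Counting distinct years, rather than subtracting the first year from the last, also reveals any gaps in the series.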

Filtering the data down to Armenia shows 2 categories of birds, Chickens and Turkeys. Armenia's data spans 27 years, 1992-2018.

Code
ArmeniaBirds <- filter(birds, Area == "Armenia")
ArmeniaBirds
# A tibble: 54 × 14
   Domain Cod…¹ Domain Area …² Area  Eleme…³ Element Item …⁴ Item  Year …⁵  Year
   <chr>        <chr>    <dbl> <chr>   <dbl> <chr>     <dbl> <chr>   <dbl> <dbl>
 1 QA           Live …       1 Arme…    5112 Stocks     1057 Chic…    1992  1992
 2 QA           Live …       1 Arme…    5112 Stocks     1057 Chic…    1993  1993
 3 QA           Live …       1 Arme…    5112 Stocks     1057 Chic…    1994  1994
 4 QA           Live …       1 Arme…    5112 Stocks     1057 Chic…    1995  1995
 5 QA           Live …       1 Arme…    5112 Stocks     1057 Chic…    1996  1996
 6 QA           Live …       1 Arme…    5112 Stocks     1057 Chic…    1997  1997
 7 QA           Live …       1 Arme…    5112 Stocks     1057 Chic…    1998  1998
 8 QA           Live …       1 Arme…    5112 Stocks     1057 Chic…    1999  1999
 9 QA           Live …       1 Arme…    5112 Stocks     1057 Chic…    2000  2000
10 QA           Live …       1 Arme…    5112 Stocks     1057 Chic…    2001  2001
# … with 44 more rows, 4 more variables: Unit <chr>, Value <dbl>, Flag <chr>,
#   `Flag Description` <chr>, and abbreviated variable names ¹​`Domain Code`,
#   ²​`Area Code`, ³​`Element Code`, ⁴​`Item Code`, ⁵​`Year Code`
Code
ArmeniaBirds %>% count(Item)
# A tibble: 2 × 2
  Item         n
  <chr>    <int>
1 Chickens    27
2 Turkeys     27
Code
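ArmeniaYears <- ArmeniaBirds %>% count(Year) # One row per distinct Year in Armenia's records
pull(ArmeniaYears, Year) # Extract the Year column as a vector (first() on a data frame behaves differently across dplyr versions)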
 [1] 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006
[16] 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018
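
The Armenia checks can be combined to show the year coverage of each bird category in one table. A minimal sketch, assuming dplyr; `toy_birds` is made-up toy data and `armenia_coverage` is a hypothetical name:

```r
library(dplyr)

# Hypothetical toy rows (invented, not taken from birds.csv)
toy_birds <- tibble(
  Area = c("Armenia", "Armenia", "Armenia", "Armenia", "Afghanistan"),
  Item = c("Chickens", "Chickens", "Turkeys", "Turkeys", "Chickens"),
  Year = c(1992, 1993, 1992, 1993, 1961)
)

# Keep one area, then summarise the span of years within each bird category
armenia_coverage <- toy_birds %>%
  filter(Area == "Armenia") %>%
  group_by(Item) %>%
  summarise(
    first_year = min(Year),
    last_year  = max(Year),
    n_years    = n_distinct(Year),
    .groups    = "drop"
  )
armenia_coverage
```

Grouping by Item makes it visible whether each category covers the same years, which a count of years across all categories combined would not show.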