Challenge 1

challenge_1

railroads

faostat

wildbirds

Reading in data and creating a post

Author

Noah Dixon

Published

June 2, 2023

Setup

Code

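library(tidyverse)  # attaches readr, dplyr, and the other core tidyverse packages
library(dplyr)      # already attached by tidyverse; kept for explicitness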

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Part 1: Read in the Data

Using the read_csv function, we can load the data from the birds.csv file.

Code

birds_from_csv <- read_csv("_data/birds.csv")
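As a minimal sketch of what read_csv does, the example below writes a tiny, made-up CSV to a temporary file and reads it back. The file contents and the names `toy_path` and `toy_birds` are illustrative assumptions, not the real birds.csv; `show_col_types = FALSE` only silences the column-specification message.

```r
library(readr)

# Hypothetical three-row extract; the values are made up for illustration.
toy_path <- tempfile(fileext = ".csv")
writeLines(
  c("Area,Item,Year,Value",
    "Afghanistan,Chickens,1961,10",
    "Afghanistan,Chickens,1962,11",
    "Albania,Ducks,1961,5"),
  toy_path
)

# read_csv guesses each column's type from the file contents.
toy_birds <- read_csv(toy_path, show_col_types = FALSE)

dim(toy_birds)            # number of rows, then number of columns
sapply(toy_birds, class)  # the guessed type of each column
```

Because the types are guessed, it is worth checking them (as the next part does) before relying on a column being numeric or character.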

Part 2: Describe the data

Using the dim function, we can see the dimensions of the data.

Code

dim(birds_from_csv)

[1] 30977    14

The data set has 30977 rows and 14 columns. Using the colnames and spec functions, we can see the name and data type of each of the 14 columns.

Code

colnames(birds_from_csv)

 [1] "Domain Code"      "Domain"           "Area Code"        "Area"            
 [5] "Element Code"     "Element"          "Item Code"        "Item"            
 [9] "Year Code"        "Year"             "Unit"             "Value"           
[13] "Flag"             "Flag Description"

Code

spec(birds_from_csv)

cols(
  `Domain Code` = col_character(),
  Domain = col_character(),
  `Area Code` = col_double(),
  Area = col_character(),
  `Element Code` = col_double(),
  Element = col_character(),
  `Item Code` = col_double(),
  Item = col_character(),
  `Year Code` = col_double(),
  Year = col_double(),
  Unit = col_character(),
  Value = col_double(),
  Flag = col_character(),
  `Flag Description` = col_character()
)
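The dimensions, column names, and column types can also be inspected in a single call with glimpse. The sketch below assumes a small hand-built tibble, `toy_birds`, with made-up values standing in for the real data.

```r
library(dplyr)

# Hypothetical stand-in for birds_from_csv; the values are illustrative only.
toy_birds <- tibble(
  Area  = c("Afghanistan", "Afghanistan", "Albania"),
  Item  = c("Chickens", "Chickens", "Ducks"),
  Year  = c(1961, 1962, 1961),
  Value = c(10, 11, 5)
)

# Prints the row and column counts, then one line per column
# showing its name, its type, and its first few values.
glimpse(toy_birds)
```

Unlike spec, which reports what read_csv decided at load time, glimpse works on any data frame, including one built or modified in memory.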

To get a better sense of what the data in these columns looks like, we can print the first 6 rows using the head function.

Code

head(birds_from_csv)

# A tibble: 6 × 14
  `Domain Code` Domain      `Area Code` Area  `Element Code` Element `Item Code`
  <chr>         <chr>             <dbl> <chr>          <dbl> <chr>         <dbl>
1 QA            Live Anima…           2 Afgh…           5112 Stocks         1057
2 QA            Live Anima…           2 Afgh…           5112 Stocks         1057
3 QA            Live Anima…           2 Afgh…           5112 Stocks         1057
4 QA            Live Anima…           2 Afgh…           5112 Stocks         1057
5 QA            Live Anima…           2 Afgh…           5112 Stocks         1057
6 QA            Live Anima…           2 Afgh…           5112 Stocks         1057
# ℹ 7 more variables: Item <chr>, `Year Code` <dbl>, Year <dbl>, Unit <chr>,
#   Value <dbl>, Flag <chr>, `Flag Description` <chr>

We can see that each of the first 6 rows has data for the Area Afghanistan. Using the distinct and select functions, let's see the full list of Areas in the data.

Code

distinct(select(birds_from_csv, "Area"))

# A tibble: 248 × 1
   Area               
   <chr>              
 1 Afghanistan        
 2 Albania            
 3 Algeria            
 4 American Samoa     
 5 Angola             
 6 Antigua and Barbuda
 7 Argentina          
 8 Armenia            
 9 Aruba              
10 Australia          
# ℹ 238 more rows
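The same kind of query can be written with the pipe, passing the column name straight to distinct so that no separate select step is needed. This is a sketch on a made-up tibble (`toy_birds` is an assumption, not the real data).

```r
library(dplyr)

# Hypothetical stand-in for birds_from_csv; the values are illustrative only.
toy_birds <- tibble(
  Area = c("Afghanistan", "Afghanistan", "Albania"),
  Item = c("Chickens", "Chickens", "Ducks")
)

# distinct(Area) keeps only the Area column and drops repeated values.
toy_areas <- toy_birds %>%
  distinct(Area)

toy_areas
nrow(toy_areas)  # how many distinct areas there are
```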

The list of Areas is extensive (248 distinct values), which suggests that this data was collected from all around the world. Let's run a few more select statements to get a better understanding of the data.

Code

distinct(select(birds_from_csv, "Item"))

# A tibble: 5 × 1
  Item                  
  <chr>                 
1 Chickens              
2 Ducks                 
3 Geese and guinea fowls
4 Turkeys               
5 Pigeons, other birds

Code

distinct(select(birds_from_csv, "Year"))

# A tibble: 58 × 1
    Year
   <dbl>
 1  1961
 2  1962
 3  1963
 4  1964
 5  1965
 6  1966
 7  1967
 8  1968
 9  1969
10  1970
# ℹ 48 more rows

Code

distinct(select(birds_from_csv, "Element"))

# A tibble: 1 × 1
  Element
  <chr>  
1 Stocks

Code

distinct(select(birds_from_csv, "Unit"))

# A tibble: 1 × 1
  Unit     
  <chr>    
1 1000 Head
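The separate distinct calls above can be condensed into one summary that counts the distinct values in every column at once. The sketch below uses summarise with across on a made-up tibble; `toy_birds` and `toy_counts` are illustrative names.

```r
library(dplyr)

# Hypothetical stand-in for birds_from_csv; the values are illustrative only.
toy_birds <- tibble(
  Area    = c("Afghanistan", "Afghanistan", "Albania"),
  Item    = c("Chickens", "Chickens", "Ducks"),
  Year    = c(1961, 1962, 1961),
  Element = c("Stocks", "Stocks", "Stocks"),
  Unit    = c("1000 Head", "1000 Head", "1000 Head")
)

# One row, one column per variable, each holding that variable's
# number of distinct values.
toy_counts <- toy_birds %>%
  summarise(across(everything(), n_distinct))

toy_counts
```

A count of 1 flags a constant column, which is how Element and Unit show up in the outputs above.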

From these results we can see that the data set records “Stocks” of birds, measured in units of “1000 Head”, for chickens, ducks, geese and guinea fowls, turkeys, and pigeons and other birds, in areas all around the world from 1961 to 2018. Each record contains data specific to a bird type, area, and year.
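One way to check the claim that each record is specific to one bird type, area, and year is to count the rows per combination and look for any combination that occurs more than once. The sketch below does this on a made-up tibble; `toy_birds` and `duplicates` are illustrative names, and the result on the real data is not shown here.

```r
library(dplyr)

# Hypothetical stand-in for birds_from_csv; the values are illustrative only.
toy_birds <- tibble(
  Area  = c("Afghanistan", "Afghanistan", "Albania"),
  Item  = c("Chickens", "Chickens", "Ducks"),
  Year  = c(1961, 1962, 1961),
  Value = c(10, 11, 5)
)

# count() adds a column n with the number of rows per combination;
# any row with n > 1 would be a repeated Area/Item/Year combination.
duplicates <- toy_birds %>%
  count(Area, Item, Year) %>%
  filter(n > 1)

nrow(duplicates)       # 0 means every combination is unique
range(toy_birds$Year)  # first and last year covered
```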