Reading in the Data (Homework 2): Birds

Practice getting the dataset into RStudio and beginning to understand it.

Eliza Geeslin
09-26-2021

Before we begin…

knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)

Read in the Data

The first step is to load the dataset into R.

setwd("../../_data") # set working directory
birds <- read_csv("birds.csv") # assign dataset to variable
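A possible alternative, sketched under assumptions: in R Markdown, knitr resets the working directory at the end of each chunk, so `setwd()` only helps while the read happens in the same chunk. Passing a full path to `read_csv()` avoids changing the directory at all. The file name `birds_toy.csv` and its contents below are hypothetical stand-ins so the example runs on its own; they are not the real dataset.

```r
library(readr)

# Hypothetical stand-in for the real file: a tiny CSV written to a temp folder.
# All values are made up for illustration.
data_dir <- tempdir()
toy_path <- file.path(data_dir, "birds_toy.csv")
write_csv(
  data.frame(Area = c("A", "B"), Year = c(2000, 2001), Value = c(10, 20)),
  toy_path
)

# Read by path; the working directory is never changed
birds_toy <- read_csv(toy_path, show_col_types = FALSE)
dim(birds_toy)
```

For the real data, the same idea would be `read_csv("../../_data/birds.csv")`, assuming that relative path is correct from the folder the post is knitted in.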

First, what are the dimensions and column names?

dim(birds) # dim() returns dimensions of dataset
[1] 30977    14
colnames(birds) # colnames() returns column names
 [1] "Domain Code"      "Domain"           "Area Code"       
 [4] "Area"             "Element Code"     "Element"         
 [7] "Item Code"        "Item"             "Year Code"       
[10] "Year"             "Unit"             "Value"           
[13] "Flag"             "Flag Description"
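Several of these names contain spaces, so they have to be wrapped in backticks whenever they are used in code. One option, shown here only as a sketch on a made-up two-column tibble (`toy` and its values are hypothetical), is to convert the names to snake_case with `rename_with()`.

```r
library(dplyr)
library(tibble)

# Toy tibble that reuses two of the real column names; the values are made up
toy <- tibble(`Area Code` = 2, `Flag Description` = "example")

# Lower-case every name and swap spaces for underscores
clean <- rename_with(toy, ~ tolower(gsub(" ", "_", .x)))
colnames(clean)
```

Whether renaming is worth it depends on how often the columns are referenced later; keeping the original names and using backticks works too.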

Okay, and what does this data look like?

head(birds) # head() shows the first few rows of data
# A tibble: 6 x 14
  `Domain Code` Domain       `Area Code` Area   `Element Code` Element
  <chr>         <chr>              <dbl> <chr>           <dbl> <chr>  
1 QA            Live Animals           2 Afgha~           5112 Stocks 
2 QA            Live Animals           2 Afgha~           5112 Stocks 
3 QA            Live Animals           2 Afgha~           5112 Stocks 
4 QA            Live Animals           2 Afgha~           5112 Stocks 
5 QA            Live Animals           2 Afgha~           5112 Stocks 
6 QA            Live Animals           2 Afgha~           5112 Stocks 
# ... with 8 more variables: Item Code <dbl>, Item <chr>,
#   Year Code <dbl>, Year <dbl>, Unit <chr>, Value <dbl>, Flag <chr>,
#   Flag Description <chr>
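Because `head()` only prints the columns that fit the console width, eight of the fourteen variables are pushed into the footer above. As a sketch, `glimpse()` from dplyr prints one line per column instead, so nothing gets cut off. The tibble below is a hypothetical one-row stand-in built from values visible in the output above, not the full dataset.

```r
library(dplyr)
library(tibble)

# One-row stand-in using the columns and values visible in the head() output
toy <- tibble(
  `Domain Code` = "QA",
  Domain = "Live Animals",
  `Area Code` = 2,
  `Element Code` = 5112,
  Element = "Stocks"
)

# One line per column: name, type, and the first few values
glimpse(toy)
```

On the real data this would simply be `glimpse(birds)`.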

Overall, the data doesn’t look especially messy, just very big: 30,977 rows across 14 columns.

Conclusion

The next step is to wrangle the data. Because the dataset is so large, it will be key to work out which parts of it (which columns, for example) we actually need for the analysis we want to do!
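As a rough sketch of what that wrangling could look like, the dplyr verbs `select()` and `filter()` keep only the columns and rows of interest. The column choice here is an assumption, and every value in `toy` (including the item names) is made up for illustration rather than taken from the dataset.

```r
library(dplyr)
library(tibble)

# Toy stand-in for birds; every value here is made up for illustration
toy <- tibble(
  Area = c("A", "A", "B"),
  Item = c("Chickens", "Ducks", "Chickens"),
  Year = c(2000, 2000, 2001),
  Value = c(10, 5, 20),
  Flag = c("F", "F", NA)
)

# Keep only the columns the analysis might need, then narrow to one item
chickens <- toy %>%
  select(Area, Item, Year, Value) %>%
  filter(Item == "Chickens")

chickens
```

Which columns and rows to keep depends on the question being asked, so this is a pattern to adapt rather than a fixed recipe.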

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Geeslin (2021, Sept. 26). DACSS 601 Fall 2021: Reading in the Data (Homework 2): Birds. Retrieved from https://mrolfe.github.io/DACSS601Fall21/posts/2021-09-26-geeslin-hw-2-read-in-data/

BibTeX citation

@misc{geeslin2021reading,
  author = {Geeslin, Eliza},
  title = {DACSS 601 Fall 2021: Reading in the Data (Homework 2): Birds},
  url = {https://mrolfe.github.io/DACSS601Fall21/posts/2021-09-26-geeslin-hw-2-read-in-data/},
  year = {2021}
}