library(tidyverse)
library(dplyr)
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
Noah Dixon
June 2, 2023
Setup
Using the read_csv function we can load the birds.csv data from the file.
Using the dim function we can see the dimensions of the data.
We can see that there are 14 columns and 30977 rows in the data set. Now, using the colnames and spec functions, we can see the names and data types of each of the 14 columns.
[1] "Domain Code" "Domain" "Area Code" "Area"
[5] "Element Code" "Element" "Item Code" "Item"
[9] "Year Code" "Year" "Unit" "Value"
[13] "Flag" "Flag Description"
cols(
`Domain Code` = col_character(),
Domain = col_character(),
`Area Code` = col_double(),
Area = col_character(),
`Element Code` = col_double(),
Element = col_character(),
`Item Code` = col_double(),
Item = col_character(),
`Year Code` = col_double(),
Year = col_double(),
Unit = col_character(),
Value = col_double(),
Flag = col_character(),
`Flag Description` = col_character()
)
In order to get a better sense of what the data in these columns looks like, we can print the first 6 rows of the data using the head function.
# A tibble: 6 × 14
`Domain Code` Domain `Area Code` Area `Element Code` Element `Item Code`
<chr> <chr> <dbl> <chr> <dbl> <chr> <dbl>
1 QA Live Anima… 2 Afgh… 5112 Stocks 1057
2 QA Live Anima… 2 Afgh… 5112 Stocks 1057
3 QA Live Anima… 2 Afgh… 5112 Stocks 1057
4 QA Live Anima… 2 Afgh… 5112 Stocks 1057
5 QA Live Anima… 2 Afgh… 5112 Stocks 1057
6 QA Live Anima… 2 Afgh… 5112 Stocks 1057
# ℹ 7 more variables: Item <chr>, `Year Code` <dbl>, Year <dbl>, Unit <chr>,
# Value <dbl>, Flag <chr>, `Flag Description` <chr>
We can see that each of the first 6 rows has data from the Area Afghanistan. Using the distinct and select functions, let's see a full list of all the Areas in this data.
# A tibble: 248 × 1
Area
<chr>
1 Afghanistan
2 Albania
3 Algeria
4 American Samoa
5 Angola
6 Antigua and Barbuda
7 Argentina
8 Armenia
9 Aruba
10 Australia
# ℹ 238 more rows
We can see that the full list of Areas is extensive, and we can infer that this data was collected from all around the world. Let's run the same distinct and select calls on a few more columns to get a better understanding of the data.
# A tibble: 5 × 1
Item
<chr>
1 Chickens
2 Ducks
3 Geese and guinea fowls
4 Turkeys
5 Pigeons, other birds
# A tibble: 58 × 1
Year
<dbl>
1 1961
2 1962
3 1963
4 1964
5 1965
6 1966
7 1967
8 1968
9 1969
10 1970
# ℹ 48 more rows
# A tibble: 1 × 1
Element
<chr>
1 Stocks
# A tibble: 1 × 1
Unit
<chr>
1 1000 Head
From these results we can see that the data set contains the number of “Stocks” of birds in “1000 Head” units for chickens, ducks, geese & guinea fowls, turkeys, and pigeons & other birds for areas all around the world from 1961-2018. Each record contains data specific to a bird type, area, and year.
---
title: "Challenge 1"
author: "Noah Dixon"
description: "Reading in data and creating a post"
date: "6/02/2023"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_1
  - railroads
  - faostat
  - wildbirds
---
## Setup
```{r}
#| label: setup
#| warning: false
#| message: false
library(tidyverse)
library(dplyr)
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```
## Part 1: Read in the Data
Using the read_csv function we can load the birds.csv data from the file.
```{r}
#| label: read-data
birds_from_csv <- read_csv("_data/birds.csv")
```
## Part 2: Describe the data
Using the dim function we can see the dimensions of the data.
```{r}
#| label: dimension-of-data
dim(birds_from_csv)
```
We can see that there are 14 columns and 30977 rows in the data set. Now, using the colnames and spec functions, we can see the names and data types of each of the 14 columns.
```{r}
#| label: columns-of-data
colnames(birds_from_csv)
spec(birds_from_csv)
```
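The specification printed by `spec` can also be passed back to `read_csv` through its `col_types` argument, so that column types are fixed rather than guessed on each read. The chunk below is a minimal sketch under that assumption: the column names and types are copied from the `spec` output for this file, while the names `birds_col_types` and `birds_typed` are hypothetical and not part of the original analysis.

```{r}
#| label: explicit-col-types
library(readr)

# Column specification copied from the spec() output for birds.csv.
# `birds_col_types` is a hypothetical name used only for this sketch.
birds_col_types <- cols(
  `Domain Code` = col_character(),
  Domain = col_character(),
  `Area Code` = col_double(),
  Area = col_character(),
  `Element Code` = col_double(),
  Element = col_character(),
  `Item Code` = col_double(),
  Item = col_character(),
  `Year Code` = col_double(),
  Year = col_double(),
  Unit = col_character(),
  Value = col_double(),
  Flag = col_character(),
  `Flag Description` = col_character()
)

# Read the file only if it is present, so the chunk also runs on its own.
if (file.exists("_data/birds.csv")) {
  birds_typed <- read_csv("_data/birds.csv", col_types = birds_col_types)
}
```

With an explicit specification, a column whose contents no longer parse as the declared type shows up as a parsing problem instead of silently changing type.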
In order to get a better sense of what the data in these columns looks like, we can print the first 6 rows of the data using the head function.
```{r}
#| label: head-of-data
head(birds_from_csv)
```
We can see that each of the first 6 rows has data from the Area Afghanistan. Using the distinct and select functions, let's see a full list of all the Areas in this data.
```{r}
#| label: select-area
distinct(select(birds_from_csv, "Area"))
```
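Printing a tibble shows only its first ten rows, so most of the Areas stay hidden in the output of the chunk above. One way to see them all is to pull the distinct values into a plain character vector. This is a minimal sketch: `distinct_values` is a hypothetical helper, not something the original analysis defines, and the guard only exists so the chunk also runs when `birds_from_csv` has not been loaded.

```{r}
#| label: list-all-areas
library(dplyr)

# Hypothetical helper: sorted vector of the distinct values in one column.
distinct_values <- function(data, col) {
  data %>%
    distinct(across(all_of(col))) %>%  # keep one row per distinct value
    pull(1) %>%                        # extract the single remaining column
    sort()
}

if (exists("birds_from_csv")) {
  areas <- distinct_values(birds_from_csv, "Area")
  print(length(areas))  # number of distinct Areas
  print(areas)          # the full list, not truncated to ten rows
}
```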
We can see that the full list of Areas is extensive, and we can infer that this data was collected from all around the world. Let's run the same distinct and select calls on a few more columns to get a better understanding of the data.
```{r}
#| label: select-item
distinct(select(birds_from_csv, "Item"))
distinct(select(birds_from_csv, "Year"))
distinct(select(birds_from_csv, "Element"))
distinct(select(birds_from_csv, "Unit"))
```
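The four calls above can also be condensed into a single summary that counts the distinct values in each column. The sketch below assumes `birds_from_csv` has been read in as above. `count_distinct` is a hypothetical helper introduced only for illustration. Each count should agree with the number of rows in the corresponding tibble printed by the `distinct` calls.

```{r}
#| label: count-distinct-values
library(dplyr)

# Hypothetical helper: number of distinct values in each of the chosen columns.
count_distinct <- function(data, cols) {
  data %>%
    summarise(across(all_of(cols), n_distinct))
}

if (exists("birds_from_csv")) {
  print(count_distinct(birds_from_csv, c("Area", "Item", "Year", "Element", "Unit")))
}
```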
From these results we can see that the data set contains the number of "Stocks" of birds in "1000 Head" units for chickens, ducks, geese & guinea fowls, turkeys, and pigeons & other birds for areas all around the world from 1961-2018. Each record contains data specific to a bird type, area, and year.
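The reading that each record is specific to one bird type, area, and year can be checked directly by counting how often each Area, Item, and Year combination occurs. This is a sketch rather than part of the original analysis: `duplicated_keys` is a hypothetical helper, and an empty result would support the one-record-per-combination interpretation, while any returned rows would point to combinations that occur more than once.

```{r}
#| label: check-record-key
library(dplyr)

# Hypothetical helper: Area/Item/Year combinations that appear in more than one row.
duplicated_keys <- function(data) {
  data %>%
    count(Area, Item, Year, name = "n_rows") %>%
    filter(n_rows > 1)
}

if (exists("birds_from_csv")) {
  print(duplicated_keys(birds_from_csv))  # zero rows means no repeated combinations
  print(range(birds_from_csv$Year))       # first and last year covered
}
```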