DACSS 601: Data Science Fundamentals - FALL 2022

Challenge 1

challenge_1
railroads
faostat
wildbirds
Author

Mariia Dubyk

Published

September 22, 2022

Code
library(tidyverse)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Read in the Data

Code
# readr is already attached by library(tidyverse); loading it again is harmless
library(readr)
birds <- read_csv("_data/birds.csv")

Describe the data

The data set records the number of birds of particular types in different geographic areas (countries, continents, and the world as a whole) for each year from 1961 to 2018. It has 30,977 rows and 14 columns. The cases are not obvious at first glance; each row appears to be one observation of one bird type (“Item”) in one area (“Area”) in one year (“Year”). The “Value” column gives the number of birds, measured in the unit stated in “Unit”. With this data set we can see how bird numbers changed over time in a given area, compare different bird types within one area, and so on. The information was probably gathered from farmers or farming businesses in each region.
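One way to check what a case is in a table like this is to test whether a combination of identifying columns ever repeats. The sketch below is only an illustration under stated assumptions: `toy` is a hypothetical stand-in with made-up values, not the real `birds` data, and treating Area, Item and Year as the identifying columns is a guess based on the printed preview.

```r
library(dplyr)
library(tibble)

# Hypothetical toy rows shaped like the birds data (all values are made up)
toy <- tribble(
  ~Area,    ~Item,    ~Year, ~Value,
  "Area A", "Item X", 1961,  10,
  "Area A", "Item X", 1962,  12,
  "Area A", "Item Y", 1961,   3,
  "Area B", "Item X", 1961,   7
)

# If Area + Item + Year identifies a case, no combination should occur twice
duplicates <- toy %>%
  count(Area, Item, Year) %>%
  filter(n > 1)

# Zero rows here means every row is a distinct Area-Item-Year case
nrow(duplicates)

# Number of distinct values taken by each identifying column
toy %>%
  summarise(across(c(Area, Item, Year), n_distinct))
```

Running the same two steps on `birds` would show whether this guess about the cases holds for the full data.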

Code
birds
# A tibble: 30,977 × 14
   Domain Cod…¹ Domain Area …² Area  Eleme…³ Element Item …⁴ Item  Year …⁵  Year
   <chr>        <chr>    <dbl> <chr>   <dbl> <chr>     <dbl> <chr>   <dbl> <dbl>
 1 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1961  1961
 2 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1962  1962
 3 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1963  1963
 4 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1964  1964
 5 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1965  1965
 6 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1966  1966
 7 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1967  1967
 8 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1968  1968
 9 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1969  1969
10 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1970  1970
# … with 30,967 more rows, 4 more variables: Unit <chr>, Value <dbl>,
#   Flag <chr>, `Flag Description` <chr>, and abbreviated variable names
#   ¹​`Domain Code`, ²​`Area Code`, ³​`Element Code`, ⁴​`Item Code`, ⁵​`Year Code`
Code
# read_csv() keeps the space in the column name: it is "Flag Description", not "Flag.Description"
subset(birds, select = c("Domain", "Area", "Element", "Item", "Year", "Unit", "Value", "Flag", "Flag Description"))
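Column names that contain a space, such as `Flag Description`, need some care. `read_csv()` keeps the name exactly as it appears in the file, whereas base `read.csv()` converts it to `Flag.Description` by default. The sketch below illustrates the difference on a hypothetical two-row CSV string (`csv_text` and its contents are made up for the example) and shows two ways to work with such names.

```r
library(readr)
library(dplyr)

# Hypothetical CSV text whose header contains a space (contents are made up)
csv_text <- "Area,Flag,Flag Description\nArea A,F,Example text\nArea B,F,Example text\n"

# I() tells read_csv() to treat the string as literal data, not a file path
toy <- read_csv(I(csv_text), show_col_types = FALSE)
names(toy)  # the space in the last name is preserved

# Option 1: refer to the column with backticks
picked <- toy %>%
  select(Area, Flag, `Flag Description`)

# Option 2: rename every column so that spaces become underscores
renamed <- toy %>%
  rename_with(~ gsub(" ", "_", .x))

# Base R for comparison: check.names = TRUE (the default) turns the space into a dot
base_toy <- read.csv(text = csv_text)
names(base_toy)
```

Calling `names(birds)` right after reading the file is a quick way to see which spelling applies before selecting columns.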
Code
# Keep the descriptive columns and drop the numeric code columns
birds2 <- birds[, c("Domain", "Area", "Element", "Item", "Year", "Unit", "Value", "Flag", "Flag Description")]
Code
birds2
Code
summary(birds2)
Code
str(birds2)
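`summary()` and `str()` describe the whole table at once. The comparisons mentioned in the description (one area over time, several bird types within one area) call for a grouped summary instead. A minimal sketch, again using hypothetical toy rows with made-up values in place of the real data:

```r
library(dplyr)
library(tibble)

# Hypothetical toy rows shaped like the birds data (all values are made up)
toy <- tribble(
  ~Area,    ~Item,    ~Year, ~Value,
  "Area A", "Item X", 1961,  10,
  "Area A", "Item X", 1962,  12,
  "Area A", "Item Y", 1961,   3,
  "Area A", "Item Y", 1962,   5,
  "Area B", "Item X", 1961,   7
)

# Compact overview of column types, similar in spirit to str()
glimpse(toy)

# Compare items within a single area: year range and mean Value per item
by_item <- toy %>%
  filter(Area == "Area A") %>%
  group_by(Item) %>%
  summarise(
    first_year = min(Year),
    last_year  = max(Year),
    mean_value = mean(Value, na.rm = TRUE),
    .groups    = "drop"
  )

by_item
```

Replacing the filter with a different area, or grouping by `Area` instead of `Item`, gives the other comparisons described above.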