DACSS 601: Data Science Fundamentals - FALL 2022

Challenge 1

Author

Janhvi Joshi

Published

October 20, 2022

Code
library(tidyverse)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Challenge Overview

Today’s challenge is to

  1. read in a dataset, and

  2. describe the dataset using both words and any supporting information (e.g., tables)

Read in the Data

Read in one (or more) of the following data sets, using the correct R package and command.

  • railroad_2012_clean_county.csv ⭐
  • birds.csv ⭐⭐
  • FAOstat*.csv ⭐⭐
  • wild_bird_data.xlsx ⭐⭐⭐
  • StateCounty2012.xls ⭐⭐⭐⭐

Find the _data folder, located inside the posts folder. Then you can read in the data, using either one of the readr standard tidy read commands, or a specialized package such as readxl.

Code
bird <- read_csv('_data/birds.csv')
bird
# A tibble: 30,977 × 14
   Domain Cod…¹ Domain Area …² Area  Eleme…³ Element Item …⁴ Item  Year …⁵  Year
   <chr>        <chr>    <dbl> <chr>   <dbl> <chr>     <dbl> <chr>   <dbl> <dbl>
 1 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1961  1961
 2 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1962  1962
 3 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1963  1963
 4 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1964  1964
 5 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1965  1965
 6 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1966  1966
 7 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1967  1967
 8 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1968  1968
 9 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1969  1969
10 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1970  1970
# … with 30,967 more rows, 4 more variables: Unit <chr>, Value <dbl>,
#   Flag <chr>, `Flag Description` <chr>, and abbreviated variable names
#   ¹​`Domain Code`, ²​`Area Code`, ³​`Element Code`, ⁴​`Item Code`, ⁵​`Year Code`

Add any comments or documentation as needed. More challenging data sets may require additional code chunks and documentation.
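
As a minimal sketch of the read-in step, the example below writes a hypothetical miniature of `birds.csv` to a temporary file and reads it back with `readr::read_csv()`. The file contents and the name `bird_demo` are assumptions made so the example runs without the course's `_data` folder. The values are illustrative and are not taken from the real dataset.

```r
library(readr)

# Hypothetical miniature of birds.csv (illustrative values only), written to a
# temporary file so the example does not depend on the _data folder.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Domain Code,Domain,Area Code,Area,Year,Value",
  "QA,Live Animals,2,Afghanistan,1961,10",
  "QA,Live Animals,2,Afghanistan,1962,20"
), tmp)

# read_csv() already returns a tibble, so a separate as_tibble() call is not needed.
bird_demo <- read_csv(tmp, show_col_types = FALSE)

# Column names that contain spaces have to be quoted with backticks.
bird_demo$`Area Code`

# problems() lists any cells readr could not parse as the guessed column type.
problems(bird_demo)
```

Checking `problems()` right after reading is a cheap way to spot parsing issues before describing the data.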

Describe the data

Description: The birds dataset contains 30,977 rows and 14 columns recording stocks of live poultry in 248 areas. These areas include individual countries and territories such as Afghanistan, Albania, Jamaica and Micronesia, as well as aggregate regions such as World, Africa and Europe. Every row belongs to the domain “Live Animals” and the element “Stocks”. Each row falls into one of five item categories: “Chickens”, “Ducks”, “Geese and guinea fowls”, “Turkeys” and “Pigeons, other birds”. The dataset spans the years 1961 to 2018. According to the flag descriptions, the values come from a mix of sources: official data, FAO estimates, FAO data based on imputation methodology, unofficial figures, and aggregates that may combine official, semi-official, estimated or calculated data. Some rows are flagged “Data not available”, and 1,036 values are missing. This dataset is likely useful for analysing when, where and how many livestock birds of each type were kept.
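
The figures quoted in the description (row count, number of areas and items, year range, missing values) can all be computed in one `summarise()` call. The sketch below does this on a hypothetical four-row tibble called `bird_demo`. Its values are made up for illustration and do not come from the real file.

```r
library(dplyr)

# Hypothetical toy data with the same column names as the birds dataset.
bird_demo <- tibble(
  Area  = c("Afghanistan", "Afghanistan", "Albania", "Albania"),
  Item  = c("Chickens", "Ducks", "Chickens", "Chickens"),
  Year  = c(1961, 1961, 1961, 2018),
  Value = c(10, NA, 30, 40)
)

# One-row overview: size, distinct areas and items, year range, missing values.
overview <- bird_demo %>%
  summarise(
    n_rows         = n(),
    n_areas        = n_distinct(Area),
    n_items        = n_distinct(Item),
    first_year     = min(Year),
    last_year      = max(Year),
    missing_values = sum(is.na(Value))
  )

overview
```

Running the same pipeline on the full `bird` tibble should reproduce the numbers stated in the description, assuming the original column names are still in place.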

Code
nrow(bird)
[1] 30977
Code
colnames(bird)[4] <- c("area")
unique_areas <- unique(bird$area)
unique_areas
  [1] "Afghanistan"                                         
  [2] "Albania"                                             
  [3] "Algeria"                                             
  [4] "American Samoa"                                      
  [5] "Angola"                                              
  [6] "Antigua and Barbuda"                                 
  [7] "Argentina"                                           
  [8] "Armenia"                                             
  [9] "Aruba"                                               
 [10] "Australia"                                           
 [11] "Austria"                                             
 [12] "Azerbaijan"                                          
 [13] "Bahamas"                                             
 [14] "Bahrain"                                             
 [15] "Bangladesh"                                          
 [16] "Barbados"                                            
 [17] "Belarus"                                             
 [18] "Belgium"                                             
 [19] "Belgium-Luxembourg"                                  
 [20] "Belize"                                              
 [21] "Benin"                                               
 [22] "Bermuda"                                             
 [23] "Bhutan"                                              
 [24] "Bolivia (Plurinational State of)"                    
 [25] "Bosnia and Herzegovina"                              
 [26] "Botswana"                                            
 [27] "Brazil"                                              
 [28] "Brunei Darussalam"                                   
 [29] "Bulgaria"                                            
 [30] "Burkina Faso"                                        
 [31] "Burundi"                                             
 [32] "Cabo Verde"                                          
 [33] "Cambodia"                                            
 [34] "Cameroon"                                            
 [35] "Canada"                                              
 [36] "Cayman Islands"                                      
 [37] "Central African Republic"                            
 [38] "Chad"                                                
 [39] "Chile"                                               
 [40] "China, Hong Kong SAR"                                
 [41] "China, Macao SAR"                                    
 [42] "China, mainland"                                     
 [43] "China, Taiwan Province of"                           
 [44] "Colombia"                                            
 [45] "Comoros"                                             
 [46] "Congo"                                               
 [47] "Cook Islands"                                        
 [48] "Costa Rica"                                          
 [49] "Côte d'Ivoire"                                       
 [50] "Croatia"                                             
 [51] "Cuba"                                                
 [52] "Cyprus"                                              
 [53] "Czechia"                                             
 [54] "Czechoslovakia"                                      
 [55] "Democratic People's Republic of Korea"               
 [56] "Democratic Republic of the Congo"                    
 [57] "Denmark"                                             
 [58] "Dominica"                                            
 [59] "Dominican Republic"                                  
 [60] "Ecuador"                                             
 [61] "Egypt"                                               
 [62] "El Salvador"                                         
 [63] "Equatorial Guinea"                                   
 [64] "Eritrea"                                             
 [65] "Estonia"                                             
 [66] "Eswatini"                                            
 [67] "Ethiopia"                                            
 [68] "Ethiopia PDR"                                        
 [69] "Falkland Islands (Malvinas)"                         
 [70] "Fiji"                                                
 [71] "Finland"                                             
 [72] "France"                                              
 [73] "French Guyana"                                       
 [74] "French Polynesia"                                    
 [75] "Gabon"                                               
 [76] "Gambia"                                              
 [77] "Georgia"                                             
 [78] "Germany"                                             
 [79] "Ghana"                                               
 [80] "Greece"                                              
 [81] "Grenada"                                             
 [82] "Guadeloupe"                                          
 [83] "Guam"                                                
 [84] "Guatemala"                                           
 [85] "Guinea"                                              
 [86] "Guinea-Bissau"                                       
 [87] "Guyana"                                              
 [88] "Haiti"                                               
 [89] "Honduras"                                            
 [90] "Hungary"                                             
 [91] "Iceland"                                             
 [92] "India"                                               
 [93] "Indonesia"                                           
 [94] "Iran (Islamic Republic of)"                          
 [95] "Iraq"                                                
 [96] "Ireland"                                             
 [97] "Israel"                                              
 [98] "Italy"                                               
 [99] "Jamaica"                                             
[100] "Japan"                                               
[101] "Jordan"                                              
[102] "Kazakhstan"                                          
[103] "Kenya"                                               
[104] "Kiribati"                                            
[105] "Kuwait"                                              
[106] "Kyrgyzstan"                                          
[107] "Lao People's Democratic Republic"                    
[108] "Latvia"                                              
[109] "Lebanon"                                             
[110] "Lesotho"                                             
[111] "Liberia"                                             
[112] "Libya"                                               
[113] "Liechtenstein"                                       
[114] "Lithuania"                                           
[115] "Luxembourg"                                          
[116] "Madagascar"                                          
[117] "Malawi"                                              
[118] "Malaysia"                                            
[119] "Mali"                                                
[120] "Malta"                                               
[121] "Martinique"                                          
[122] "Mauritania"                                          
[123] "Mauritius"                                           
[124] "Mexico"                                              
[125] "Micronesia (Federated States of)"                    
[126] "Mongolia"                                            
[127] "Montenegro"                                          
[128] "Montserrat"                                          
[129] "Morocco"                                             
[130] "Mozambique"                                          
[131] "Myanmar"                                             
[132] "Namibia"                                             
[133] "Nauru"                                               
[134] "Nepal"                                               
[135] "Netherlands"                                         
[136] "Netherlands Antilles (former)"                       
[137] "New Caledonia"                                       
[138] "New Zealand"                                         
[139] "Nicaragua"                                           
[140] "Niger"                                               
[141] "Nigeria"                                             
[142] "Niue"                                                
[143] "North Macedonia"                                     
[144] "Norway"                                              
[145] "Oman"                                                
[146] "Pacific Islands Trust Territory"                     
[147] "Pakistan"                                            
[148] "Palestine"                                           
[149] "Panama"                                              
[150] "Papua New Guinea"                                    
[151] "Paraguay"                                            
[152] "Peru"                                                
[153] "Philippines"                                         
[154] "Poland"                                              
[155] "Portugal"                                            
[156] "Puerto Rico"                                         
[157] "Qatar"                                               
[158] "Republic of Korea"                                   
[159] "Republic of Moldova"                                 
[160] "Réunion"                                             
[161] "Romania"                                             
[162] "Russian Federation"                                  
[163] "Rwanda"                                              
[164] "Saint Helena, Ascension and Tristan da Cunha"        
[165] "Saint Kitts and Nevis"                               
[166] "Saint Lucia"                                         
[167] "Saint Pierre and Miquelon"                           
[168] "Saint Vincent and the Grenadines"                    
[169] "Samoa"                                               
[170] "Sao Tome and Principe"                               
[171] "Saudi Arabia"                                        
[172] "Senegal"                                             
[173] "Serbia"                                              
[174] "Serbia and Montenegro"                               
[175] "Seychelles"                                          
[176] "Sierra Leone"                                        
[177] "Singapore"                                           
[178] "Slovakia"                                            
[179] "Slovenia"                                            
[180] "Solomon Islands"                                     
[181] "Somalia"                                             
[182] "South Africa"                                        
[183] "South Sudan"                                         
[184] "Spain"                                               
[185] "Sri Lanka"                                           
[186] "Sudan"                                               
[187] "Sudan (former)"                                      
[188] "Suriname"                                            
[189] "Sweden"                                              
[190] "Switzerland"                                         
[191] "Syrian Arab Republic"                                
[192] "Tajikistan"                                          
[193] "Thailand"                                            
[194] "Timor-Leste"                                         
[195] "Togo"                                                
[196] "Tokelau"                                             
[197] "Tonga"                                               
[198] "Trinidad and Tobago"                                 
[199] "Tunisia"                                             
[200] "Turkey"                                              
[201] "Turkmenistan"                                        
[202] "Tuvalu"                                              
[203] "Uganda"                                              
[204] "Ukraine"                                             
[205] "United Arab Emirates"                                
[206] "United Kingdom of Great Britain and Northern Ireland"
[207] "United Republic of Tanzania"                         
[208] "United States of America"                            
[209] "United States Virgin Islands"                        
[210] "Uruguay"                                             
[211] "USSR"                                                
[212] "Uzbekistan"                                          
[213] "Vanuatu"                                             
[214] "Venezuela (Bolivarian Republic of)"                  
[215] "Viet Nam"                                            
[216] "Wallis and Futuna Islands"                           
[217] "Yemen"                                               
[218] "Yugoslav SFR"                                        
[219] "Zambia"                                              
[220] "Zimbabwe"                                            
[221] "World"                                               
[222] "Africa"                                              
[223] "Eastern Africa"                                      
[224] "Middle Africa"                                       
[225] "Northern Africa"                                     
[226] "Southern Africa"                                     
[227] "Western Africa"                                      
[228] "Americas"                                            
[229] "Northern America"                                    
[230] "Central America"                                     
[231] "Caribbean"                                           
[232] "South America"                                       
[233] "Asia"                                                
[234] "Central Asia"                                        
[235] "Eastern Asia"                                        
[236] "Southern Asia"                                       
[237] "South-eastern Asia"                                  
[238] "Western Asia"                                        
[239] "Europe"                                              
[240] "Eastern Europe"                                      
[241] "Northern Europe"                                     
[242] "Southern Europe"                                     
[243] "Western Europe"                                      
[244] "Oceania"                                             
[245] "Australia and New Zealand"                           
[246] "Melanesia"                                           
[247] "Micronesia"                                          
[248] "Polynesia"                                           
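
Entries 221 through 248 in the list above (World, Africa, Eastern Africa, and so on) are aggregate regions rather than individual countries, so summing `Value` across all areas would double count. The sketch below shows one way to separate the two groups. It assumes that aggregate regions carry an `Area Code` of 5000 or more. That is consistent with the maximum code of 5504 in the summary output, but the source does not confirm it, so the threshold should be checked against the data. Apart from Afghanistan's code of 2, the codes in the toy tibble are illustrative.

```r
library(dplyr)

# Hypothetical toy data; only Afghanistan's code (2) appears in the post's output.
bird_demo <- tibble(
  `Area Code` = c(2, 3, 5000, 5100),
  Area        = c("Afghanistan", "Albania", "World", "Africa")
)

# ASSUMPTION: codes of 5000 and above mark aggregate regions.
aggregate_threshold <- 5000

bird_demo <- bird_demo %>%
  mutate(is_aggregate = `Area Code` >= aggregate_threshold)

countries  <- bird_demo %>% filter(!is_aggregate)
aggregates <- bird_demo %>% filter(is_aggregate)

countries$Area
aggregates$Area
```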
Code
colnames(bird)[2] <- c("domain")
unique_domains <- unique(bird$domain)
unique_domains
[1] "Live Animals"
Code
colnames(bird)[6] <- c("element")
unique_element <- unique(bird$element)
unique_element
[1] "Stocks"
Code
colnames(bird)[8] <- c("item")
unique_items <- unique(bird$item)
unique_items
[1] "Chickens"               "Ducks"                  "Geese and guinea fowls"
[4] "Turkeys"                "Pigeons, other birds"  
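
The chunks above rename columns one at a time by position, which breaks silently if the column order ever changes. A name-based alternative is sketched below with `dplyr::rename_with()`, applied to a hypothetical four-column tibble. The snake_case naming scheme is an assumption for illustration, not something the post uses.

```r
library(dplyr)

# Hypothetical one-row tibble with the first four column names of the dataset.
bird_demo <- tibble(
  `Domain Code` = "QA",
  Domain        = "Live Animals",
  `Area Code`   = 2,
  Area          = "Afghanistan"
)

# Lower-case every name and replace spaces with underscores in one step.
bird_renamed <- bird_demo %>%
  rename_with(~ tolower(gsub(" ", "_", .x)))

names(bird_renamed)
# "domain_code" "domain" "area_code" "area"
```

With names like these, columns can be referenced as `bird_renamed$area_code` without backticks.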
Code
summary(bird)
 Domain Code           domain            Area Code        area          
 Length:30977       Length:30977       Min.   :   1   Length:30977      
 Class :character   Class :character   1st Qu.:  79   Class :character  
 Mode  :character   Mode  :character   Median : 156   Mode  :character  
                                       Mean   :1202                     
                                       3rd Qu.: 231                     
                                       Max.   :5504                     
                                                                        
  Element Code    element            Item Code        item          
 Min.   :5112   Length:30977       Min.   :1057   Length:30977      
 1st Qu.:5112   Class :character   1st Qu.:1057   Class :character  
 Median :5112   Mode  :character   Median :1068   Mode  :character  
 Mean   :5112                      Mean   :1066                     
 3rd Qu.:5112                      3rd Qu.:1072                     
 Max.   :5112                      Max.   :1083                     
                                                                    
   Year Code         Year          Unit               Value         
 Min.   :1961   Min.   :1961   Length:30977       Min.   :       0  
 1st Qu.:1976   1st Qu.:1976   Class :character   1st Qu.:     171  
 Median :1992   Median :1992   Mode  :character   Median :    1800  
 Mean   :1991   Mean   :1991                      Mean   :   99411  
 3rd Qu.:2005   3rd Qu.:2005                      3rd Qu.:   15404  
 Max.   :2018   Max.   :2018                      Max.   :23707134  
                                                  NA's   :1036      
     Flag           Flag Description  
 Length:30977       Length:30977      
 Class :character   Class :character  
 Mode  :character   Mode  :character  
                                      
                                      
                                      
                                      
Code
colnames(bird)[14] <- c("flag")
unique_flag <- unique(bird$flag)
unique_flag
[1] "FAO estimate"                                                                
[2] "Official data"                                                               
[3] "FAO data based on imputation methodology"                                    
[4] "Data not available"                                                          
[5] "Unofficial figure"                                                           
[6] "Aggregate, may include official, semi-official, estimated or calculated data"
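
Listing the distinct flag descriptions shows which sources exist but not how common each one is, or whether the 1,036 missing `Value` entries line up with the "Data not available" flag. The sketch below counts rows and missing values per flag on a hypothetical five-row tibble. The values are made up, and it keeps the original column name `Flag Description` rather than the renamed `flag`.

```r
library(dplyr)

# Hypothetical toy data (illustrative values only).
bird_demo <- tibble(
  Value = c(10, NA, 30, 40, NA),
  `Flag Description` = c(
    "Official data", "Data not available", "FAO estimate",
    "Official data", "Data not available"
  )
)

# Rows and missing values per flag description, most frequent first.
flag_summary <- bird_demo %>%
  group_by(`Flag Description`) %>%
  summarise(
    n_rows    = n(),
    n_missing = sum(is.na(Value)),
    .groups   = "drop"
  ) %>%
  arrange(desc(n_rows))

flag_summary
```

If the missing values in the real data all fall under "Data not available", `n_missing` would equal `n_rows` for that flag and be zero elsewhere. The source does not show this, so it remains something to verify.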
Source Code
---
title: "Challenge 1"
author: "Janhvi Joshi"
description: "Reading in data and creating a post"
date: "10/20/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - challenge_1
  - railroads
  - faostat
  - wildbirds
---

```{r}
#| label: setup
#| warning: false
#| message: false

library(tidyverse)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)
```

## Challenge Overview

Today's challenge is to

1)  read in a dataset, and

2)  describe the dataset using both words and any supporting information (e.g., tables)

## Read in the Data

Read in one (or more) of the following data sets, using the correct R package and command.

-   railroad_2012_clean_county.csv ⭐
-   birds.csv ⭐⭐
-   FAOstat\*.csv ⭐⭐
-   wild_bird_data.xlsx ⭐⭐⭐
-   StateCounty2012.xls ⭐⭐⭐⭐

Find the `_data` folder, located inside the `posts` folder. Then you can read in the data, using either one of the `readr` standard tidy read commands, or a specialized package such as `readxl`.

```{r}
bird <- read_csv('_data/birds.csv')
bird  # read_csv() already returns a tibble, so no as_tibble() call is needed
```

Add any comments or documentation as needed. More challenging data sets may require additional code chunks and documentation.

## Describe the data

Description: The birds dataset contains 30,977 rows and 14 columns recording stocks of live poultry in 248 areas. These areas include individual countries and territories such as Afghanistan, Albania, Jamaica and Micronesia, as well as aggregate regions such as World, Africa and Europe. Every row belongs to the domain "Live Animals" and the element "Stocks". Each row falls into one of five item categories: "Chickens", "Ducks", "Geese and guinea fowls", "Turkeys" and "Pigeons, other birds". The dataset spans the years 1961 to 2018. According to the flag descriptions, the values come from a mix of sources: official data, FAO estimates, FAO data based on imputation methodology, unofficial figures, and aggregates that may combine official, semi-official, estimated or calculated data. Some rows are flagged "Data not available", and 1,036 values are missing. This dataset is likely useful for analysing when, where and how many livestock birds of each type were kept.

```{r}
#| label: summary
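# Count the rows in the dataset
nrow(bird)

# Rename column 4 (`Area`) and list the distinct areas
colnames(bird)[4] <- "area"
unique_areas <- unique(bird$area)
unique_areas

# Rename column 2 (`Domain`) and list the distinct domains
colnames(bird)[2] <- "domain"
unique_domains <- unique(bird$domain)
unique_domains

# Rename column 6 (`Element`) and list the distinct elements
colnames(bird)[6] <- "element"
unique_element <- unique(bird$element)
unique_element

# Rename column 8 (`Item`) and list the distinct bird categories
colnames(bird)[8] <- "item"
unique_items <- unique(bird$item)
unique_items

# Summary statistics for every column
summary(bird)

# Column 14 is `Flag Description`; column 13 (`Flag`) keeps its original name
colnames(bird)[14] <- "flag"
unique_flag <- unique(bird$flag)
unique_flag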
```