Reading in data and creating a post
Author

Nisarg Shah

Published

March 2, 2023

Code
library(tidyverse)

knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)

Read in the Data

  • birds.csv ⭐⭐
Code
birds <- read_csv("_data/birds.csv")
birds
# A tibble: 30,977 × 14
   Domain Cod…¹ Domain Area …² Area  Eleme…³ Element Item …⁴ Item  Year …⁵  Year
   <chr>        <chr>    <dbl> <chr>   <dbl> <chr>     <dbl> <chr>   <dbl> <dbl>
 1 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1961  1961
 2 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1962  1962
 3 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1963  1963
 4 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1964  1964
 5 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1965  1965
 6 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1966  1966
 7 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1967  1967
 8 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1968  1968
 9 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1969  1969
10 QA           Live …       2 Afgh…    5112 Stocks     1057 Chic…    1970  1970
# … with 30,967 more rows, 4 more variables: Unit <chr>, Value <dbl>,
#   Flag <chr>, `Flag Description` <chr>, and abbreviated variable names
#   ¹​`Domain Code`, ²​`Area Code`, ³​`Element Code`, ⁴​`Item Code`, ⁵​`Year Code`


Describe the data

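This dataset has 30,977 rows and 14 columns. It records live bird stocks (Domain "Live Animals", Element "Stocks"), and each row is one area, one type of bird, and one year. `Item` has five bird categories: Chickens, Ducks, Geese and guinea fowls, "Pigeons, other birds", and Turkeys. `Area` has 248 distinct values, each with a matching `Area Code`; they include individual countries (for example Egypt and France) as well as aggregate regions (for example Africa, Asia, and Europe). The years range from 1961 to 2018. `Value` is the size of the stock and `Unit` is always "1000 Head", so the values are in thousands of birds; 1,036 rows (3.3%) have no value. `Flag` and `Flag Description` record where each figure comes from, such as official data, FAO estimates, or unofficial figures.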
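The claims in the description above can be checked with a few `dplyr` calls. The block below is a minimal sketch: `birds_toy` is a hypothetical stand-in with made-up values and only some of the columns, so that the code runs on its own. With the real data, the same calls would be applied to `birds` instead.

```r
library(dplyr)

# Hypothetical stand-in for `birds`: same column names, made-up values.
birds_toy <- tibble::tibble(
  Area  = c("Afghanistan", "Afghanistan", "Africa", "Africa"),
  Item  = c("Chickens", "Ducks", "Chickens", "Turkeys"),
  Year  = c(1961, 1962, 1961, 2018),
  Unit  = "1000 Head",
  Value = c(4700, NA, 274201, 1500),
  `Flag Description` = c("FAO estimate", "Data not available",
                         "Official data", "Official data")
)

# How many rows does each bird category have?
item_counts <- birds_toy %>% count(Item, sort = TRUE)

# First and last year covered by the data
year_range <- range(birds_toy$Year)

# How many rows come from each kind of source?
source_counts <- birds_toy %>% count(`Flag Description`)

# How many rows have no stock value?
n_missing_value <- sum(is.na(birds_toy$Value))

item_counts
year_range
source_counts
n_missing_value
```

Counting rows per category rather than listing `unique(Item)` also shows how unevenly the categories are represented, which matters when comparing them later.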

Code
head(birds)
# A tibble: 6 × 14
  Domai…¹ Domain Area …² Area  Eleme…³ Element Item …⁴ Item  Year …⁵  Year Unit 
  <chr>   <chr>    <dbl> <chr>   <dbl> <chr>     <dbl> <chr>   <dbl> <dbl> <chr>
1 QA      Live …       2 Afgh…    5112 Stocks     1057 Chic…    1961  1961 1000…
2 QA      Live …       2 Afgh…    5112 Stocks     1057 Chic…    1962  1962 1000…
3 QA      Live …       2 Afgh…    5112 Stocks     1057 Chic…    1963  1963 1000…
4 QA      Live …       2 Afgh…    5112 Stocks     1057 Chic…    1964  1964 1000…
5 QA      Live …       2 Afgh…    5112 Stocks     1057 Chic…    1965  1965 1000…
6 QA      Live …       2 Afgh…    5112 Stocks     1057 Chic…    1966  1966 1000…
# … with 3 more variables: Value <dbl>, Flag <chr>, `Flag Description` <chr>,
#   and abbreviated variable names ¹​`Domain Code`, ²​`Area Code`,
#   ³​`Element Code`, ⁴​`Item Code`, ⁵​`Year Code`
Code
dim(birds)
[1] 30977    14
Code
library(summarytools)
dfSummary(birds)
Data Frame Summary  
birds  
Dimensions: 30977 x 14  
Duplicates: 0  

----------------------------------------------------------------------------------------------------------------------------
No   Variable           Stats / Values                   Freqs (% of Valid)      Graph                  Valid      Missing  
---- ------------------ -------------------------------- ----------------------- ---------------------- ---------- ---------
1    Domain Code        1. QA                            30977 (100.0%)          IIIIIIIIIIIIIIIIIIII   30977      0        
     [character]                                                                                        (100.0%)   (0.0%)   

2    Domain             1. Live Animals                  30977 (100.0%)          IIIIIIIIIIIIIIIIIIII   30977      0        
     [character]                                                                                        (100.0%)   (0.0%)   

3    Area Code          Mean (sd) : 1201.7 (2099.4)      248 distinct values     :                      30977      0        
     [numeric]          min < med < max:                                         :                      (100.0%)   (0.0%)   
                        1 < 156 < 5504                                           :                                          
                        IQR (CV) : 152 (1.7)                                     :                 .                        
                                                                                 :                 :                        

4    Area               1. Africa                          290 ( 0.9%)                                  30977      0        
     [character]        2. Asia                            290 ( 0.9%)                                  (100.0%)   (0.0%)   
                        3. Eastern Asia                    290 ( 0.9%)                                                      
                        4. Egypt                           290 ( 0.9%)                                                      
                        5. Europe                          290 ( 0.9%)                                                      
                        6. France                          290 ( 0.9%)                                                      
                        7. Greece                          290 ( 0.9%)                                                      
                        8. Myanmar                         290 ( 0.9%)                                                      
                        9. Northern Africa                 290 ( 0.9%)                                                      
                        10. South-eastern Asia             290 ( 0.9%)                                                      
                        [ 238 others ]                   28077 (90.6%)           IIIIIIIIIIIIIIIIII                         

5    Element Code       1 distinct value                 5112 : 30977 (100.0%)   IIIIIIIIIIIIIIIIIIII   30977      0        
     [numeric]                                                                                          (100.0%)   (0.0%)   

6    Element            1. Stocks                        30977 (100.0%)          IIIIIIIIIIIIIIIIIIII   30977      0        
     [character]                                                                                        (100.0%)   (0.0%)   

7    Item Code          Mean (sd) : 1066.5 (9)           1057 : 13074 (42.2%)    IIIIIIII               30977      0        
     [numeric]          min < med < max:                 1068 :  6909 (22.3%)    IIII                   (100.0%)   (0.0%)   
                        1057 < 1068 < 1083               1072 :  4136 (13.4%)    II                                         
                        IQR (CV) : 15 (0)                1079 :  5693 (18.4%)    III                                        
                                                         1083 :  1165 ( 3.8%)                                               

8    Item               1. Chickens                      13074 (42.2%)           IIIIIIII               30977      0        
     [character]        2. Ducks                          6909 (22.3%)           IIII                   (100.0%)   (0.0%)   
                        3. Geese and guinea fowls         4136 (13.4%)           II                                         
                        4. Pigeons, other birds           1165 ( 3.8%)                                                      
                        5. Turkeys                        5693 (18.4%)           III                                        

9    Year Code          Mean (sd) : 1990.6 (16.7)        58 distinct values      . . .   . :   : : :    30977      0        
     [numeric]          min < med < max:                                         : : : . : : : : : :    (100.0%)   (0.0%)   
                        1961 < 1992 < 2018                                       : : : : : : : : : :                        
                        IQR (CV) : 29 (0)                                        : : : : : : : : : :                        
                                                                                 : : : : : : : : : :                        

10   Year               Mean (sd) : 1990.6 (16.7)        58 distinct values      . . .   . :   : : :    30977      0        
     [numeric]          min < med < max:                                         : : : . : : : : : :    (100.0%)   (0.0%)   
                        1961 < 1992 < 2018                                       : : : : : : : : : :                        
                        IQR (CV) : 29 (0)                                        : : : : : : : : : :                        
                                                                                 : : : : : : : : : :                        

11   Unit               1. 1000 Head                     30977 (100.0%)          IIIIIIIIIIIIIIIIIIII   30977      0        
     [character]                                                                                        (100.0%)   (0.0%)   

12   Value              Mean (sd) : 99410.6 (720611.4)   11495 distinct values   :                      29941      1036     
     [numeric]          min < med < max:                                         :                      (96.7%)    (3.3%)   
                        0 < 1800 < 23707134                                      :                                          
                        IQR (CV) : 15233 (7.2)                                   :                                          
                                                                                 :                                          

13   Flag               1. *                              1494 ( 7.4%)           I                      20204      10773    
     [character]        2. A                              6488 (32.1%)           IIIIII                 (65.2%)    (34.8%)  
                        3. F                             10007 (49.5%)           IIIIIIIII                                  
                        4. Im                             1213 ( 6.0%)           I                                          
                        5. M                              1002 ( 5.0%)                                                      

14   Flag Description   1. Aggregate, may include of      6488 (20.9%)           IIII                   30977      0        
     [character]        2. Data not available             1002 ( 3.2%)                                  (100.0%)   (0.0%)   
                        3. FAO data based on imputat      1213 ( 3.9%)                                                      
                        4. FAO estimate                  10007 (32.3%)           IIIIII                                     
                        5. Official data                 10773 (34.8%)           IIIIII                                     
                        6. Unofficial figure              1494 ( 4.8%)                                                      
----------------------------------------------------------------------------------------------------------------------------
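In the summary above, `Flag` is missing in 10,773 rows, which is exactly the number of rows described as "Official data", and the other flag counts also match one description each (for example `M` and "Data not available", both 1,002). This suggests that a missing flag is not a gap in the data but the code for official figures. The sketch below shows one way to confirm that pairing and to see which sources account for the missing values. `birds_toy` is a hypothetical stand-in with made-up values, and the flag pairings in it are an assumption inferred from the matching counts.

```r
library(dplyr)

# Hypothetical stand-in for `birds`; values are made up, and the
# Flag / Flag Description pairs are assumed from the matching counts.
birds_toy <- tibble::tibble(
  Flag = c(NA, "F", "M", "*", NA),
  `Flag Description` = c("Official data", "FAO estimate",
                         "Data not available", "Unofficial figure",
                         "Official data"),
  Value = c(4700, 5000, NA, 120, 310)
)

# One row per Flag / Flag Description combination. If each flag maps to
# exactly one description, the table has as many rows as there are flags.
flag_lookup <- birds_toy %>% count(Flag, `Flag Description`)

# Rows and missing values per source
missing_by_source <- birds_toy %>%
  group_by(`Flag Description`) %>%
  summarise(
    n_rows    = n(),
    n_missing = sum(is.na(Value)),
    .groups   = "drop"
  )

flag_lookup
missing_by_source
```

On the real data, `missing_by_source` would show whether the 1,036 missing values are confined to the "Data not available" rows or spread across other sources as well.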