challenge_7
Visualization in Multiple Dimensions
Author

Daniel Manning

Published

January 16, 2023

Code
library(tidyverse)
library(ggplot2)
library(here)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Read in data

Code
egg <- here("posts","_data","FAOSTAT_egg_chicken.csv")%>%
  read_csv()
egg_new <- egg %>% 
  select(-c("Domain Code", "Domain", "Area Code", "Element Code", "Item Code", "Year Code", "Flag"))
egg_new
# A tibble: 38,170 × 7
   Area        Element    Item                 Year Unit      Value Flag Descr…¹
   <chr>       <chr>      <chr>               <dbl> <chr>     <dbl> <chr>       
 1 Afghanistan Laying     Eggs, hen, in shell  1961 1000 Head  4000 FAO estimate
 2 Afghanistan Yield      Eggs, hen, in shell  1961 100mg/An  25000 Calculated …
 3 Afghanistan Production Eggs, hen, in shell  1961 tonnes    10000 FAO estimate
 4 Afghanistan Laying     Eggs, hen, in shell  1962 1000 Head  4400 FAO estimate
 5 Afghanistan Yield      Eggs, hen, in shell  1962 100mg/An  25000 Calculated …
 6 Afghanistan Production Eggs, hen, in shell  1962 tonnes    11000 FAO estimate
 7 Afghanistan Laying     Eggs, hen, in shell  1963 1000 Head  4600 FAO estimate
 8 Afghanistan Yield      Eggs, hen, in shell  1963 100mg/An  25000 Calculated …
 9 Afghanistan Production Eggs, hen, in shell  1963 tonnes    11500 FAO estimate
10 Afghanistan Laying     Eggs, hen, in shell  1964 1000 Head  4800 FAO estimate
# … with 38,160 more rows, and abbreviated variable name ¹​`Flag Description`
Code
egg_production <- egg_new[egg_new$Element == 'Production',]
egg_production
# A tibble: 12,840 × 7
   Area        Element    Item                 Year Unit   Value Flag Descript…¹
   <chr>       <chr>      <chr>               <dbl> <chr>  <dbl> <chr>          
 1 Afghanistan Production Eggs, hen, in shell  1961 tonnes 10000 FAO estimate   
 2 Afghanistan Production Eggs, hen, in shell  1962 tonnes 11000 FAO estimate   
 3 Afghanistan Production Eggs, hen, in shell  1963 tonnes 11500 FAO estimate   
 4 Afghanistan Production Eggs, hen, in shell  1964 tonnes 12000 FAO estimate   
 5 Afghanistan Production Eggs, hen, in shell  1965 tonnes 12800 FAO estimate   
 6 Afghanistan Production Eggs, hen, in shell  1966 tonnes 13200 FAO estimate   
 7 Afghanistan Production Eggs, hen, in shell  1967 tonnes 15000 FAO estimate   
 8 Afghanistan Production Eggs, hen, in shell  1968 tonnes 13900 FAO estimate   
 9 Afghanistan Production Eggs, hen, in shell  1969 tonnes 15000 FAO estimate   
10 Afghanistan Production Eggs, hen, in shell  1970 tonnes 13700 FAO estimate   
# … with 12,830 more rows, and abbreviated variable name ¹​`Flag Description`

Time Dependent Visualization

I chose a scatterplot to display production over time because both variables are numeric and a scatterplot can reveal the relationship between them. I also mapped color to the “Flag Description” variable, which describes how each value was calculated or estimated. The scatterplot shows that egg production (in tonnes) increased from 1960 to 2020, with four areas experiencing much greater growth than the others. The color mapping also shows that the majority of values were calculated from unofficial data, which makes it difficult to tell whether this variable had an effect.

Code
ggplot(egg_production, aes(x=Year, y=Value, color=`Flag Description`)) +
  geom_point(size=.5)+
  labs(title = "Production of Eggs by Year", x = "Year", y = "Production (tonnes)")
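With thousands of points on one panel, it can be hard to tell which areas make up the steep upper curves. One possible check is to rank areas by their production in the most recent year. The sketch below is only an illustration: `toy` is a hypothetical stand-in with invented values (not FAOSTAT figures) that mimics the `Area`, `Year`, and `Value` columns of `egg_production`, and `top_areas` is a name introduced here.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for egg_production; the values are invented.
toy <- tibble(
  Area  = rep(c("A", "B", "C"), each = 2),
  Year  = rep(c(1961, 2020), times = 3),
  Value = c(100, 900, 200, 250, 50, 400)
)

# Keep each area's most recent row, then sort by production, largest first.
top_areas <- toy %>%
  group_by(Area) %>%
  slice_max(Year, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(desc(Value))

top_areas
```

Assuming the real table has the same three columns, replacing `toy` with `egg_production` should list the areas behind the highest points in the plot.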

Visualizing Part-Whole Relationships over Time

I chose a scatterplot of production values above 5,000,000 tonnes in order to compare the largest producing areas. I mapped color to year in order to see how production changed over time. This plot reveals that Asian regions, including “Asia”, “China, mainland”, and “Eastern Asia”, experienced the greatest growth in egg production from 1960 to 2020.

Code
egg_new <- na.omit(egg_new)
egg_new <- egg_new[egg_new$Element == "Production",]
egg_large <- egg_new[egg_new$Value > 5000000,]
# labs() must be chained with `+`; run on its own it only prints the label list.
ggplot(egg_large, aes(x=Area,y=Value, color = Year)) + 
    geom_point() +
    theme(axis.text.x = element_text(angle = 90, vjust = 1, hjust=1)) +
    labs(title = "Production of Eggs by Year", x = "Area", y = "Production (tonnes)")
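A part-whole relationship can also be made explicit by computing each area's share of the yearly total. The sketch below is a rough illustration under stated assumptions: `toy` is a hypothetical table with invented values, and its areas do not overlap. The real data mixes overlapping areas (for example “Asia”, “Eastern Asia”, and “China, mainland”), so a subset of non-overlapping areas would need to be chosen before shares are meaningful.

```r
library(dplyr)
library(tibble)
library(ggplot2)

# Hypothetical stand-in with non-overlapping areas; the values are invented.
toy <- tibble(
  Area  = rep(c("A", "B"), times = 2),
  Year  = rep(c(1961, 2020), each = 2),
  Value = c(30, 70, 150, 50)
)

# Share of each year's total contributed by each area.
shares <- toy %>%
  group_by(Year) %>%
  mutate(Share = Value / sum(Value)) %>%
  ungroup()

# Stacked bars: each bar sums to 1, so the segments show the parts of the whole.
p <- ggplot(shares, aes(x = factor(Year), y = Share, fill = Area)) +
  geom_col() +
  labs(x = "Year", y = "Share of total production")

shares
```

Printing `p` draws one stacked bar per year, which makes a change in an area's share easy to compare across years.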