Challenge 8

challenge_8
faostat
Joining Data
Author

Cristhian Barba Garzon

Published

January 16, 2022

library(tidyverse)
library(ggplot2)
library(readxl)
library(treemap)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Reading and Describing Data

chicken = read_csv("_data/FAOSTAT_egg_chicken.csv")
chicken
country = read_csv("_data/FAOSTAT_country_groups.csv")
country

Tidy Data (as needed)

Data is already tidy; all columns have their respective values–none of the columns need to be mutated to further understand the data. We could possibly remove the column “year” or “year code” as they both represent the same values.

Joining and Describing Data

First, I downsized both data frames in order to choose what specific columns I wanted to look at. This would help minimize time looking through all columns in each data frame. To join the columns, I used a left_join with the country groups as the x-value and the chicken products as the y-value. I wanted to join them using their countries as the equal values, so I had to equate the columns “Country Code” and “Area Code” in the “by” parameter because they represent the same things in both data frames. I also placed the “keep” parameter as TRUE in order to see if all the column data matched once the data frame printed. Afterwards, I removed the similar columns, the area code and the area, to reduce the size of the data frame–both columns were already represented by other similar columns. As you can see in the produce table, the M49 codes from the country groups are now matched with their respective domains, items, and units from the chickens table. If we wanted to add additional columns from the original country groups data, we could simply backtrack to the select portion, and select any additional columns to add before joining the two data sets.

Joining Data

chicken_new = chicken %>%
  select(Domain, `Area Code`, Area, Item, Unit, Value ) #downsides data frame to necessary columns 
chicken_new
country_new = country %>%
  select(`Country Group`, `Country Code`, Country, `M49 Code`) #downsized data frame to necessary columns
country_new
joined = country_new %>%
  left_join(chicken_new, by = c("Country Code"="Area Code"), keep = TRUE) %>% #joins common values from chicken_new to country_new
  subset(select = -c(`Area Code`, Area)) #removed area code and area to clean data/duplicate columns used to join
joined

Analyzing Joined Data

Now that the data is joined, I chose to use a treemap to analyze the value column. This treemap represents the mean values in each country–specifically in the country group of Africa.

joined_plot = joined %>%
  filter(`Country Group` == "Africa") %>%
  group_by(Country) %>%
  summarize(mean_v = mean(Value)) %>%
  treemap(index = "Country" , vSize = "mean_v", type = "index")

joined_plot
$tm
                            Country      vSize vColor     stdErr vColorValue
1                           Algeria  63990.247      1  63990.247          NA
2                            Angola  14301.667      1  14301.667          NA
3                             Benin  11870.305      1  11870.305          NA
4                          Botswana  12250.580      1  12250.580          NA
5                      Burkina Faso  21668.655      1  21668.655          NA
6                           Burundi  11025.339      1  11025.339          NA
7                        Cabo Verde  17509.414      1  17509.414          NA
8                          Cameroon  15424.621      1  15424.621          NA
9          Central African Republic  11928.253      1  11928.253          NA
10                             Chad  16817.621      1  16817.621          NA
11                          Comoros  15974.368      1  15974.368          NA
12                            Congo   9286.431      1   9286.431          NA
13                    Côte d'Ivoire  20653.466      1  20653.466          NA
14 Democratic Republic of the Congo  12800.989      1  12800.989          NA
15                            Egypt  95552.241      1  95552.241          NA
16                Equatorial Guinea  13302.759      1  13302.759          NA
17                          Eritrea  14407.385      1  14407.385          NA
18                         Eswatini  13549.609      1  13549.609          NA
19                     Ethiopia PDR  41150.615      1  41150.615          NA
20                         Ethiopia  30446.436      1  30446.436          NA
21                            Gabon  11760.276      1  11760.276          NA
22                           Gambia  11474.218      1  11474.218          NA
23                            Ghana  16696.897      1  16696.897          NA
24                           Guinea  17089.701      1  17089.701          NA
25                    Guinea-Bissau  16230.069      1  16230.069          NA
26                            Kenya  29968.264      1  29968.264          NA
27                          Lesotho  13319.672      1  13319.672          NA
28                          Liberia  18347.385      1  18347.385          NA
29                            Libya  34736.603      1  34736.603          NA
30                       Madagascar  15595.385      1  15595.385          NA
31                           Malawi  17974.563      1  17974.563          NA
32                             Mali  14829.937      1  14829.937          NA
33                       Mauritania  10831.672      1  10831.672          NA
34                        Mauritius  30681.885      1  30681.885          NA
35                          Morocco  78656.632      1  78656.632          NA
36                       Mozambique  19003.787      1  19003.787          NA
37                          Namibia  11312.356      1  11312.356          NA
38                            Niger  12382.247      1  12382.247          NA
39                          Nigeria 144759.540      1 144759.540          NA
40                          Réunion  36912.253      1  36912.253          NA
41                           Rwanda  11052.063      1  11052.063          NA
42            Sao Tome and Principe  13279.500      1  13279.500          NA
43                          Senegal  14839.494      1  14839.494          NA
44                       Seychelles  24407.753      1  24407.753          NA
45                     Sierra Leone  10783.500      1  10783.500          NA
46                          Somalia  11587.592      1  11587.592          NA
47                     South Africa 127497.575      1 127497.575          NA
48                   Sudan (former)  30982.052      1  30982.052          NA
49                            Sudan  40646.952      1  40646.952          NA
50                             Togo  16786.960      1  16786.960          NA
51                          Tunisia  34973.230      1  34973.230          NA
52                           Uganda  18488.810      1  18488.810          NA
53      United Republic of Tanzania  36559.040      1  36559.040          NA
54                           Zambia  26045.144      1  26045.144          NA
55                         Zimbabwe  19340.874      1  19340.874          NA
   level        x0         y0          w          h   color
1      1 0.2512797 0.45420615 0.17855345 0.24483875 #E19B79
2      1 0.7089374 0.40753152 0.07641633 0.12786011 #C6A856
3      1 0.8630713 0.28963136 0.06878315 0.11790016 #9FB454
4      1 0.7089374 0.00000000 0.08501497 0.09844550 #6ABD74
5      1 0.4298331 0.00000000 0.11335518 0.13059459 #05C19E
6      1 0.9375925 0.16893623 0.06240750 0.12069513 #00BFC5
7      1 0.5431883 0.38557228 0.08982305 0.13317369 #55B6E1
8      1 0.6330114 0.39660149 0.07592599 0.13879014 #9DA7EC
9      1 0.7939523 0.28963136 0.06911894 0.11790016 #CE96E4
10     1 0.5431883 0.12767883 0.08982305 0.12791203 #E88ECA
11     1 0.8203670 0.53539163 0.09089472 0.12006588 #ED90A4
12     1 0.9624456 0.00000000 0.03755439 0.16893623 #D2A361
13     1 0.5431883 0.65545751 0.09844849 0.14332386 #B0B050
14     1 0.7089374 0.19794907 0.08501497 0.10286857 #82BA65
15     1 0.0000000 0.00000000 0.25127969 0.25978741 #3EC08D
16     1 0.9289210 0.40753152 0.07107899 0.12786011 #00C1B6
17     1 0.6330114 0.00000000 0.07592599 0.12963709 #28BBD7
18     1 0.7853537 0.40753152 0.07239796 0.12786011 #82ADEA
19     1 0.2512797 0.29675614 0.17855345 0.15745001 #BE9CEA
20     1 0.4298331 0.61528406 0.11335518 0.18349731 #E190D6
21     1 0.9318544 0.28963136 0.06814558 0.11790016 #ED8EB4
22     1 0.7939523 0.09531615 0.08108139 0.09667997 #E7968A
23     1 0.6330114 0.53539163 0.09500594 0.12006588 #DC9E70
24     1 0.5431883 0.25559085 0.08982305 0.12998143 #C0AB52
25     1 0.7280173 0.53539163 0.09234967 0.12006588 #97B658
26     1 0.4298331 0.43466863 0.11335518 0.18061542 #5DBE7D
27     1 0.8577516 0.40753152 0.07116936 0.12786011 #00C1A7
28     1 0.9125439 0.65545751 0.08745614 0.14332386 #00BECB
29     1 0.6727001 0.79878137 0.11793800 0.20121863 #64B3E4
30     1 0.9112617 0.53539163 0.08873830 0.12006588 #A9A3EC
31     1 0.5431883 0.51874597 0.08982305 0.13671154 #D694E0
32     1 0.6330114 0.12963709 0.07592599 0.13343920 #EB8DC3
33     1 0.8750337 0.08427987 0.08741189 0.08465636 #EC929B
34     1 0.8958286 0.79878137 0.10417139 0.20121863 #CDA55B
35     1 0.2512797 0.69904490 0.17855345 0.30095510 #A9B251
36     1 0.7338286 0.65545751 0.09058500 0.14332386 #76BC6D
37     1 0.7939523 0.00000000 0.08108139 0.09531615 #28C196
38     1 0.7089374 0.09844550 0.08501497 0.09950357 #00C0BE
39     1 0.0000000 0.60642779 0.25127969 0.39357221 #40B9DC
40     1 0.2512797 0.00000000 0.17855345 0.14123324 #91AAEC
41     1 0.8750337 0.16893623 0.06255877 0.12069513 #C699E7
42     1 0.7089374 0.30081764 0.08501497 0.10671388 #E58FD1
43     1 0.6330114 0.26307629 0.07592599 0.13352520 #EE8FAC
44     1 0.4298331 0.13059459 0.11335518 0.14710283 #E49981
45     1 0.8750337 0.00000000 0.08741189 0.08427987 #D8A069
46     1 0.7939523 0.19199612 0.08108139 0.09763524 #B9AD50
47     1 0.0000000 0.25978741 0.25127969 0.34664038 #8CB85F
48     1 0.7906381 0.79878137 0.10519052 0.20121863 #4FBF85
49     1 0.2512797 0.14123324 0.17855345 0.15552290 #00C1AE
50     1 0.5431883 0.00000000 0.08982305 0.12767883 #00BDD1
51     1 0.5539587 0.79878137 0.11874140 0.20121863 #75B0E8
52     1 0.8244136 0.65545751 0.08813027 0.14332386 #B4A0EC
53     1 0.4298331 0.79878137 0.12412555 0.20121863 #DB92DC
54     1 0.4298331 0.27769743 0.11335518 0.15697121 #ED8EBC
55     1 0.6416368 0.65545751 0.09219178 0.14332386 #EA9492

$type
[1] "index"

$vSize
[1] "mean_v"

$vColor
[1] NA

$stdErr
[1] "mean_v"

$algorithm
[1] "pivotSize"

$vpCoorX
[1] 0.02812148 0.97187852

$vpCoorY
[1] 0.01968504 0.91031496

$aspRatio
[1] 1.483512

$range
[1] NA

$mapping
[1] NA NA NA

$draw
[1] TRUE