HW 2

Reading in Data
Author

Rishita Golla

Published

January 21, 2023

 #| label: setup
 #| warning: false
 #| message: false

 library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0     ✔ purrr   1.0.1
✔ tibble  3.1.8     ✔ dplyr   1.1.0
✔ tidyr   1.3.0     ✔ stringr 1.5.0
✔ readr   2.1.3     ✔ forcats 1.0.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
 library(ggplot2)

 knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Read in data

 data <- read_csv("_data/FAOSTAT_country_groups.csv")
 view(data)

The data is already tidy and doesn’t require more cleaning.

Narrative, Variables, and Research Question

This data set contains seven variables: Country Group Code: integer Country Group: string Country Code: integer Country: string M49 Code: integer ISO2 Code: integer ISO3 Code: integer

Country Group Code are codes for Country Group (second column). The data also contains country code, country name, and three codes for countries - M49 code (codes for statistical use), ISO2 code (representing country name in 2 letters), and ISO3 code (representing country name in 3 letters).

Hence the overall data set contains countries in the world grouped by different types of country groups (Eg: Africa, America, low-income economies etc). It also contains metadata about countries such as the different types of codes.

One research question we could investigate based on the data is the number of high-income economies that are present in important country groups? When we look closely at the data we see that 5100, 5200, 5300, 5400, 5500, 5600 represent Africa, Americas, Asia, Europe, Oceania, and Antarctic Region resp. We can deduce data such as which world group has the most high-income countries?

The country groups are further divided into sub-country groups. For example, Europe contains Northern Europe, Eastern Europe, Western Europe, and Southern Europe. Another research question we can deduce is which part of Europe has highest number of low-income countries?

Another research question is - the least developed countries belong to which major country groups? To get this data, we can extract the ISO3 code (choosing this has less probability for redundancy) corresponding to least developed countries (which is a country group) and check their continent occurrence.