# load libraries.
library(tidyverse)
library(ggplot2)
library(summarytools)
knitr::opts_chunk$set(echo = TRUE)
Saaradhaa M
October 9, 2022
Research in the social sciences has repeatedly stressed the need for more work on the Global South, yet few papers actually focus on it. I am therefore interested in learning more about this region. A data source well suited to this is the World Values Survey, a global survey with an easily accessible database.
I am specifically interested in understanding what drives subjective well-being, which can be interpreted via happiness and life satisfaction (Addai et al., 2013).
This project will be useful to better understand motivations and desires in the Global South, reduce inter-cultural tensions and enhance cross-cultural cohesion. Governments can also benefit from this research in terms of policy prioritization to maximize citizens’ well-being.
Past researchers have studied happiness and life satisfaction in the Global South via the World Values Survey (Addai et al., 2013; Ngamaba, 2016). The studies focused on Ghana and Rwanda respectively. The common predictors of happiness and life satisfaction across both countries were satisfaction with health and income.
To the best of my knowledge, few studies comparing well-being in the Global North and South exist. Alba (2019) found that happiness was generally greater in the Global North than the Global South, and indicated that future research should attempt to cover the factors behind this. I think happiness and well-being in the Global North may depend on more subjective measures, given that basic health and income needs are more likely to be met there.
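Given the above, we can frame our hypotheses as follows:
H0A: Health and financial satisfaction will not be statistically significant predictors of happiness and life satisfaction in the Global South.
H1A: Health and financial satisfaction will be statistically significant predictors of happiness and life satisfaction in the Global South.
H0B: Predictors of happiness and life satisfaction will not differ between the Global North and South.
H1B: Predictors of happiness and life satisfaction will differ between the Global North and South.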
I will be working with the most recent wave of the World Values Survey, Wave 7, which was conducted from 2017 to 2022. The data is freely available for non-profit purposes. It must be cited properly and not re-distributed (Haerpfer et al., 2022).
Representative samples of the population aged 18 and above were collected from 59 countries. Data was mostly collected by interviewing respondents at their homes (“WVS Database”, 2022).
I am using the most recent version of Wave 7, released in May 2022. The final version of the dataset is due for release in October 2022. If time permits, I may switch to the updated dataset at a later date.
I will indicate my comments in each code chunk to keep track of my progress.
# read in dataset.
wvs <- read_csv("~/Desktop/2022_Fall/DACSS 603/General/Final Project/WVS/4. Data/WVS_Cross-National_Wave_7_csv_v4_0.csv", show_col_types = FALSE) %>% select("A_YEAR", "B_COUNTRY_ALPHA", "Q_MODE", "G_TOWNSIZE", "H_SETTLEMENT", "H_URBRURAL", "O1_LONGITUDE", "O2_LATITUDE", "Q1", "Q2", "Q3", "Q6", "Q46", "Q47", "Q48", "Q49", "Q50", "Q57", "Q171", "Q260", "Q262", "Q263", "Q269", "Q270", "Q271", "Q273", "Q274", "Q275", "Q279", "Q288", "Q288R", "Q289", "Q290", "I_WOMJOB", "I_WOMPOL", "I_WOMEDU", "I_HOMOLIB", "I_ABORTLIB", "womenparl")
The dataset originally had 552 columns. I have selected a subset of columns based on variables used in past papers, as well as some variables I am interested in examining. These include place/area of residence, literacy, demographics, importance of various social aspects, happiness and wellbeing indicators, trust, religiosity, equality of gender/sexual orientation and abortion attitudes.
I will first create a dummy variable for Global North/South. The Global South comprises low- and lower-middle income countries, as defined by the World Bank (“World Bank Country and Lending Groups”, 2022). Global South countries surveyed include Ethiopia, Philippines, Indonesia, Bangladesh, Iran, Kenya, Bolivia, Kyrgyzstan, Lebanon, Tajikistan, Tunisia, Ukraine, Mongolia, Morocco, Egypt, Myanmar, Vietnam, Nicaragua, Zimbabwe, Nigeria and Pakistan.
# create dummy.
wvs <- mutate(wvs, NS = case_when(B_COUNTRY_ALPHA == "ETH" | B_COUNTRY_ALPHA == "PHL" | B_COUNTRY_ALPHA == "IDN" | B_COUNTRY_ALPHA == "BGD" | B_COUNTRY_ALPHA == "IRN" | B_COUNTRY_ALPHA == "KEN" | B_COUNTRY_ALPHA == "BOL" | B_COUNTRY_ALPHA == "KGZ" | B_COUNTRY_ALPHA == "LBN" | B_COUNTRY_ALPHA == "TJK" | B_COUNTRY_ALPHA == "TUN" | B_COUNTRY_ALPHA == "MOR" | B_COUNTRY_ALPHA == "UKR" | B_COUNTRY_ALPHA == "MNG" | B_COUNTRY_ALPHA == "EGY" | B_COUNTRY_ALPHA == "MMR" | B_COUNTRY_ALPHA == "VNM" | B_COUNTRY_ALPHA == "NIC" | B_COUNTRY_ALPHA == "ZWE" | B_COUNTRY_ALPHA == "NGA" | B_COUNTRY_ALPHA == "PAK" ~ "1"))
# replace NA with "0" (for Global North).
wvs$NS <- replace_na(wvs$NS, "0")
# change to factor.
wvs$NS <- as.factor(wvs$NS)
# check counts of levels.
wvs %>% select(NS) %>% summary()
NS
0:59178
1:28644
# sanity check.
wvs %>% filter(B_COUNTRY_ALPHA == "ETH" | B_COUNTRY_ALPHA == "PHL" | B_COUNTRY_ALPHA == "IDN" | B_COUNTRY_ALPHA == "BGD" | B_COUNTRY_ALPHA == "IRN" | B_COUNTRY_ALPHA == "KEN" | B_COUNTRY_ALPHA == "BOL" | B_COUNTRY_ALPHA == "KGZ" | B_COUNTRY_ALPHA == "LBN" | B_COUNTRY_ALPHA == "TJK" | B_COUNTRY_ALPHA == "TUN" | B_COUNTRY_ALPHA == "MOR" | B_COUNTRY_ALPHA == "UKR" | B_COUNTRY_ALPHA == "MNG" | B_COUNTRY_ALPHA == "EGY" | B_COUNTRY_ALPHA == "MMR" | B_COUNTRY_ALPHA == "VNM" | B_COUNTRY_ALPHA == "NIC" | B_COUNTRY_ALPHA == "ZWE" | B_COUNTRY_ALPHA == "NGA" | B_COUNTRY_ALPHA == "PAK") %>% nrow()
[1] 28644
# rename columns.
names(wvs) <- c("A_YEAR", "B_COUNTRY_ALPHA", "Q_MODE", "G_TOWNSIZE", "H_SETTLEMENT", "H_URBRURAL", "Long", "Lat", "FamImpt", "FriendsImpt", "LeisureImpt", "ReligionImpt", "Happiness", "PerceivedHealth", "FOC", "LS", "FS", "Trust", "AttendReligious", "Sex", "Age", "Immigrant", "Citizen", "HHSize", "Parents", "Married", "Kids", "Edu", "Job", "Income", "IncomeR", "Religion", "Race", "I_WOMJOB", "I_WOMPOL", "I_WOMEDU", "I_HOMOLIB", "I_ABORTLIB", "womenparl", "NS")
The sanity check returns 28,644 rows, matching the count for level “1” of the dummy, so 28,644 respondents are coded as Global South.
tibble [87,822 × 40] (S3: tbl_df/tbl/data.frame)
$ A_YEAR : num [1:87822] 2019 2019 2019 2019 2019 ...
$ B_COUNTRY_ALPHA: chr [1:87822] "CYP" "CYP" "CYP" "CYP" ...
$ Q_MODE : num [1:87822] 2 2 2 2 2 2 2 2 2 2 ...
$ G_TOWNSIZE : num [1:87822] 6 6 6 6 6 6 6 6 6 6 ...
$ H_SETTLEMENT : num [1:87822] 4 4 4 4 4 4 4 4 4 4 ...
$ H_URBRURAL : num [1:87822] 1 1 1 1 1 1 1 1 1 1 ...
$ Long : num [1:87822] 34.8 34.8 34.8 34.8 34.8 ...
$ Lat : num [1:87822] 32.4 32.4 32.4 32.4 32.5 ...
$ FamImpt : num [1:87822] 1 1 1 1 1 1 2 1 1 1 ...
$ FriendsImpt : num [1:87822] 1 3 2 2 NA 1 2 2 1 2 ...
$ LeisureImpt : num [1:87822] 1 1 1 1 2 1 2 1 1 2 ...
$ ReligionImpt : num [1:87822] 1 1 1 1 1 3 2 1 3 1 ...
$ Happiness : num [1:87822] 2 1 2 2 3 2 2 1 2 3 ...
$ PerceivedHealth: num [1:87822] 4 2 1 3 3 1 1 1 1 4 ...
$ FOC : num [1:87822] 10 5 5 5 3 7 5 5 5 NA ...
$ LS : num [1:87822] 8 7 9 5 5 8 4 7 8 9 ...
$ FS : num [1:87822] 8 5 5 5 5 7 3 5 8 4 ...
$ Trust : num [1:87822] 2 2 2 2 2 2 2 2 2 2 ...
$ AttendReligious: num [1:87822] 7 2 3 2 4 4 2 4 4 2 ...
$ Sex : num [1:87822] 1 2 2 2 2 1 1 2 1 2 ...
$ Age : num [1:87822] 61 61 42 64 52 39 61 25 36 77 ...
$ Immigrant : num [1:87822] 1 1 2 1 2 2 1 1 1 1 ...
$ Citizen : num [1:87822] 1 1 1 1 1 1 1 1 1 1 ...
$ HHSize : num [1:87822] 2 4 6 2 8 1 2 3 2 3 ...
$ Parents : num [1:87822] 1 1 1 1 1 1 1 1 1 1 ...
$ Married : num [1:87822] 1 1 1 1 5 3 1 1 2 1 ...
$ Kids : num [1:87822] 2 2 4 2 3 2 2 1 2 3 ...
$ Edu : num [1:87822] 1 1 4 3 3 3 2 3 1 0 ...
$ Job : num [1:87822] 1 1 5 1 1 1 1 7 1 5 ...
$ Income : num [1:87822] 5 5 3 5 3 5 3 5 7 3 ...
$ IncomeR : num [1:87822] 2 2 1 2 1 2 1 2 2 1 ...
$ Religion : num [1:87822] 3 3 3 3 3 3 3 3 3 3 ...
$ Race : num [1:87822] 196001 196001 196001 196001 196001 ...
$ I_WOMJOB : num [1:87822] 0.75 0.5 1 0.75 0.5 0.5 0.5 0.75 0.5 0.5 ...
$ I_WOMPOL : num [1:87822] 0.66 NA 1 0.66 NA 0.33 0.66 0.66 NA 0.33 ...
$ I_WOMEDU : num [1:87822] 0.66 1 1 0.66 0.33 1 0.66 0.66 0.66 0.66 ...
$ I_HOMOLIB : num [1:87822] 0 0 0.444 0 0 ...
$ I_ABORTLIB : num [1:87822] 0 0 0 0 0 ...
$ womenparl : num [1:87822] 17.9 17.9 17.9 17.9 17.9 ...
$ NS : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
A_YEAR B_COUNTRY_ALPHA Q_MODE G_TOWNSIZE
Min. :2017 Length:87822 Min. :1.000 Min. :1.000
1st Qu.:2018 Class :character 1st Qu.:1.000 1st Qu.:3.000
Median :2018 Mode :character Median :2.000 Median :6.000
Mean :2019 Mean :1.734 Mean :5.312
3rd Qu.:2020 3rd Qu.:2.000 3rd Qu.:8.000
Max. :2022 Max. :5.000 Max. :8.000
NA's :1274
H_SETTLEMENT H_URBRURAL Long Lat
Min. :1.000 Min. :1.000 Min. :-156.34 Min. :-43.26
1st Qu.:2.000 1st Qu.:1.000 1st Qu.: 7.66 1st Qu.: 6.99
Median :3.000 Median :1.000 Median : 39.94 Median : 24.75
Mean :3.066 Mean :1.318 Mean : 36.16 Mean : 21.35
3rd Qu.:5.000 3rd Qu.:2.000 3rd Qu.: 100.27 3rd Qu.: 35.70
Max. :5.000 Max. :2.000 Max. : 156.89 Max. :100.35
NA's :207 NA's :32 NA's :27098 NA's :27094
FamImpt FriendsImpt LeisureImpt ReligionImpt
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:1.000 1st Qu.:1.000 1st Qu.:1.000 1st Qu.:1.000
Median :1.000 Median :2.000 Median :2.000 Median :2.000
Mean :1.112 Mean :1.721 Mean :1.788 Mean :1.938
3rd Qu.:1.000 3rd Qu.:2.000 3rd Qu.:2.000 3rd Qu.:3.000
Max. :4.000 Max. :4.000 Max. :4.000 Max. :4.000
NA's :146 NA's :289 NA's :473 NA's :831
Happiness PerceivedHealth FOC LS
Min. :1.000 Min. :1.000 Min. : 1.000 Min. : 1.000
1st Qu.:1.000 1st Qu.:2.000 1st Qu.: 6.000 1st Qu.: 6.000
Median :2.000 Median :2.000 Median : 7.000 Median : 7.000
Mean :1.857 Mean :2.194 Mean : 7.203 Mean : 7.043
3rd Qu.:2.000 3rd Qu.:3.000 3rd Qu.: 9.000 3rd Qu.: 9.000
Max. :4.000 Max. :5.000 Max. :10.000 Max. :10.000
NA's :574 NA's :254 NA's :800 NA's :393
FS Trust AttendReligious Sex
Min. : 1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.: 5.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:1.000
Median : 6.000 Median :2.000 Median :4.000 Median :2.000
Mean : 6.172 Mean :1.765 Mean :4.139 Mean :1.525
3rd Qu.: 8.000 3rd Qu.:2.000 3rd Qu.:6.000 3rd Qu.:2.000
Max. :10.000 Max. :2.000 Max. :7.000 Max. :2.000
NA's :545 NA's :1198 NA's :1034 NA's :62
Age Immigrant Citizen HHSize
Min. : 16.00 Min. :1.000 Min. :1.000 Min. : 1.000
1st Qu.: 29.00 1st Qu.:1.000 1st Qu.:1.000 1st Qu.: 2.000
Median : 41.00 Median :1.000 Median :1.000 Median : 4.000
Mean : 42.85 Mean :1.059 Mean :1.022 Mean : 3.945
3rd Qu.: 55.00 3rd Qu.:1.000 3rd Qu.:1.000 3rd Qu.: 5.000
Max. :103.00 Max. :2.000 Max. :2.000 Max. :63.000
NA's :339 NA's :344 NA's :5164 NA's :852
Parents Married Kids Edu Job
Min. :1.000 Min. :1.00 Min. : 0.000 Min. :0.000 Min. :1.00
1st Qu.:1.000 1st Qu.:1.00 1st Qu.: 0.000 1st Qu.:2.000 1st Qu.:1.00
Median :1.000 Median :1.00 Median : 2.000 Median :3.000 Median :3.00
Mean :1.353 Mean :2.65 Mean : 1.766 Mean :3.546 Mean :3.13
3rd Qu.:2.000 3rd Qu.:5.00 3rd Qu.: 3.000 3rd Qu.:5.000 3rd Qu.:5.00
Max. :4.000 Max. :6.00 Max. :24.000 Max. :8.000 Max. :8.00
NA's :1438 NA's :504 NA's :1201 NA's :818 NA's :1143
Income IncomeR Religion Race
Min. : 1.000 Min. :1.000 Min. :0.000 Min. : 20001
1st Qu.: 3.000 1st Qu.:1.000 1st Qu.:1.000 1st Qu.:158002
Median : 5.000 Median :2.000 Median :3.000 Median :410004
Mean : 4.859 Mean :1.841 Mean :3.005 Mean :416252
3rd Qu.: 6.000 3rd Qu.:2.000 3rd Qu.:5.000 3rd Qu.:630001
Max. :10.000 Max. :3.000 Max. :9.000 Max. :862005
NA's :2330 NA's :2330 NA's :2485 NA's :9486
I_WOMJOB I_WOMPOL I_WOMEDU I_HOMOLIB
Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.000
1st Qu.:0.2500 1st Qu.:0.3300 1st Qu.:0.6600 1st Qu.:0.000
Median :0.5000 Median :0.6600 Median :0.6600 Median :0.111
Mean :0.5075 Mean :0.5427 Mean :0.6649 Mean :0.316
3rd Qu.:0.7500 3rd Qu.:0.6600 3rd Qu.:1.0000 3rd Qu.:0.556
Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.000
NA's :648 NA's :2222 NA's :1250 NA's :5691
I_ABORTLIB womenparl NS
Min. :0.0000 Min. : 3.38 0:59178
1st Qu.:0.0000 1st Qu.:17.39 1:28644
Median :0.1111 Median :21.88
Mean :0.2659 Mean :23.77
3rd Qu.:0.4444 3rd Qu.:28.99
Max. :1.0000 Max. :53.08
NA's :1979 NA's :5448
Variable | Freqs (% of Valid) | Missing
---|---|---
A_YEAR [numeric] | | 0 (0.0%)
B_COUNTRY_ALPHA [character] | | 0 (0.0%)
Q_MODE [numeric] | | 0 (0.0%)
G_TOWNSIZE [numeric] | | 1274 (1.5%)
H_SETTLEMENT [numeric] | | 207 (0.2%)
H_URBRURAL [numeric] | | 32 (0.0%)
Long [numeric] | 5482 distinct values | 27098 (30.9%)
Lat [numeric] | 3911 distinct values | 27094 (30.9%)
FamImpt [numeric] | | 146 (0.2%)
FriendsImpt [numeric] | | 289 (0.3%)
LeisureImpt [numeric] | | 473 (0.5%)
ReligionImpt [numeric] | | 831 (0.9%)
Happiness [numeric] | | 574 (0.7%)
PerceivedHealth [numeric] | | 254 (0.3%)
FOC [numeric] | | 800 (0.9%)
LS [numeric] | | 393 (0.4%)
FS [numeric] | | 545 (0.6%)
Trust [numeric] | | 1198 (1.4%)
AttendReligious [numeric] | | 1034 (1.2%)
Sex [numeric] | | 62 (0.1%)
Age [numeric] | 85 distinct values | 339 (0.4%)
Immigrant [numeric] | | 344 (0.4%)
Citizen [numeric] | | 5164 (5.9%)
HHSize [numeric] | 33 distinct values | 852 (1.0%)
Parents [numeric] | | 1438 (1.6%)
Married [numeric] | | 504 (0.6%)
Kids [numeric] | 23 distinct values | 1201 (1.4%)
Edu [numeric] | | 818 (0.9%)
Job [numeric] | | 1143 (1.3%)
Income [numeric] | | 2330 (2.7%)
IncomeR [numeric] | | 2330 (2.7%)
Religion [numeric] | | 2485 (2.8%)
Race [numeric] | 373 distinct values | 9486 (10.8%)
I_WOMJOB [numeric] | 5 distinct values | 648 (0.7%)
I_WOMPOL [numeric] | 4 distinct values | 2222 (2.5%)
I_WOMEDU [numeric] | 4 distinct values | 1250 (1.4%)
I_HOMOLIB [numeric] | 10 distinct values | 5691 (6.5%)
I_ABORTLIB [numeric] | 10 distinct values | 1979 (2.3%)
womenparl [numeric] | 54 distinct values | 5448 (6.2%)
NS [factor] | | 0 (0.0%)
Generated by summarytools 1.0.1 (R version 4.2.1)
2022-10-13
The dataset has 87,822 rows, each representing one participant, and 40 columns. All variables seem to be labelled correctly.
Referring to the codebook, these are some noteworthy descriptive statistics:
Respondents tended to come from more urban settings (H_URBRURAL).
On average, family was perceived as more important than friends, leisure time and religion (FamImpt, FriendsImpt, LeisureImpt, ReligionImpt).
On average, people were “quite happy” (the second-highest option for Happiness).
Life satisfaction tended to be 7/10 (LS).
People tended to err on the side of caution when it came to trusting others (Trust).
Households had 4 people on average, with maximum household size being 63 (HHSize)!
The interquartile range for education was lower secondary to short-cycle tertiary education (Edu).
For the survey variables (FamImpt to I_ABORTLIB), missing data ranged from 0.1% (Sex) to 10.8% (Race), which is acceptable.
67.4% of the respondents came from the Global North (NS).
Let’s check if life satisfaction and happiness differ between the Global North and South.
Welch Two Sample t-test
data: Happiness by NS
t = 4.1272, df = 49878, p-value = 3.677e-05
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
0.01164249 0.03270103
sample estimates:
mean in group 0 mean in group 1
1.863945 1.841774
Welch Two Sample t-test
data: LS by NS
t = 13.283, df = 47990, p-value < 2.2e-16
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
0.1962795 0.2642322
sample estimates:
mean in group 0 mean in group 1
7.117885 6.887629
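Welch’s two-sample t-tests show significant differences in happiness and life satisfaction between the Global North and South, p < .001 for both. The Global North has the higher mean on each variable, but the two scales run in opposite directions. Life satisfaction (LS) is scored from 1 to 10, with higher values meaning greater satisfaction, so the Global North reports higher life satisfaction (7.12 vs. 6.89). Happiness is scored from 1 (“very happy”) to 4 (“not at all happy”), so the Global North’s higher mean (1.86 vs. 1.84) indicates slightly lower happiness. The happiness result therefore does not echo Alba (2019), who found greater happiness in the Global North, while the life satisfaction result adds new knowledge to the literature.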
We can also create graphs to visualize the latitude and longitude of countries in the Global North and Global South.
Warning: Removed 27098 rows containing non-finite values (stat_bin2d).
The graph above shows that the Global North (“0”) and South (“1”) are not neatly divided by physical location: some developed countries lie in the Southern Hemisphere (e.g., Australia), while some developing countries lie in the Northern Hemisphere (e.g., Ukraine).
Addai, I., Opoku-Agyeman, C., & Amanfu, S. (2013). Exploring Predictors of Subjective Well-Being in Ghana: A Micro-Level Study. Journal Of Happiness Studies, 15(4), 869-890.
Alba, C. (2019). A Data Analysis of the World Happiness Index and its Relation to the North-South Divide. Undergraduate Economic Review, 16(1).
Haerpfer, C., Inglehart, R., Moreno, A., Welzel, C., Kizilova, K., Diez-Medrano, J., Lagos, M., Norris, P., Ponarin, E., & Puranen, B. (Eds.). (2022). World Values Survey: Round Seven - Country-Pooled Datafile Version 4.0. Madrid, Spain & Vienna, Austria: JD Systems Institute & WVSA Secretariat.
Ngamaba, K. (2016). Happiness and life satisfaction in Rwanda. Journal Of Psychology In Africa, 26(5), 407-414.
World Bank Country and Lending Groups. World Bank Data Help Desk. (2022). Retrieved from https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups.
WVS Database. World Values Survey. (2022). Retrieved from https://www.worldvaluessurvey.org/WVSDocumentationWV7.jsp.
---
title: "Final Project Proposal"
author: "Saaradhaa M"
description: "Part 1"
date: "10/09/2022"
format:
  html:
    toc: true
    code-fold: true
    code-copy: true
    code-tools: true
categories:
  - finalpart1
---
```{r}
#| label: setup
#| warning: false
# load libraries.
library(tidyverse)
library(ggplot2)
library(summarytools)
knitr::opts_chunk$set(echo = TRUE)
```
## Introduction
Research in the social sciences has repeatedly stressed the need for more work on the Global South, yet few papers actually focus on it. I am therefore interested in learning more about this region. A data source well suited to this is the World Values Survey, a global survey with an easily accessible database.
I am specifically interested in understanding what drives subjective well-being, which can be interpreted via happiness and life satisfaction (Addai et al., 2013).
::: callout-tip
## Research Questions
A. What predicts happiness and life satisfaction in the Global South?
B. Do predictors of happiness and life satisfaction differ between the Global North and South?
:::
This project will be useful to better understand motivations and desires in the Global South, reduce inter-cultural tensions and enhance cross-cultural cohesion. Governments can also benefit from this research in terms of policy prioritization to maximize citizens' well-being.
## Hypothesis
Past researchers have studied happiness and life satisfaction in the Global South via the World Values Survey (Addai et al., 2013; Ngamaba, 2016). The studies focused on Ghana and Rwanda respectively. The common predictors of happiness and life satisfaction across both countries were satisfaction with **health** and **income**.
To the best of my knowledge, few studies comparing well-being in the Global North and South exist. Alba (2019) found that happiness was generally greater in the Global North than the Global South, and indicated that future research should attempt to cover the factors behind this. I think happiness and well-being in the Global North may depend on more subjective measures, given that basic health and income needs are more likely to be met there.
Given the above, we can frame our hypotheses as follows:
::: callout-tip
## H~0A~
Health and financial satisfaction [will not]{.underline} be statistically significant predictors of happiness and life satisfaction in the Global South.
:::
::: callout-tip
## H~1A~
Health and financial satisfaction [will]{.underline} be statistically significant predictors of happiness and life satisfaction in the Global South.
:::
::: callout-tip
## H~0B~
Predictors of happiness and life satisfaction [will not]{.underline} differ between the Global North and South.
:::
::: callout-tip
## H~1B~
Predictors of happiness and life satisfaction [will]{.underline} differ between the Global North and South.
:::
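One way these hypotheses might eventually be examined is with linear models: a model fitted to Global South respondents for hypothesis A, and a pooled model with interaction terms for hypothesis B. The block below is only a sketch on simulated data. The model specification, the treatment of the outcome as continuous, and the variable names (`LS`, `FS`, `PerceivedHealth`, `NS`, borrowed from the renamed columns used later in this document) are assumptions for illustration, not the final analysis plan.

```r
# Simulated data (not WVS values); names mirror the renamed columns used later.
set.seed(42)
n <- 400
toy <- data.frame(
  NS = factor(rep(c("0", "1"), each = n / 2)),
  PerceivedHealth = sample(1:5, n, replace = TRUE),
  FS = sample(1:10, n, replace = TRUE)
)
toy$LS <- 3 + 0.4 * toy$FS - 0.3 * toy$PerceivedHealth + rnorm(n)

# Hypothesis A: fit the model within the Global South only.
fit_south <- lm(LS ~ PerceivedHealth + FS, data = subset(toy, NS == "1"))

# Hypothesis B: interaction terms let each slope differ between North and South.
fit_both <- lm(LS ~ (PerceivedHealth + FS) * NS, data = toy)

summary(fit_south)$coefficients
summary(fit_both)$coefficients
```

In the pooled model, the rows `PerceivedHealth:NS1` and `FS:NS1` estimate how much each slope differs in the Global South relative to the Global North.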
# Reading In Dataset
I will be working with the most recent wave of the World Values Survey, Wave 7, which was conducted from 2017 to 2022. The data is freely available for non-profit purposes. It must be cited properly and not re-distributed (Haerpfer et al., 2022).
Representative samples of the population aged 18 and above were collected from 59 countries. Data was mostly collected by interviewing respondents at their homes ("WVS Database", 2022).
I am using the most recent version of Wave 7, released in May 2022. The final version of the dataset is due for release in October 2022. If time permits, I may switch to the updated dataset at a later date.
I will indicate my comments in each code chunk to keep track of my progress.
```{r}
#| label: read-in
# read in dataset.
wvs <- read_csv("~/Desktop/2022_Fall/DACSS 603/General/Final Project/WVS/4. Data/WVS_Cross-National_Wave_7_csv_v4_0.csv", show_col_types = FALSE) %>% select("A_YEAR", "B_COUNTRY_ALPHA", "Q_MODE", "G_TOWNSIZE", "H_SETTLEMENT", "H_URBRURAL", "O1_LONGITUDE", "O2_LATITUDE", "Q1", "Q2", "Q3", "Q6", "Q46", "Q47", "Q48", "Q49", "Q50", "Q57", "Q171", "Q260", "Q262", "Q263", "Q269", "Q270", "Q271", "Q273", "Q274", "Q275", "Q279", "Q288", "Q288R", "Q289", "Q290", "I_WOMJOB", "I_WOMPOL", "I_WOMEDU", "I_HOMOLIB", "I_ABORTLIB", "womenparl")
```
The dataset originally had 552 columns. I have selected a subset of columns based on variables used in past papers, as well as some variables I am interested in examining. These include place/area of residence, literacy, demographics, importance of various social aspects, happiness and wellbeing indicators, trust, religiosity, equality of gender/sexual orientation and abortion attitudes.
I will first create a dummy variable for Global North/South. The Global South comprises low- and lower-middle income countries, as defined by the [World Bank](https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups) ("World Bank Country and Lending Groups", 2022). Global South countries surveyed include Ethiopia, Philippines, Indonesia, Bangladesh, Iran, Kenya, Bolivia, Kyrgyzstan, Lebanon, Tajikistan, Tunisia, Ukraine, Mongolia, Morocco, Egypt, Myanmar, Vietnam, Nicaragua, Zimbabwe, Nigeria and Pakistan.
```{r}
#| label: create-dummy
# create dummy.
wvs <- mutate(wvs, NS = case_when(B_COUNTRY_ALPHA == "ETH" | B_COUNTRY_ALPHA == "PHL" | B_COUNTRY_ALPHA == "IDN" | B_COUNTRY_ALPHA == "BGD" | B_COUNTRY_ALPHA == "IRN" | B_COUNTRY_ALPHA == "KEN" | B_COUNTRY_ALPHA == "BOL" | B_COUNTRY_ALPHA == "KGZ" | B_COUNTRY_ALPHA == "LBN" | B_COUNTRY_ALPHA == "TJK" | B_COUNTRY_ALPHA == "TUN" | B_COUNTRY_ALPHA == "MOR" | B_COUNTRY_ALPHA == "UKR" | B_COUNTRY_ALPHA == "MNG" | B_COUNTRY_ALPHA == "EGY" | B_COUNTRY_ALPHA == "MMR" | B_COUNTRY_ALPHA == "VNM" | B_COUNTRY_ALPHA == "NIC" | B_COUNTRY_ALPHA == "ZWE" | B_COUNTRY_ALPHA == "NGA" | B_COUNTRY_ALPHA == "PAK" ~ "1"))
# replace NA with "0" (for Global North).
wvs$NS <- replace_na(wvs$NS, "0")
# change to factor.
wvs$NS <- as.factor(wvs$NS)
# check counts of levels.
wvs %>% select(NS) %>% summary()
# sanity check.
wvs %>% filter(B_COUNTRY_ALPHA == "ETH" | B_COUNTRY_ALPHA == "PHL" | B_COUNTRY_ALPHA == "IDN" | B_COUNTRY_ALPHA == "BGD" | B_COUNTRY_ALPHA == "IRN" | B_COUNTRY_ALPHA == "KEN" | B_COUNTRY_ALPHA == "BOL" | B_COUNTRY_ALPHA == "KGZ" | B_COUNTRY_ALPHA == "LBN" | B_COUNTRY_ALPHA == "TJK" | B_COUNTRY_ALPHA == "TUN" | B_COUNTRY_ALPHA == "MOR" | B_COUNTRY_ALPHA == "UKR" | B_COUNTRY_ALPHA == "MNG" | B_COUNTRY_ALPHA == "EGY" | B_COUNTRY_ALPHA == "MMR" | B_COUNTRY_ALPHA == "VNM" | B_COUNTRY_ALPHA == "NIC" | B_COUNTRY_ALPHA == "ZWE" | B_COUNTRY_ALPHA == "NGA" | B_COUNTRY_ALPHA == "PAK") %>% nrow()
# rename columns.
names(wvs) <- c("A_YEAR", "B_COUNTRY_ALPHA", "Q_MODE", "G_TOWNSIZE", "H_SETTLEMENT", "H_URBRURAL", "Long", "Lat", "FamImpt", "FriendsImpt", "LeisureImpt", "ReligionImpt", "Happiness", "PerceivedHealth", "FOC", "LS", "FS", "Trust", "AttendReligious", "Sex", "Age", "Immigrant", "Citizen", "HHSize", "Parents", "Married", "Kids", "Edu", "Job", "Income", "IncomeR", "Religion", "Race", "I_WOMJOB", "I_WOMPOL", "I_WOMEDU", "I_HOMOLIB", "I_ABORTLIB", "womenparl", "NS")
```
The sanity check returns 28,644 rows, matching the count for level "1" of the dummy, so 28,644 respondents are coded as Global South.
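The long chain of `==` comparisons can be written more compactly with `%in%`. The sketch below runs on a small made-up tibble (`toy`, hypothetical) rather than the WVS file, and it also lists any codes in the lookup vector that match no rows. That second step may be worth running on the real data: the ISO 3166-1 alpha-3 code for Morocco is `MAR`, so `"MOR"` may not match any respondents.

```r
library(dplyr)

# Country codes treated as Global South (same list as in the chunk above).
south <- c("ETH", "PHL", "IDN", "BGD", "IRN", "KEN", "BOL", "KGZ", "LBN", "TJK", "TUN",
           "MOR", "UKR", "MNG", "EGY", "MMR", "VNM", "NIC", "ZWE", "NGA", "PAK")

# Toy data standing in for wvs (hypothetical rows, not real survey data).
toy <- tibble(B_COUNTRY_ALPHA = c("ETH", "CYP", "MAR", "PAK", "CYP"))

# 1 = Global South, 0 = Global North.
toy <- toy %>%
  mutate(NS = factor(if_else(B_COUNTRY_ALPHA %in% south, "1", "0"), levels = c("0", "1")))

# Codes in the lookup vector that never occur in the data.
unmatched <- setdiff(south, unique(toy$B_COUNTRY_ALPHA))
unmatched
```

On the full dataset, an empty `unmatched` vector would suggest that every listed code was found, which is a stronger check than re-running the same filter.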
# Exploratory Analysis of Data
```{r}
# check rows, columns and variable types.
str(wvs)
# check basic descriptive statistics.
summary(wvs)
print(dfSummary(wvs, varnumbers = FALSE, plain.ascii = FALSE, graph.magnif = 0.30, style = "grid", valid.col = FALSE),
method = 'render', table.classes = 'table-condensed')
```
The dataset has 87,822 rows, each representing one participant, and 40 columns. All variables seem to be labelled correctly.
Referring to the codebook, these are some noteworthy descriptive statistics:
- Respondents tended to come from more urban settings (H_URBRURAL).
- On average, family was perceived as more important than friends, leisure time and religion (FamImpt, FriendsImpt, LeisureImpt, ReligionImpt).
- On average, people were "quite happy" (the second-highest option for Happiness).
- Life satisfaction tended to be 7/10 (LS).
- People tended to err on the side of caution when it came to trusting others (Trust).
- Households had 4 people on average, with maximum household size being 63 (HHSize)!
- The interquartile range for education was lower secondary to short-cycle tertiary education (Edu).
- For the survey variables (FamImpt to I_ABORTLIB), missing data ranged from 0.1% (Sex) to 10.8% (Race), which is acceptable.
- 67.4% of the respondents came from the Global North (NS).
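The missing-data percentages in the list above can also be computed directly rather than read off the summary table. This is a minimal sketch on a made-up data frame (`toy`, hypothetical values), using only base R.

```r
# Toy data frame (hypothetical values) with some missing entries.
toy <- data.frame(
  Happiness = c(1, 2, NA, 2),
  LS        = c(8, NA, NA, 7),
  Sex       = c(1, 2, 2, 1)
)

# Share of missing values per column, in percent.
pct_missing <- round(colMeans(is.na(toy)) * 100, 1)
pct_missing

# Smallest and largest share of missing values across the columns.
range(pct_missing)
```

On the real data, the same two lines could be applied to `wvs %>% select(FamImpt:I_ABORTLIB)` to obtain the range for the survey variables.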
Let's check if life satisfaction and happiness differ between the Global North and South.
```{r}
# Welch two-sample t-tests comparing Global North (0) and Global South (1).
t.test(Happiness ~ NS, data = wvs)
t.test(LS ~ NS, data = wvs)
```
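Welch's two-sample t-tests show significant differences in happiness and life satisfaction between the Global North and South, *p* \< .001 for both. The Global North has the higher mean on each variable, but the two scales run in opposite directions. Life satisfaction (LS) is scored from 1 to 10, with higher values meaning greater satisfaction, so the Global North reports higher life satisfaction. Happiness is scored from 1 ("very happy") to 4 ("not at all happy"), so the Global North's higher mean indicates slightly lower happiness. The happiness result therefore does not echo Alba (2019), who found greater happiness in the Global North, while the life satisfaction result adds new knowledge to the literature.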
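With samples this large, even very small mean differences reach statistical significance, so a standardized effect size can help judge how much a difference matters. The sketch below computes Cohen's d with a pooled standard deviation in base R. The helper `cohens_d` is hypothetical (written for this illustration, not taken from a package), and the data are simulated rather than WVS values.

```r
# Hypothetical helper: Cohen's d using the pooled standard deviation.
# x is a numeric vector, g a two-level factor of the same length.
cohens_d <- function(x, g) {
  x1 <- x[which(g == levels(g)[1] & !is.na(x))]
  x2 <- x[which(g == levels(g)[2] & !is.na(x))]
  n1 <- length(x1)
  n2 <- length(x2)
  pooled_sd <- sqrt(((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2))
  (mean(x1) - mean(x2)) / pooled_sd
}

# Simulated data (not WVS values): two groups whose true means differ by 0.3 SD.
set.seed(1)
toy <- data.frame(
  NS = factor(rep(c("0", "1"), each = 500)),
  LS = c(rnorm(500, mean = 7.1, sd = 2), rnorm(500, mean = 6.5, sd = 2))
)

d <- cohens_d(toy$LS, toy$NS)
d
```

Applied to the real data, the call would be `cohens_d(wvs$LS, wvs$NS)`. By common rules of thumb, values near 0.2 are usually read as small effects.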
We can also create graphs to visualize the latitude and longitude of countries in the Global North and Global South.
```{r}
ggplot(wvs) + geom_bin2d(mapping = aes(x = Long, y = Lat)) + facet_wrap(vars(NS))
```
The graph above shows that the Global North ("0") and South ("1") are not neatly divided by physical location: some developed countries lie in the Southern Hemisphere (e.g., Australia), while some developing countries lie in the Northern Hemisphere (e.g., Ukraine).
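The plot could be made easier to read by labelling the facets and by dropping unusable coordinates before plotting. The descriptive statistics list a maximum `Lat` of 100.35, which falls outside the valid range of -90 to 90, so a range check may be worthwhile. The sketch below uses a made-up tibble (`toy`, hypothetical coordinates), and the facet labels are my own wording.

```r
library(dplyr)
library(ggplot2)

# Toy coordinates (hypothetical), including a missing value and an impossible latitude.
toy <- tibble(
  Long = c(34.8, 100.3, NA, -70.6, 30.5),
  Lat  = c(32.4, 13.7, 10.0, -33.4, 100.4),
  NS   = factor(c("0", "1", "1", "0", "1"))
)

# Keep only rows with usable coordinates: non-missing and within valid ranges.
toy_valid <- toy %>%
  filter(!is.na(Long), !is.na(Lat), between(Lat, -90, 90), between(Long, -180, 180))

# Facet labels that spell out the meaning of the dummy.
p <- ggplot(toy_valid, aes(x = Long, y = Lat)) +
  geom_bin2d() +
  facet_wrap(vars(NS), labeller = as_labeller(c("0" = "Global North", "1" = "Global South"))) +
  labs(x = "Longitude", y = "Latitude")
p
```

Filtering explicitly also avoids the warning about removed rows, since missing coordinates never reach `geom_bin2d()`.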
# Bibliography
Addai, I., Opoku-Agyeman, C., & Amanfu, S. (2013). Exploring Predictors of Subjective Well-Being in Ghana: A Micro-Level Study. *Journal Of Happiness Studies*, *15*(4), 869-890.
Alba, C. (2019). A Data Analysis of the World Happiness Index and its Relation to the North-South Divide. *Undergraduate Economic Review*, *16*(1).
Haerpfer, C., Inglehart, R., Moreno, A., Welzel, C., Kizilova, K., Diez-Medrano, J., Lagos, M., Norris, P., Ponarin, E., & Puranen, B. (Eds.). (2022). World Values Survey: Round Seven - Country-Pooled Datafile Version 4.0. Madrid, Spain & Vienna, Austria: JD Systems Institute & WVSA Secretariat.
Ngamaba, K. (2016). Happiness and life satisfaction in Rwanda. *Journal Of Psychology In Africa*, *26*(5), 407-414.
*World Bank Country and Lending Groups*. World Bank Data Help Desk. (2022). Retrieved from https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups.
*WVS Database*. World Values Survey. (2022). Retrieved from https://www.worldvaluessurvey.org/WVSDocumentationWV7.jsp.