This is my final project for DACSS 601.
The dataset I have chosen for the final project is the Social Progress Index report containing data from 2011 to 2021. The mission of the Social Progress Index is to measure whether people have what they need to adequately support their well-being and flourish in society. It looks at whether people have their basic needs met, are well-nourished, feel safe, are discriminated against, etc. There are a lot of variables within this dataset, all part of three overarching categories: Basic Human Needs, Foundations of Wellbeing, and Opportunity. Each category’s score is the average of its components, and the overall Social Progress score for each country is the average of the three category scores. I think it’s important to look at what each variable means to get a sense of what calculations and factors go into the dataset.
Nutrition and Basic Medical Care (numeric): Average of its components.
Undernourishment (numeric): The probability that a random individual from the population consumes an amount of calories that is insufficient for a healthy life.
Maternal mortality rate (numeric): Maternal deaths per 100,000 live births in women aged 10-54.
Child mortality rate (numeric): Probability of dying between birth and 5 years old, per 1,000 live births.
Child stunting (numeric): Prevalence of stunting in children under 5, as measured by the summary exposure value for child stunting, on a scale from 0% to 100%.
Deaths from infectious diseases (numeric): Age-standardized mortality rate from deaths from infectious diseases per 100,000 people.
Water and Sanitation (numeric): Average of its components.
Unsafe water, sanitation, and hygiene attributable deaths (numeric): Age-standardized death rate attributable to these factors per 100,000 people.
Access to improved water source (numeric): Proportion of population with access to improved water sources.
Access to improved sanitation (numeric): Proportion of population with access to improved toilet types.
Shelter (numeric): Average of its components.
Access to electricity (numeric): Percentage of population with access to electricity.
Household air pollution attributable deaths (numeric): Age-standardized deaths from household air pollution from solid fuels per 100,000.
Usage of clean fuels and technology for cooking (numeric): The proportion of the population that primarily uses clean cooking fuels and technologies for cooking.
Dissatisfaction with housing affordability (numeric): Percentage of respondents that answered “no” to the question, “In the city or area where you live, are you satisfied or dissatisfied with the availability of good, affordable housing?”.
Personal safety (numeric): Average of its components.
Deaths from interpersonal violence (numeric): Age-standardized deaths rate (per 100,000 people) from interpersonal violence, defined as death or disability from the intentional use of physical force or power, threatened or actual, from another civilian person or group.
Political killings and torture (ordinal): Physical violence index scaled from 0 to 1 that is based on indicators that reflect violence committed by government agents and that are not directly referring to elections.
Transportation-related fatalities (numeric): Age-standardized rate of deaths per 100,000 people due to injuries related to transportation.
Perceived criminality (ordinal): Measured on a scale of 1 (majority of other citizens can be trusted; very low levels of domestic insecurity) to 5 (very high level of distrust; people are extremely cautious in their dealings with others; large number of gated communities, high prevalence of security guards).
Access to Basic Knowledge (numeric): Average of its components.
Women with no schooling (numeric): Proportion of women (age-standardized) with no schooling.
Primary school enrollment (numeric): Percentage of the total population of official primary school age that is actually enrolled in any level of education.
Secondary school attainment (numeric): Percent of the population ages 25 and older with at least some secondary education.
Gender parity in secondary attainment (numeric): The absolute deviation from parity (=1) in secondary education attainment of women and men.
Equal access to quality education (ordinal): Country experts’ aggregated evaluation of the question, “To what extent is high quality basic education guaranteed to all, sufficient to enable them to exercise their basic rights as adult citizens?” measured on a scale of 0 (Unequal) to 4 (Equal).
Access to Information and Communications (numeric): Average of its components.
Mobile telephone subscriptions (numeric): The number of mobile telephone subscriptions per 100 inhabitants.
Internet users (numeric): The estimated number of Internet users out of the total population.
Access to online governance (numeric): The availability of e-participation tools on national government portals.
Media censorship (ordinal): Country experts’ aggregated evaluation of the question, “Does the government directly or indirectly attempt to censor the print or broadcast media?” measured on a scale of 0 (direct and routine attempts) to 4 (attempts are rare).
Health and Wellness (numeric): Average of its components.
Life expectancy at 60 (numeric): The average number of years that a person 60 to 64 years old could expect to live.
Premature deaths from non-communicable diseases (numeric): Mortality rate among people aged 30-70 from non-communicable diseases.
Access to essential services (numeric): The universal health coverage (UHC) index measures the coverage of 9 tracer interventions and risk-standardized death rates from 32 causes amenable to personal healthcare.
Equal access to quality healthcare (ordinal): Country experts’ aggregated evaluation of the question, “To what extent is high quality basic healthcare guaranteed to all, sufficient to enable them to exercise their basic political rights as adult citizens?” measured on a scale of 0 (Extreme) to 4 (Equal).
Environmental Quality (numeric): Average of all its components.
Outdoor air pollution attributable deaths (numeric): The number of deaths resulting from ambient particulate matter pollution per 100,000 people, age-adjusted.
Deaths from lead exposure (numeric): Age-standardized death rate from lead exposure (per 100,000 people).
Particulate matter pollution (numeric): Population-weighted mean levels of annual exposure to suspended particles.
Species protection (ordinal): An index of how well a country’s terrestrial protected areas overlap with the ranges of its vertebrate, invertebrate, and plant species. A score of 100 indicates full coverage of all species’ ranges by a country’s protected areas, and a score of 0 indicates no overlap.
Personal Rights (numeric): Average of all its components.
Political rights (ordinal): An evaluation of three subcategories of political rights: electoral process, political pluralism and participation, and functioning of government on a scale from 0 (no political rights) to 40 (full political rights).
Freedom of expression (ordinal): Country experts’ aggregated evaluation of the question, “To what extent does government respect press & media freedom, the freedom of ordinary people to discuss political matters at home and in the public sphere, as well as the freedom of academic and cultural expression?” on a scale of 0 (no freedom) to 4 (full freedom).
Freedom of religion (ordinal): Country experts’ aggregated evaluation of the question, “Is there freedom of religion?” measured on a scale of 0 (hardly any) to 4 (full freedom).
Access to justice (ordinal): Country experts’ aggregated evaluation of the question, “Do citizens enjoy secure and effective access to justice?” converted to a scale of 0 (access is nonexistent) to 1 (access is almost always observed).
Property rights for women (ordinal): Country experts’ aggregated evaluation of the question, “Do women enjoy the right to private property?” measured on a scale of 0 (not at all) to 5 (yes).
Personal Freedom and Choice (numeric): Average of all its components.
Vulnerable employment (numeric): Contributing family workers and own-account workers as a percentage of total employment.
Early marriage (numeric): The percentage of women aged 15-19 years who are married or in-union.
Satisfied demand for contraception (numeric): The percentage of total demand for family planning among married or in-union women aged 15 to 49.
Perception of corruption (ordinal): The perceived level of public sector corruption based on expert opinion, measured on a scale from 0 (highly corrupt) to 100 (very clean).
Young people not in education, employment, or training (numeric): The proportion of youth (15-24) who are not in employment and not in education or training.
Inclusiveness (numeric): Average of all its components.
Acceptance of gays and lesbians (numeric): The percentage of respondents answering yes to the question, “Is the city or area where you live a good place or not a good place to live for gay or lesbian people?”
Discrimination and violence against minorities (ordinal): Discrimination, powerlessness, ethnic violence, communal violence, sectarian violence, and religious violence, measured on a scale from 0 (low pressures) to 10 (very high pressures).
Equality of political power by gender (ordinal): Country experts’ aggregated evaluation of the question, “Is political power distributed according to gender?” measured on a scale of 0 (men have monopoly) to 4 (roughly equal).
Equality of political power by socioeconomic position (ordinal): Country experts’ aggregated evaluation of the question, “Is political power distributed according to socioeconomic position?” measured on a scale of 0 (wealthy monopoly) to 4 (roughly equal).
Equality of political power by social group (ordinal): Country experts’ aggregated evaluation of the question, “Is political power distributed according to social groups (defined by caste, ethnicity, language, race, religion or some combination thereof)?” measured on a scale of 0 (monopolized by a social group that’s a minority of the pop.) to 4 (roughly equal).
Access to Advanced Education (numeric): Average of its components.
Quality weighted universities (numeric): The number of universities in a country weighted by the quality of universities, measured by university rankings.
Expected years of tertiary schooling (numeric): Number of years a person of tertiary school entrance age can expect to spend within tertiary education.
Women with advanced education (numeric): Proportion of females (age-standardized) with 12–18 years of education.
Citable documents (numeric): Citable documents - articles, reviews, and conference papers - per 1,000 population.
Academic freedom (ordinal): Aggregated evaluation of the question, “To what extent is academic freedom respected?”, measured on a scale of 0 to 1.
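library(readxl)   # read_excel()
library(Rmisc)    # summarySE(); attached before dplyr so plyr does not mask dplyr verbs
library(dplyr)    # %>% and filter()
library(ggplot2)  # plotting
SPI <- read_excel("/Users/karenkimble/Documents/R Practice/Social Progress Index.xlsx", sheet = "2011-2021 data")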
SPI$...10 <- NULL
SPI$...23 <- NULL
colnames(SPI) <- c("Rank",
"Country",
"Code",
"Year",
"Status",
"SPI",
"Needs",
"Wellbeing",
"Opportunity",
"Nutrition/care",
"Sanitation",
"Shelter",
"Safety",
"Access-knowledge",
"Info-comm",
"Health",
"Environment",
"Rights",
"Choice",
"Inclusiveness",
"Advanced-ed",
"Infectious",
"Child mortality",
"Stunting",
"Maternal-mortality",
"Undernourishment",
"Improved-sanitation",
"Improved-water",
"Hygiene-deaths",
"Pollution-deaths",
"Housing",
"Electricity",
"Clean-fuels",
"Personal-violence-deaths",
"Transport",
"Criminality",
"Political-killings",
"Women-no-education",
"Education-access",
"Primary-enrollment",
"Secondary-attainment",
"Gender-gap-secondary",
"Online-governance",
"Internet-users",
"Media",
"Cellphone",
"Life-expectancy",
"Premature-deaths",
"Healthcare",
"Essential-services",
"Pollution",
"Lead",
"Particulate",
"Species",
"Justice",
"Expression",
"Religion",
"Political-rights",
"Property",
"Contraception",
"Corruption",
"Early-marriage",
"Youth-nonemployed",
"Vulnerable",
"Equal-gender",
"Equal-social",
"Equal-socioeconomic",
"Discrimination-violence",
"LGBT",
"Citable-docs",
"Academic",
"Women-advanced",
"Tertiary",
"Quality-unis")
head(SPI)
# A tibble: 6 × 74
Rank Country Code Year Status SPI Needs Wellbeing Opportunity
<dbl> <chr> <chr> <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
1 NA World WWW 2021 <NA> 65.1 74.2 64.4 56.5
2 NA World WWW 2020 <NA> 64.7 73.8 64.5 55.8
3 NA World WWW 2019 <NA> 64.7 73.3 64.5 56.1
4 NA World WWW 2018 <NA> 64.0 73.0 63.2 55.8
5 NA World WWW 2017 <NA> 63.7 72.6 62.4 55.9
6 NA World WWW 2016 <NA> 63.1 72.1 61.5 55.8
# … with 65 more variables: `Nutrition/care` <dbl>, Sanitation <dbl>,
# Shelter <dbl>, Safety <dbl>, `Access-knowledge` <dbl>,
# `Info-comm` <dbl>, Health <dbl>, Environment <dbl>, Rights <dbl>,
# Choice <dbl>, Inclusiveness <dbl>, `Advanced-ed` <dbl>,
# Infectious <dbl>, `Child mortality` <dbl>, Stunting <dbl>,
# `Maternal-mortality` <dbl>, Undernourishment <dbl>,
# `Improved-sanitation` <dbl>, `Improved-water` <dbl>, …
As you can see, there are a large number of variables with different indicators for society. For the purposes of my final paper, I will primarily be focusing on the main indicators of each section: Nutrition and Basic Medical Care, Water and Sanitation, Shelter, Personal Safety, Access to Knowledge, Access to Info/Communications, Health and Wellness, Environmental Quality, Personal Rights, Personal Freedom/Choice, Inclusiveness, and Access to Advanced Education.
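The averaging rule described in the introduction can be checked against the World rows printed above. The sketch below is only an illustration and not part of the original analysis: the values are copied from the rounded `head(SPI)` output, and the helper name `check_spi_average` is an assumption made for this example.

```r
# World rows copied from the rounded head(SPI) output above.
world <- data.frame(
  Year        = c(2021, 2020, 2019),
  SPI         = c(65.1, 64.7, 64.7),
  Needs       = c(74.2, 73.8, 73.3),
  Wellbeing   = c(64.4, 64.5, 64.5),
  Opportunity = c(56.5, 55.8, 56.1)
)

# Hypothetical helper: recompute the overall score as the plain average of the
# three category scores and report the gap to the published SPI value.
check_spi_average <- function(df) {
  recomputed <- rowMeans(df[, c("Needs", "Wellbeing", "Opportunity")])
  data.frame(Year       = df$Year,
             SPI        = df$SPI,
             recomputed = round(recomputed, 2),
             gap        = round(abs(recomputed - df$SPI), 2))
}

avg_check <- check_spi_average(world)
print(avg_check)
```

Small gaps are expected here because the printed values are rounded to one decimal place.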
Because of the sheer number of variables within this dataset, I will only be focusing on one of the SPI’s three major categories: Foundations of Wellbeing. The other two categories, Basic Needs and Opportunity, are still important and should be analyzed. However, I am primarily interested in the Foundations of Wellbeing category, which includes indicators related to access to knowledge and infrastructure as well as health, because it may be interesting to see if countries generally viewed as more “free” and democratic (such as the United States or some European Union countries) do well in those categories. There are still a lot of variables condensed into the Foundations of Wellbeing category, so I will analyze the main variables that are computed from their sub-categories. Those variables are Access to Basic Knowledge, Access to Information and Communications, Health and Wellness, and Environmental Quality.
Access to Basic Knowledge, as shown above, is made up of many variables related to the quality of education, educational attainment, and equal access to education. Access to Information and Communications covers mobile telephone subscriptions, Internet users, online governance, and media censorship. The Health and Wellness category consists of life expectancy, death rate, and access to healthcare or other services. Lastly, Environmental Quality is based on pollution levels, species protection, and lead exposure deaths.
Have the average worldwide scores for the Foundations of Wellbeing categories improved over time? What categories have improved the most or the least? What about overall Wellbeing?
How do the largest countries from each continent compare when it comes to Wellbeing?
Do countries that have higher Foundations of Wellbeing scores have higher scores in the other major categories? How do those scores relate to rank?
I want to look at the difference between average scores for the categories in 2011 and 2021 to see if there are any changes over that period.
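The summaries that follow use two objects, `SPI_2011` and `SPI_2021`, whose creation is not shown in the displayed code. A plausible reconstruction, assuming they are simply the rows of `SPI` for each year (the source does not confirm this), is sketched here on a made-up stand-in called `SPI_toy`:

```r
# Toy stand-in for the SPI data frame; the scores are made up for illustration.
SPI_toy <- data.frame(
  Country   = rep(c("A", "B", "C"), each = 2),
  Year      = rep(c(2011, 2021), times = 3),
  Wellbeing = c(40.5, 48.0, 61.2, 66.9, NA, 80.3)
)

# Assumed definition: one data frame per year.
SPI_2011 <- SPI_toy[SPI_toy$Year == 2011, ]
SPI_2021 <- SPI_toy[SPI_toy$Year == 2021, ]

summary(SPI_2011$Wellbeing)            # missing values are reported as NA's
sd(SPI_2011$Wellbeing, na.rm = TRUE)   # spread among the non-missing countries
```

With the real data, the same two subsetting lines applied to `SPI` would presumably yield the objects used below; `dplyr::filter(SPI, Year == 2011)` is an equivalent form.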
#2011
summary(SPI_2011$Wellbeing)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
31.25 46.41 59.12 59.78 71.14 89.06 34
sd(SPI_2011$Wellbeing, na.rm=TRUE)
[1] 16.17586
#2021
summary(SPI_2021$Wellbeing)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
34.17 55.56 67.45 67.83 78.70 93.80 33
sd(SPI_2021$Wellbeing, na.rm=TRUE)
[1] 15.31456
In both 2011 and 2021, the median and mean Wellbeing scores are nearly the same, showing that the data is not skewed very much. There was an overall improvement in Wellbeing, but not every country improved by the same amount: the standard deviation fell from 16.2 to 15.3, so scores were slightly less spread out in 2021. The minimum score also did not increase as much as the median and mean, improving only about 3 points compared to the median’s improvement of about 8. The maximum score rose by a little under 5 points.
#2011
summary(SPI_2011$'Access-knowledge')
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
15.27 54.26 75.25 70.97 89.86 98.93 34
sd(SPI_2011$'Access-knowledge', na.rm=TRUE)
[1] 22.02836
#2021
summary(SPI_2021$'Access-knowledge')
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
23.14 61.66 79.32 74.93 91.14 99.51 33
sd(SPI_2021$'Access-knowledge', na.rm=TRUE)
[1] 19.34884
The summary statistics show that Access to Basic Knowledge was much more varied in 2011 than in 2021 because the standard deviation was higher in 2011. Additionally, the minimum score increased by 8 between 2011 and 2021, meaning that countries that have worse Access to Knowledge scores still had an increase since 2011.
#2011
summary(SPI_2011$'Info-comm')
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
0.28 24.45 43.27 44.09 60.05 90.30 33
sd(SPI_2011$'Info-comm', na.rm=TRUE)
[1] 22.40024
#2021
summary(SPI_2021$'Info-comm')
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
5.67 51.07 70.28 66.49 82.52 98.18 32
sd(SPI_2021$'Info-comm', na.rm=TRUE)
[1] 20.74732
In the Access to Information and Communications category, scores also improved overall worldwide between 2011 and 2021. The median score increased dramatically, from 43 in 2011 to 70 in 2021, and the mean rose from 44 to 66. However, the minimum score did not increase by the same amount (from 0.28 in 2011 to 5.67 in 2021), showing that some countries were lagging behind. The 2021 minimum might be an outlier, since the first quartile is 51. The standard deviation decreased slightly, showing that countries’ scores were somewhat closer together in 2021 than in 2011.
#2011
summary(SPI_2011$Health)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
15.19 45.06 59.26 58.81 70.03 89.20 32
sd(SPI_2011$Health, na.rm=TRUE)
[1] 16.7191
#2021
summary(SPI_2021$Health)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
21.03 49.54 62.61 62.28 73.29 92.10 31
sd(SPI_2021$Health, na.rm=TRUE)
[1] 16.01574
For the Health and Wellness score, the median increased from 59 in 2011 to 63 in 2021. The mean increased by about the same amount. The minimum and maximum scores also increased, showing that countries improved worldwide. The standard deviation stayed about the same (16.7 versus 16.0), showing that the spread of countries’ scores changed very little.
#2011
summary(SPI_2011$Environment)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
22.97 56.66 66.74 65.14 73.70 93.42 30
sd(SPI_2011$Environment, na.rm=TRUE)
[1] 13.6848
#2021
summary(SPI_2021$Environment)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
23.95 58.66 67.83 67.40 77.78 95.15 29
sd(SPI_2021$Environment, na.rm=TRUE)
[1] 14.16101
In the Environmental Quality category, countries overall improved the least. The median stayed about the same, only improving about a point. The mean rose by about 2 points (from 65.1 to 67.4) and was close to the median in both years. The minimum and maximum also barely changed, each rising by less than 2 points. Out of all the categories, the environment saw the least improvement.
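To compare the categories side by side, the means reported in the `summary()` output above can be collected into one small table. This is a convenience sketch rather than part of the original analysis; the data frame name `wellbeing_means` is an assumption, while the numbers are copied from the output shown earlier.

```r
# Means copied from the summary() output above.
wellbeing_means <- data.frame(
  category  = c("Wellbeing (overall)", "Access to Basic Knowledge",
                "Information and Communications", "Health and Wellness",
                "Environmental Quality"),
  mean_2011 = c(59.78, 70.97, 44.09, 58.81, 65.14),
  mean_2021 = c(67.83, 74.93, 66.49, 62.28, 67.40)
)

# Change in the mean score between the two years.
wellbeing_means$change <- wellbeing_means$mean_2021 - wellbeing_means$mean_2011

# Sort so the largest gain comes first.
wellbeing_means <- wellbeing_means[order(-wellbeing_means$change), ]
print(wellbeing_means, row.names = FALSE)
```

Sorted this way, Information and Communications sits at the top and Environmental Quality at the bottom, which matches the reading of the individual summaries.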
By looking at how the world is doing as a whole from 2011 to 2021, we can get an idea of what the overall improvement has been like and compare that to individual countries’ progress. The graphs below show the yearly average for each category with standard error bars.
avgAK <- summarySE(SPI, measurevar="Access-knowledge", groupvars=c("Year"),
na.rm=TRUE)
avgAK %>%
ggplot(aes(x=Year, y=`Access-knowledge`)) +
geom_errorbar(aes(ymin=`Access-knowledge`-se, ymax=`Access-knowledge`+se),
width=.1, color="blue") +
geom_line(color="dark blue") +
geom_point(color="dark blue") +
labs(y="Avg Access to Knowledge")
avgIC <- summarySE(SPI, measurevar="Info-comm", groupvars=c("Year"), na.rm=TRUE)
avgIC %>%
ggplot(aes(x=Year, y=`Info-comm`)) +
geom_errorbar(aes(ymin=`Info-comm`-se, ymax=`Info-comm`+se), width=.1,
color="red") +
geom_line(color="dark red") +
geom_point(color="dark red") +
labs(y="Avg Info and Communications")
avgHW <- summarySE(SPI, measurevar="Health", groupvars=c("Year"), na.rm=TRUE)
avgHW %>%
ggplot(aes(x=Year, y=`Health`)) +
geom_errorbar(aes(ymin=`Health`-se, ymax=`Health`+se), width=.1,
color="#b4a7d6") +
geom_line(color="#4d1c7c") +
geom_point(color="#4d1c7c") +
labs(y="Avg Health and Wellness")
avgEQ <- summarySE(SPI, measurevar="Environment", groupvars=c("Year"),
na.rm=TRUE)
avgEQ %>%
ggplot(aes(x=Year, y=`Environment`)) +
geom_errorbar(aes(ymin=`Environment`-se, ymax=`Environment`+se), width=.1,
color="#93c47d") +
geom_line(color="#274e13") +
geom_point(color="#274e13") +
labs(y="Avg Environmental Quality")
All of these plots show that there has been improvement across all categories, but not all of them have been consistent and they have all been exponential. Something left out is how each country has improved over the years. I also could have chosen a different metric, such as a median, which can give a different type of insight since means may be skewed due to outliers. Additionally, there aren’t a lot of years included in the dataset compared to the length of human history, so some more historical data could be valuable.
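The error bars above come from `summarySE()` (from the Rmisc package, as far as I can tell), which, to my understanding, reports the standard error as the standard deviation divided by the square root of the number of non-missing values. The base R sketch below reproduces that calculation on made-up data and adds the median mentioned as an alternative; the helper `year_summary` and the toy scores are assumptions for illustration only.

```r
# Toy stand-in for SPI: made-up scores for three countries in two years.
toy <- data.frame(
  Year   = rep(c(2011, 2021), each = 3),
  Health = c(50, 60, NA, 55, 65, 90)
)

# Hypothetical helper: per-group N, mean, median, sd and standard error.
# Missing values are dropped before anything is counted.
year_summary <- function(df, measurevar, groupvar = "Year") {
  groups <- split(df[[measurevar]], df[[groupvar]])
  rows <- lapply(names(groups), function(g) {
    x <- groups[[g]][!is.na(groups[[g]])]
    data.frame(group  = as.numeric(g),
               N      = length(x),
               mean   = mean(x),
               median = median(x),
               sd     = sd(x),
               se     = sd(x) / sqrt(length(x)))
  })
  out <- do.call(rbind, rows)
  names(out)[1] <- groupvar
  out
}

health_by_year <- year_summary(toy, "Health")
print(health_by_year)
```

In the toy 2021 group, the single high score pulls the mean (70) above the median (65), which is the kind of difference a median-based plot would reveal.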
Since there are a great many countries in the dataset and I don’t want there to be an overcrowded graph, I will select a few countries to look at. I’ll base my selection on the largest countries by population in their respective continent so there is some similarity between them: China, Russia, the United States, Brazil, Nigeria, and Australia.
SPI_Large <- SPI %>%
filter(`Country` %in% c("China", "Russia", "Brazil", "Nigeria", "Australia",
"United States"))
head(SPI_Large)
# A tibble: 6 × 74
Rank Country Code Year Status SPI Needs Wellbeing Opportunity
<dbl> <chr> <chr> <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
1 11 Australia AUS 2021 Ranked 90.3 95.1 90.4 85.3
2 10 Australia AUS 2020 Ranked 90.1 95.1 90.5 84.8
3 12 Australia AUS 2019 Ranked 90 95 90.2 84.8
4 11 Australia AUS 2018 Ranked 89.9 94.6 90.6 84.5
5 10 Australia AUS 2017 Ranked 90.0 95.1 90.2 84.6
6 10 Australia AUS 2016 Ranked 89.8 95.2 89.9 84.5
# … with 65 more variables: `Nutrition/care` <dbl>, Sanitation <dbl>,
# Shelter <dbl>, Safety <dbl>, `Access-knowledge` <dbl>,
# `Info-comm` <dbl>, Health <dbl>, Environment <dbl>, Rights <dbl>,
# Choice <dbl>, Inclusiveness <dbl>, `Advanced-ed` <dbl>,
# Infectious <dbl>, `Child mortality` <dbl>, Stunting <dbl>,
# `Maternal-mortality` <dbl>, Undernourishment <dbl>,
# `Improved-sanitation` <dbl>, `Improved-water` <dbl>, …
By looking at overall rankings over time, there can be a good general idea of how these countries have done in comparison to the others in all indicators, not just a few.
(It is important to note that a lower rank number means the country is doing better than the others, and a higher number means it is doing worse.)
ggplot(data = SPI_Large, mapping=aes(x = `Year`, y = `Rank`, color = `Country`)) +
geom_line() +
facet_wrap(facets = vars(`Country`))
From the above, we can see that Nigeria has consistently ranked very poorly with very little improvement. Brazil had a slightly better-than-middle ranking, but then was suddenly ranked worse in 2017 and has continued to trend downward every year since. China and Russia, on the other hand, seem pretty stagnant with consistent rankings throughout the years, with China doing worse than Russia. Australia has the best consistent rankings out of all the countries, while the US was a close second but has been ranked progressively worse from about 2015 onward. I think it’s interesting to look at these comparisons when thinking about overall rankings because it makes me wonder what is dragging down or boosting up scores for each country. Questions left unanswered are how other countries in the same continents rank, what caused these rankings to drop, and which categories some countries do better in than others. A general view is helpful but does not tell everything.
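Because a lower rank number is better, a line that moves up on the plot actually means the country is doing worse. One possible variation, which is not what the original plot does, is to flip the y-axis with `scale_y_reverse()` so that better ranks appear higher. The sketch below uses made-up ranks for two hypothetical countries.

```r
library(ggplot2)

# Toy ranks for two hypothetical countries; not taken from the SPI data.
toy_ranks <- data.frame(
  Country = rep(c("Country A", "Country B"), each = 3),
  Year    = rep(c(2011, 2016, 2021), times = 2),
  Rank    = c(10, 11, 12, 120, 118, 115)
)

rank_plot <- ggplot(toy_ranks, aes(x = Year, y = Rank, color = Country)) +
  geom_line() +
  scale_y_reverse() +                      # rank 1 is drawn toward the top
  facet_wrap(facets = vars(Country)) +
  labs(y = "Rank (1 = best)")

# print(rank_plot)  # uncomment to draw the figure
```

With the real data, the same `scale_y_reverse()` layer could be added to the `SPI_Large` plot without changing anything else.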
ggplot(data = SPI_Large, mapping=aes(x = `Year`, y = `Wellbeing`, color = `Country`)) +
geom_line() +
facet_wrap(facets = vars(`Country`))
The country that has consistently done the best in Wellbeing is Australia, with scores near 90 for the entire duration. Nigeria, on the other hand, has done worse than the other countries but has improved since the beginning of the SPI data. Russia and Brazil have stayed towards the middle of the scores, though Russia stagnated towards the end and Brazil had a slight decrease. China has improved and rose from the bottom scores to the middle. The United States has also had slight improvement while staying towards high scores, though Australia has done better overall.
ggplot(data = SPI_Large, mapping=aes(x = `Year`, y = `Needs`, color = `Country`)) +
geom_line() +
facet_wrap(facets = vars(`Country`))
This graph of the Basic Needs category almost mirrors the graph of the Wellbeing category, with some exceptions. Russia is much more stagnant, with slightly higher scores than in Wellbeing. China also started out much higher than it did for Wellbeing and saw less improvement over time, probably because it had a good starting point. Nigeria’s Basic Needs scores mirrored its Wellbeing scores, sitting towards the low end with not very much improvement. Brazil’s was very different because its Basic Needs score stayed stagnant while its Wellbeing score fluctuated more throughout the duration. Australia was also the same, staying stagnant with higher scores. The United States’ Basic Needs scores were similar to its Wellbeing scores, but had a drop between 2016 and 2019 not present in Wellbeing.
ggplot(data = SPI_Large, mapping=aes(x = `Year`, y = `Opportunity`, color = `Country`)) +
geom_line() +
facet_wrap(facets = vars(`Country`))
The graph for Opportunity is different from the Wellbeing and Basic Needs graphs. Australia, unlike the other countries, had roughly the same scores for all three categories and had consistently high scores from 2011 to 2021. Nigeria’s score was also somewhat the same, remaining low throughout 2011-2021, but did not have as much of an increase as the Wellbeing or Basic Needs scores. Russia had consistently low Opportunity scores, unlike its Wellbeing scores (which increased) and its Basic Needs scores (which were stagnant but higher). China also had very consistently low Opportunity scores, while its Wellbeing scores started low but increased dramatically and its Basic Needs scores were relatively high. Brazil’s Opportunity graph was very different from its other two. For Wellbeing, Brazil increased, and for Basic Needs, the country was stagnant but relatively high. Its Opportunity scores, however, stayed stagnant from 2011 to 2016 and then dropped suddenly and dramatically. Lastly, the United States had scores similar to those in its other two graphs, staying high with a slight drop at the end.
This class was not my first time using R, but it was my first time using the software so in-depth since in my previous class it was not the main focus. I decided to focus on Wellbeing specifically because it is not a category that seems to be a priority for the United States (as well as many other countries). I really wanted to see if Wellbeing mattered in overall rankings and whether there has been any improvement in the U.S. and worldwide in that area.
I think this dataset in general was very challenging because of its size and how many variables were included. There are so many aspects of the SPI dataset that I simply could not look at or analyze–if given more time and no homework in any other class I would gladly do that. Another huge challenge was making sure that graphs were readable and that I had picked the right wording and format for that variable when writing the code. Learning ggplot and ggplot2 was somewhat difficult but once I passed a certain point of understanding it came more easily.
Something that I would like to do next with this project if I were to continue would be to perform significance tests on whether or not Wellbeing influences country rankings. Also, I would like to further analyze the variables related to gender equality and see how those relate to country rankings.
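One way such a test might look is a Spearman rank correlation between Wellbeing scores and ranks within a single year. The sketch below is only a starting point under stated assumptions: the eight scores and ranks are made up, and `cor.test()` measures association, not influence.

```r
# Made-up scores and ranks for eight hypothetical countries in one year.
toy <- data.frame(
  Wellbeing = c(90, 85, 70, 62, 55, 48, 40, 35),
  Rank      = c(1, 2, 3, 5, 4, 6, 8, 7)
)

# Spearman's rank correlation: a strongly negative rho means that higher
# Wellbeing scores go with better (numerically lower) ranks.
rank_test <- cor.test(toy$Wellbeing, toy$Rank, method = "spearman")
rank_test$estimate
rank_test$p.value
```

Since the overall SPI score is partly computed from Wellbeing, some correlation with rank is expected by construction, so a real analysis would need to account for that before drawing conclusions.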
Comparing the graphs of Basic Needs, Wellbeing, and Opportunity to the original graph of Rankings from 2011 to 2021 for Large Countries can show some interesting results. Australia, with its consistent high scores in all three categories, also had consistently high rankings.
Brazil’s rankings started off stagnant in the high-middle, then dropped dramatically after 2016. This seems to be mostly caused by the sharp drop in the Opportunity scores, as well as a less sharp drop in the Wellbeing score after 2016. Its Basic Needs score stayed the same throughout the period, but this clearly did not impact the rankings as much as the other scores.
China’s rankings, on the other hand, seem to be most informed by its Basic Needs and Opportunity scores, since all three stayed pretty stagnant. However, while Basic Needs did not fluctuate much within the high-middle, Opportunity scores stayed roughly the same on the very low end. China’s stagnant low rankings seem to be informed by this, rather than by its Wellbeing score, which started low but had a dramatic increase towards the middle after 2013.
Nigeria had consistently low rankings yet had increases in both Wellbeing and Basic Needs. However, its Opportunity score stayed very low and had a slight decrease after 2019. The progress Nigeria made in the other categories must not have been enough to outweigh the low Opportunity score or possibly greater progress other countries made that pushed its rankings low.
Russia’s rankings fluctuated slightly from 2011 to 2021, but remained in the upper-middle. This is probably due to its Basic Needs and Opportunity scores. Its Basic Needs scores stayed in the upper-middle while the Opportunity scores stayed in the lower middle, somewhat canceling each other out. Its Wellbeing scores, on the other hand, saw a pretty dramatic rise after 2013 but this was clearly not enough to improve its overall rankings.
The United States’ rankings can also be seen in its scores. For Wellbeing, the U.S. had pretty consistent high scores. For Basic Human Needs and Opportunity, however, the country’s scores decreased after 2016. The rankings reflect this, showing still high but relatively lower rankings after 2016 than in 2011.
Overall, it seems that Wellbeing scores are not the best indicator of what a country’s SPI ranking would be. Some countries made a relatively large amount of progress in that category, yet still had low rankings. Additionally, overall worldwide scores for the Wellbeing categories increased, though Environmental Quality had the least progress out of them. However, it is unclear if these conclusions can be generalized to all other countries instead of just large countries. It’s also unclear what smaller countries’ rankings and scores look like, as well as those of countries in each continent and each socioeconomic class or GDP ranking.
Data taken from: https://www.socialprogress.org
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Kimble (2022, May 11). Data Analytics and Computational Social Science: DACSS 601 Final. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomkkimble898672/
BibTeX citation
@misc{kimble2022dacss, author = {Kimble, Karen}, title = {Data Analytics and Computational Social Science: DACSS 601 Final}, url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomkkimble898672/}, year = {2022} }