ATP Pew Data
Pew Research conducts polls around the world on a wide range of topics, from education and digital/social media to US and international politics. I have selected a poll conducted in February 2021 among Americans that asks for their opinions on a wide variety of global issues. Pew data are available for public use in SPSS format.
Read in SPSS data set:
I learned that the haven package can help here. SPSS allows user-defined missing values. haven imports the variables in a custom class called “labelled” and leaves it to the user to manage the numeric codes and value labels in R. Setting user_na = TRUE means that variables with user-defined missing values are read into labelled_spss() objects instead of having those values converted to NA.
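A minimal sketch of the import step described above. To keep it self-contained, it first writes a tiny stand-in file with write_sav(); the column q1, its labels, and the code 99 are made up for illustration and are not Pew variables. For the real data, read_sav() would point at the downloaded .sav file instead.

```r
library(haven)

# Hypothetical stand-in for the Pew file: one labelled item where
# 99 ("Refused") is declared as a user-defined missing value.
demo <- data.frame(id = 1:4)
demo$q1 <- labelled_spss(
  c(1, 2, 99, 1),
  labels = c("Yes" = 1, "No" = 2, "Refused" = 99),
  na_values = 99
)
path <- tempfile(fileext = ".sav")
write_sav(demo, path)

# user_na = TRUE keeps user-defined missing values as labelled_spss
# objects instead of converting them to NA on import.
raw <- read_sav(path, user_na = TRUE)
class(raw$q1)

# as_factor() replaces the numeric codes with their value labels,
# which gives factor columns like the <fct> columns printed by head(atp).
atp_demo <- as_factor(raw)
levels(atp_demo$q1)
```

Whether the original import used as_factor() is an assumption; it is one common way to end up with factor columns after read_sav().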
head(atp)
# A tibble: 6 × 147
QKEY INTERVIEW_START_W82 INTERVIEW_END_W82 DEVICE_TYPE_W82
<dbl> <dttm> <dttm> <fct>
1 101224 2021-02-02 12:07:23 2021-02-02 12:22:48 Smartphone
2 101437 2021-02-02 20:20:07 2021-02-02 20:41:12 Laptop/PC
3 102198 2021-02-06 16:42:09 2021-02-06 17:03:57 Laptop/PC
4 103094 2021-02-03 15:45:54 2021-02-03 16:06:14 Smartphone
5 104368 2021-02-02 22:42:41 2021-02-03 00:05:33 Laptop/PC
6 104689 2021-02-02 19:44:43 2021-02-02 20:01:34 Smartphone
# … with 143 more variables: LANG_W82 <fct>, FORM_W82 <fct>,
# GAP21Q1_W82 <fct>, GAP21Q2_W82 <fct>, GAP21Q3_W82 <fct>,
# GAP21Q4_a_W82 <fct>, GAP21Q4_b_W82 <fct>, GAP21Q4_c_W82 <fct>,
# GAP21Q4_d_W82 <fct>, GAP21Q4_e_W82 <fct>, GAP21Q4_f_W82 <fct>,
# THERMCHINA_W82 <fct>, THERMINDIA_W82 <fct>, THERMJAPAN_W82 <fct>,
# THERMNKOREA_W82 <fct>, GAP21Q5_a_W82 <fct>, GAP21Q5_b_W82 <fct>,
# GAP21Q6_W82 <fct>, GAP21Q7_a_W82 <fct>, GAP21Q7_b_W82 <fct>, …
After reviewing the survey questions, I’ve decided to work with a handful of variables concerning the US and its relationship to other countries, especially in light of Russian aggression towards Ukraine and the various arguments for and against intervening in conflicts like the one in Ukraine. I plan to look at responses by a couple of self-reported (nominal) variables, education level and political leaning, and perhaps other variables too. New data frame name:
atp_selected <- atp %>%
select(GAP21Q3_W82,
GAP21Q6_W82,
GAP21Q19_a_W82,
GAP21Q19_b_W82,
GAP21Q19_c_W82,
GAP21Q19_d_W82,
GAP21Q19_e_W82,
GAP21Q29_W82,
GAP21Q33_a_W82,
GAP21Q33_b_W82,
GAP21Q33_c_W82,
GAP21Q33_d_W82,
GAP21Q33_e_W82,
GAP21Q33_f_W82,
GAP21Q33_g_W82,
GAP21Q33_h_W82,
GAP21Q33_i_W82,
GAP21Q33_j_W82,
GAP21Q35_W82,
GAP21Q36_W82,
GAP21Q37_W82,
F_EDUCCAT,
F_PARTYSUMIDEO)
Here is a list of the included variables:
colnames(atp_selected)
[1] "GAP21Q3_W82" "GAP21Q6_W82" "GAP21Q19_a_W82"
[4] "GAP21Q19_b_W82" "GAP21Q19_c_W82" "GAP21Q19_d_W82"
[7] "GAP21Q19_e_W82" "GAP21Q29_W82" "GAP21Q33_a_W82"
[10] "GAP21Q33_b_W82" "GAP21Q33_c_W82" "GAP21Q33_d_W82"
[13] "GAP21Q33_e_W82" "GAP21Q33_f_W82" "GAP21Q33_g_W82"
[16] "GAP21Q33_h_W82" "GAP21Q33_i_W82" "GAP21Q33_j_W82"
[19] "GAP21Q35_W82" "GAP21Q36_W82" "GAP21Q37_W82"
[22] "F_EDUCCAT" "F_PARTYSUMIDEO"
Renaming some key factor variables:
class(atp_selected)
[1] "tbl_df" "tbl" "data.frame"
Q33 asks Americans to weigh in on various US priorities, both outward-facing and inward-facing. The variables are renamed as follows:
atp_selected <- atp_selected %>%
rename(Reduce_WeaponsMD = GAP21Q33_a_W82,
American_Jobs = GAP21Q33_b_W82,
Strengthen_UN = GAP21Q33_c_W82,
Reduce_USMil_Overseas = GAP21Q33_d_W82,
Limit_Russia_Power = GAP21Q33_e_W82,
Promote_Democracy_Overseas = GAP21Q33_f_W82,
Reduce_Illegal_Immigr = GAP21Q33_g_W82,
Limit_China_Power = GAP21Q33_h_W82,
Maintain_USMil_Advantage = GAP21Q33_i_W82,
Global_Climate_Change = GAP21Q33_j_W82)
Wrangling Data:
Q19: pivot longer. Create a new data frame, “Leadership”, in which the five leader columns are stacked into a single column named Confidence_in_world_leaders, with the responses in Level. The Q19 items appear here under descriptive names (US_Biden_Confidence and so on); the rename step for them is not shown.
Leadership <- atp_selected %>%
  select(US_Biden_Confidence, Chinese_Jinping_Confidence, Russian_Putin_Confidence,
         German_Merkel_Confidence, French_Macron_Confidence)
Leadership <- pivot_longer(Leadership,
  cols = c("US_Biden_Confidence", "Chinese_Jinping_Confidence", "Russian_Putin_Confidence",
           "German_Merkel_Confidence", "French_Macron_Confidence"),
  names_to = "Confidence_in_world_leaders", values_to = "Level")
Leadership_table <- as.data.frame(table(Leadership))
head(Leadership_table)
Confidence_in_world_leaders Level Freq
1 Chinese_Jinping_Confidence A lot of confidence 29
2 French_Macron_Confidence A lot of confidence 259
3 German_Merkel_Confidence A lot of confidence 592
4 Russian_Putin_Confidence A lot of confidence 38
5 US_Biden_Confidence A lot of confidence 859
6 Chinese_Jinping_Confidence Some confidence 296
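Because the five leader columns share the suffix _Confidence, the column list can also be written with a tidyselect helper rather than spelled out. A small sketch on made-up data (the two-row tibble toy and its values are illustrative, not survey responses):

```r
library(dplyr)
library(tidyr)

# Made-up two-respondent example using the same column naming pattern.
toy <- tibble(
  US_Biden_Confidence      = c("A lot of confidence", "Some confidence"),
  Russian_Putin_Confidence = c("No confidence at all", "Refused"),
  F_PARTYSUMIDEO           = c("Liberal Dem/Lean", "Conservative Rep/Lean")
)

# ends_with() selects every column whose name ends in "_Confidence",
# so the leader columns do not have to be listed one by one.
toy_long <- toy %>%
  pivot_longer(
    cols      = ends_with("_Confidence"),
    names_to  = "Confidence_in_world_leaders",
    values_to = "Level"
  )

toy_long
```

Columns that are not pivoted, such as F_PARTYSUMIDEO here, are repeated on every row, which is what makes the later breakdown by political leaning possible.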
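Leadership <- Leadership %>%
  group_by(`Confidence_in_world_leaders`, Level) %>%
  # drop_last keeps the result grouped by leader, so percentages sum to 100% per leader
  summarise(Freq = n(), .groups = "drop_last") %>%
  mutate(percentage = formattable::percent(Freq / sum(Freq)))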
head(Leadership)
# A tibble: 6 × 4
# Groups: Confidence_in_world_leaders [2]
Confidence_in_world_leaders Level Freq percentage
<chr> <fct> <int> <formttbl>
1 Chinese_Jinping_Confidence A lot of confidence 29 1.12%
2 Chinese_Jinping_Confidence Some confidence 296 11.40%
3 Chinese_Jinping_Confidence Not too much confidence 1043 40.18%
4 Chinese_Jinping_Confidence No confidence at all 1170 45.07%
5 Chinese_Jinping_Confidence Refused 58 2.23%
6 French_Macron_Confidence A lot of confidence 259 9.98%
Note: the “Refused” responses still need to be removed at a later step.
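One way to handle the “Refused” responses, sketched on made-up counts: drop those rows before computing the shares, so that the percentages are based on substantive answers only. The tibble answers and the name Leader_A are hypothetical.

```r
library(dplyr)

# Made-up responses for a single hypothetical leader.
answers <- tibble(
  Confidence_in_world_leaders = rep("Leader_A", 10),
  Level = c(rep("Some confidence", 6),
            rep("No confidence at all", 3),
            "Refused")
)

shares <- answers %>%
  filter(Level != "Refused") %>%            # remove refusals first
  count(Confidence_in_world_leaders, Level, name = "Freq") %>%
  group_by(Confidence_in_world_leaders) %>%
  mutate(share = Freq / sum(Freq)) %>%      # shares sum to 1 per leader
  ungroup()

shares
```

Note that filter() also drops rows where Level is NA, since the comparison is NA for those rows. With factor columns, the unused “Refused” level stays in the level set unless droplevels() is applied.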
polit_lead <- atp_selected %>%
select(US_Biden_Confidence, Chinese_Jinping_Confidence, Russian_Putin_Confidence, German_Merkel_Confidence, French_Macron_Confidence, F_PARTYSUMIDEO)
polit_lead <- pivot_longer(polit_lead, cols = c(US_Biden_Confidence, Chinese_Jinping_Confidence, Russian_Putin_Confidence, German_Merkel_Confidence, French_Macron_Confidence), names_to = "confidence_in_world_leaders", values_to = "level")
head(polit_lead)
# A tibble: 6 × 3
F_PARTYSUMIDEO confidence_in_world_leaders level
<fct> <chr> <fct>
1 Conservative Rep/Lean US_Biden_Confidence Not too much conf…
2 Conservative Rep/Lean Chinese_Jinping_Confidence No confidence at …
3 Conservative Rep/Lean Russian_Putin_Confidence No confidence at …
4 Conservative Rep/Lean German_Merkel_Confidence Not too much conf…
5 Conservative Rep/Lean French_Macron_Confidence Not too much conf…
6 Conservative Rep/Lean US_Biden_Confidence No confidence at …
Next, group the responses by world leader and political orientation. New data frame:
polit_lead_group <- polit_lead %>%
  group_by(confidence_in_world_leaders, F_PARTYSUMIDEO, level) %>%
  # drop_last keeps the grouping by leader and political leaning for the percentage step
  summarise(Freq = n(), .groups = "drop_last") %>%
  mutate(percentage = formattable::percent(Freq / sum(Freq)))
head(polit_lead_group)
# A tibble: 6 × 5
# Groups: confidence_in_world_leaders, F_PARTYSUMIDEO [2]
confidence_in_world_leaders F_PARTYSUMIDEO level Freq percentage
<chr> <fct> <fct> <int> <formttbl>
1 Chinese_Jinping_Confidence Conservative Rep… A lo… 4 0.57%
2 Chinese_Jinping_Confidence Conservative Rep… Some… 39 5.60%
3 Chinese_Jinping_Confidence Conservative Rep… Not … 178 25.57%
4 Chinese_Jinping_Confidence Conservative Rep… No c… 462 66.38%
5 Chinese_Jinping_Confidence Conservative Rep… Refu… 13 1.87%
6 Chinese_Jinping_Confidence Moderate/Liberal… A lo… 3 0.76%
US_Priorities <- atp_selected %>%
select(Reduce_WeaponsMD,American_Jobs,Strengthen_UN,Reduce_USMil_Overseas, Limit_Russia_Power, Promote_Democracy_Overseas, Reduce_Illegal_Immigr, Limit_China_Power, Maintain_USMil_Advantage, Global_Climate_Change)
US_Priorities <- pivot_longer(US_Priorities, cols = c(Reduce_WeaponsMD,American_Jobs,Strengthen_UN,Reduce_USMil_Overseas, Limit_Russia_Power, Promote_Democracy_Overseas, Reduce_Illegal_Immigr, Limit_China_Power, Maintain_USMil_Advantage, Global_Climate_Change), names_to = "FX_Policy_goals", values_to ="Level")
head(US_Priorities)
# A tibble: 6 × 2
FX_Policy_goals Level
<chr> <fct>
1 Reduce_WeaponsMD <NA>
2 American_Jobs <NA>
3 Strengthen_UN <NA>
4 Reduce_USMil_Overseas <NA>
5 Limit_Russia_Power <NA>
6 Promote_Democracy_Overseas <NA>
US Priorities by Political Leaning
Prior_polit_lean <- atp_selected %>%
select(Reduce_WeaponsMD,American_Jobs,Strengthen_UN,Reduce_USMil_Overseas, Limit_Russia_Power, Promote_Democracy_Overseas, Reduce_Illegal_Immigr, Limit_China_Power, Maintain_USMil_Advantage, Global_Climate_Change, F_PARTYSUMIDEO)
Prior_polit_lean <- pivot_longer(Prior_polit_lean, cols = c(Reduce_WeaponsMD,American_Jobs,Strengthen_UN,Reduce_USMil_Overseas, Limit_Russia_Power, Promote_Democracy_Overseas, Reduce_Illegal_Immigr, Limit_China_Power, Maintain_USMil_Advantage, Global_Climate_Change), names_to = "FX_Policy-goals", values_to = "level")
head(Prior_polit_lean)
# A tibble: 6 × 3
F_PARTYSUMIDEO `FX_Policy-goals` level
<fct> <chr> <fct>
1 Conservative Rep/Lean Reduce_WeaponsMD <NA>
2 Conservative Rep/Lean American_Jobs <NA>
3 Conservative Rep/Lean Strengthen_UN <NA>
4 Conservative Rep/Lean Reduce_USMil_Overseas <NA>
5 Conservative Rep/Lean Limit_Russia_Power <NA>
6 Conservative Rep/Lean Promote_Democracy_Overseas <NA>
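The next step adds a column with the percentage for each group. Because summarise() drops the last grouping variable (level), the mutate() that follows computes freq / sum(freq) within each combination of policy goal and political leaning.
Prior_polit_lean_sum <- Prior_polit_lean %>%
  group_by(`FX_Policy-goals`, F_PARTYSUMIDEO, level) %>%
  summarise(freq = n(), .groups = "drop_last") %>%
  # remove the NA rows before computing shares so they do not enter the denominator
  filter(!is.na(level)) %>%
  mutate(percentage = formattable::percent(freq / sum(freq)))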
head(Prior_polit_lean_sum)
# A tibble: 6 × 5
# Groups: FX_Policy-goals, F_PARTYSUMIDEO [2]
`FX_Policy-goals` F_PARTYSUMIDEO level freq percentage
<chr> <fct> <fct> <int> <formttbl>
1 American_Jobs Conservative Rep/Lean Top pr… 326 90.30%
2 American_Jobs Conservative Rep/Lean Some p… 29 8.03%
3 American_Jobs Conservative Rep/Lean No pri… 4 1.11%
4 American_Jobs Conservative Rep/Lean Refused 2 0.55%
5 American_Jobs Moderate/Liberal Rep/Lean Top pr… 160 83.77%
6 American_Jobs Moderate/Liberal Rep/Lean Some p… 26 13.61%
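The percentages above add up within each policy goal and political group because summarise() removes only the last grouping variable. A small sketch with made-up counts makes this visible (the tibble toy and the short labels Rep and Dem are illustrative only):

```r
library(dplyr)

# Made-up ratings: three respondents per party for a single goal.
toy <- tibble(
  goal  = rep("Jobs", 6),
  party = c("Rep", "Rep", "Rep", "Dem", "Dem", "Dem"),
  level = c("Top priority", "Top priority", "Some priority",
            "Top priority", "Some priority", "Some priority")
)

toy_sum <- toy %>%
  group_by(goal, party, level) %>%
  # .groups = "drop_last" spells out the usual default:
  # the result stays grouped by goal and party only.
  summarise(freq = n(), .groups = "drop_last") %>%
  mutate(percentage = freq / sum(freq))

# Check: the shares add up to 1 within each goal/party group.
totals <- toy_sum %>%
  summarise(total = sum(percentage), .groups = "drop")

totals
```

If the shares should instead be taken over the whole table, calling ungroup() before mutate() would change the denominator to the grand total.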
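NB: to keep only the two ends of the political spectrum, add filter(F_PARTYSUMIDEO %in% c("Conservative Rep/Lean", "Liberal Dem/Lean")) %>% to the pipeline.
Shorthand: a new column can also be added using $, with an assignment of the form Prior_polit_lean_sumF$Percentage <- ...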
These data look good and are ready to visualize. First, express the counts as percentages. In the table below, the percentages sum to 100% within each Intl_Collaboration response, so each row gives a political group's share of the respondents who chose that answer.
US_cooperation_polit_lean <- atp_selected %>%
  select(Intl_Collaboration, F_PARTYSUMIDEO) %>%
  # keep only the two substantive answers
  filter(Intl_Collaboration %in% c("MANY of the problems facing our country can be solved by working with other countries","FEW of the problems facing our country can be solved by working with other countries")) %>%
  group_by(`Intl_Collaboration`, F_PARTYSUMIDEO) %>%
  summarise(freq = n(), .groups = "drop_last") %>%
  mutate(percentage = formattable::percent(freq / sum(freq)))
head(US_cooperation_polit_lean)
# A tibble: 6 × 4
# Groups: Intl_Collaboration [2]
Intl_Collaboration F_PARTYSUMIDEO freq percentage
<fct> <fct> <int> <formttbl>
1 MANY of the problems facing our cou… Conservative … 188 13.19%
2 MANY of the problems facing our cou… Moderate/Libe… 156 10.95%
3 MANY of the problems facing our cou… Moderate/Cons… 483 33.89%
4 MANY of the problems facing our cou… Liberal Dem/L… 543 38.11%
5 MANY of the problems facing our cou… Refused eithe… 55 3.86%
6 FEW of the problems facing our coun… Conservative … 501 43.95%
# Columns 9 to 19 of atp_selected: the ten Q33 priorities plus Intl_Collaboration
# (position-based, so this depends on the column order shown earlier).
intl <- atp_selected %>%
  select(c(9:19))
intl <- pivot_longer(intl,
  cols = c(Reduce_WeaponsMD, American_Jobs, Strengthen_UN,
           Reduce_USMil_Overseas, Limit_Russia_Power, Promote_Democracy_Overseas,
           Reduce_Illegal_Immigr, Limit_China_Power, Maintain_USMil_Advantage,
           Global_Climate_Change),
  names_to = "FX_Policy-goals", values_to = "Level") %>%
  group_by(`FX_Policy-goals`, `Intl_Collaboration`) %>%
  filter(Intl_Collaboration %in% c("MANY of the problems facing our country can be solved by working with other countries","FEW of the problems facing our country can be solved by working with other countries")) %>%
  summarise(freq = n()) %>%
  mutate(percentage = formattable::percent(freq / sum(freq)))
head(intl)
# A tibble: 6 × 4
# Groups: FX_Policy-goals [3]
`FX_Policy-goals` Intl_Collaboration freq percentage
<chr> <fct> <int> <formttbl>
1 American_Jobs MANY of the problems facing … 1425 55.56%
2 American_Jobs FEW of the problems facing o… 1140 44.44%
3 Global_Climate_Change MANY of the problems facing … 1425 55.56%
4 Global_Climate_Change FEW of the problems facing o… 1140 44.44%
5 Limit_China_Power MANY of the problems facing … 1425 55.56%
6 Limit_China_Power FEW of the problems facing o… 1140 44.44%
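The counts are identical for every goal (1425 and 1140) because Level is not part of the grouping, so each goal simply repeats the overall split on Intl_Collaboration. If the aim is to compare priority ratings between the two groups, one option is to include Level in the grouping. A sketch on made-up data, with hypothetical short labels MANY and FEW standing in for the full response text:

```r
library(dplyr)
library(tidyr)

# Made-up responses from four hypothetical respondents.
toy <- tibble(
  Intl_Collaboration = c("MANY", "MANY", "FEW", "FEW"),
  American_Jobs      = c("Top priority", "Some priority",
                         "Top priority", "Top priority"),
  Strengthen_UN      = c("Top priority", "Top priority",
                         "No priority", "Some priority")
)

intl_toy <- toy %>%
  pivot_longer(
    cols      = -Intl_Collaboration,
    names_to  = "FX_Policy_goals",
    values_to = "Level"
  ) %>%
  filter(!is.na(Level)) %>%
  # Level is now part of the grouping, so the counts differ by rating.
  group_by(FX_Policy_goals, Intl_Collaboration, Level) %>%
  summarise(freq = n(), .groups = "drop_last") %>%
  # Share of each rating within a goal and collaboration view.
  mutate(percentage = freq / sum(freq))

intl_toy
```

With this grouping, each row answers a question such as "among respondents who chose FEW, what share rated American_Jobs a top priority?", which the goal-level table above cannot show.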
Potential Research Questions
Using these survey items, and possibly a few more, I want to address the following research questions:
What do Americans expect from their government in terms of world leadership?
Should we be actively involving ourselves in world affairs, or more focused on problems at home?
What should US policy priorities focus on: more outward-facing or more inward-facing issues?
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Garibian (2022, March 23). Data Analytics and Computational Social Science: Homework_3. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomlenna717881001/
BibTeX citation
@misc{garibian2022homework_3,
  author = {Garibian, Lenna},
  title = {Data Analytics and Computational Social Science: Homework_3},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomlenna717881001/},
  year = {2022}
}