Data Analytics and Computational Social Science: Homework_3

Lenna Garibian

Pew Research conducts polls across the world on a wide range of topics, from education and digital/social media to US and international politics. I have selected a poll conducted among Americans in February 2021 that asks for their opinions on a wide variety of global issues. Pew data are available for public use in SPSS format.

Read in SPSS data set:
I learned that the ‘haven’ library can help. SPSS allows user-defined missing values. The haven package imports the variables in a custom class called “labelled”, leaving the user to manage the numeric codes and value labels in R. Setting user_na = TRUE means that user-defined missing values are read into ‘labelled_spss()’ objects rather than being converted to NA.

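library(tidyverse)  # dplyr and tidyr verbs used throughout
library(haven)      # read_sav() and as_factor()
atp <- read_sav("ATP W82.sav", user_na = TRUE) %>%
  as_factor()
dim(atp)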

[1] 2596  147
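
To see what user_na = TRUE changes, here is a minimal sketch on a toy vector rather than the Pew file. The codes and labels (including 99 for "Refused") are made up for illustration; labelled_spss(), as_factor() and zap_missing() are haven functions.

```r
library(haven)

# Toy survey item; the codes and labels are hypothetical, not Pew's.
# 99 is declared as a user-defined missing value, as SPSS allows.
x <- labelled_spss(
  c(1, 2, 99, 1),
  labels = c("Yes" = 1, "No" = 2, "Refused" = 99),
  na_values = 99
)

# haven reports the user-defined missing code as missing
user_missing <- is.na(x)

# as_factor() keeps every value label, "Refused" included, as a factor level
as_levels <- as_factor(x)

# zap_missing() converts user-defined missings to real NA before the conversion
as_na <- as_factor(zap_missing(x))

print(user_missing)
print(levels(as_levels))
print(sum(is.na(as_na)))
```

This is presumably why “Refused” shows up as an ordinary category in the tables further down, although the source does not say which codes Pew declared as missing.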

head(atp)

# A tibble: 6 × 147
    QKEY INTERVIEW_START_W82 INTERVIEW_END_W82   DEVICE_TYPE_W82
   <dbl> <dttm>              <dttm>              <fct>          
1 101224 2021-02-02 12:07:23 2021-02-02 12:22:48 Smartphone     
2 101437 2021-02-02 20:20:07 2021-02-02 20:41:12 Laptop/PC      
3 102198 2021-02-06 16:42:09 2021-02-06 17:03:57 Laptop/PC      
4 103094 2021-02-03 15:45:54 2021-02-03 16:06:14 Smartphone     
5 104368 2021-02-02 22:42:41 2021-02-03 00:05:33 Laptop/PC      
6 104689 2021-02-02 19:44:43 2021-02-02 20:01:34 Smartphone     
# … with 143 more variables: LANG_W82 <fct>, FORM_W82 <fct>,
#   GAP21Q1_W82 <fct>, GAP21Q2_W82 <fct>, GAP21Q3_W82 <fct>,
#   GAP21Q4_a_W82 <fct>, GAP21Q4_b_W82 <fct>, GAP21Q4_c_W82 <fct>,
#   GAP21Q4_d_W82 <fct>, GAP21Q4_e_W82 <fct>, GAP21Q4_f_W82 <fct>,
#   THERMCHINA_W82 <fct>, THERMINDIA_W82 <fct>, THERMJAPAN_W82 <fct>,
#   THERMNKOREA_W82 <fct>, GAP21Q5_a_W82 <fct>, GAP21Q5_b_W82 <fct>,
#   GAP21Q6_W82 <fct>, GAP21Q7_a_W82 <fct>, GAP21Q7_b_W82 <fct>, …

After reviewing the survey questions, I’ve decided to work with a handful of variables concentrated around the US and its relationship to other countries, especially in light of Russian aggression towards Ukraine and the various arguments for and against intervening in conflicts like the one in Ukraine. I plan to look at responses by a couple of self-reported (nominal) variables: education level and political leaning, and perhaps other variables too. New dataframe name:

atp_selected <- atp %>%
  select(GAP21Q3_W82,
         GAP21Q6_W82,
         GAP21Q19_a_W82,
         GAP21Q19_b_W82,
         GAP21Q19_c_W82,
         GAP21Q19_d_W82,
         GAP21Q19_e_W82,
         GAP21Q29_W82,
         GAP21Q33_a_W82,
         GAP21Q33_b_W82,
         GAP21Q33_c_W82,
         GAP21Q33_d_W82,
         GAP21Q33_e_W82,
         GAP21Q33_f_W82,
         GAP21Q33_g_W82,
         GAP21Q33_h_W82,
         GAP21Q33_i_W82,
         GAP21Q33_j_W82,
         GAP21Q35_W82,
         GAP21Q36_W82,
         GAP21Q37_W82,
         F_EDUCCAT,
         F_PARTYSUMIDEO)

Here is a list of the included variables:

colnames(atp_selected)

 [1] "GAP21Q3_W82"    "GAP21Q6_W82"    "GAP21Q19_a_W82"
 [4] "GAP21Q19_b_W82" "GAP21Q19_c_W82" "GAP21Q19_d_W82"
 [7] "GAP21Q19_e_W82" "GAP21Q29_W82"   "GAP21Q33_a_W82"
[10] "GAP21Q33_b_W82" "GAP21Q33_c_W82" "GAP21Q33_d_W82"
[13] "GAP21Q33_e_W82" "GAP21Q33_f_W82" "GAP21Q33_g_W82"
[16] "GAP21Q33_h_W82" "GAP21Q33_i_W82" "GAP21Q33_j_W82"
[19] "GAP21Q35_W82"   "GAP21Q36_W82"   "GAP21Q37_W82"  
[22] "F_EDUCCAT"      "F_PARTYSUMIDEO"

Renaming some key factor variables:

atp_selected <- atp_selected %>%
  rename(US_Democracy = GAP21Q3_W82,
         US_example = GAP21Q6_W82,
         US_leadership = GAP21Q29_W82, 
         Intl_Collaboration = GAP21Q35_W82,
         Home_vs_Abroad = GAP21Q37_W82)

class(atp_selected)

[1] "tbl_df"     "tbl"        "data.frame"

Q19 asks how much confidence Americans have in various world leaders, including US President Biden. Rename variables:

atp_selected <- atp_selected %>%
  rename(US_Biden_Confidence = GAP21Q19_a_W82, 
         Chinese_Jinping_Confidence = GAP21Q19_b_W82, 
         Russian_Putin_Confidence = GAP21Q19_c_W82, 
         German_Merkel_Confidence = GAP21Q19_d_W82, 
         French_Macron_Confidence = GAP21Q19_e_W82)

Q33 asks Americans to weigh in on various US priorities, both outward facing and inward facing. Variables renamed:

atp_selected <- atp_selected %>%
  rename(Reduce_WeaponsMD = GAP21Q33_a_W82,
         American_Jobs = GAP21Q33_b_W82, 
         Strengthen_UN = GAP21Q33_c_W82, 
         Reduce_USMil_Overseas = GAP21Q33_d_W82, 
         Limit_Russia_Power = GAP21Q33_e_W82, 
         Promote_Democracy_Overseas = GAP21Q33_f_W82, 
         Reduce_Illegal_Immigr = GAP21Q33_g_W82, 
         Limit_China_Power = GAP21Q33_h_W82, 
         Maintain_USMil_Advantage = GAP21Q33_i_W82, 
         Global_Climate_Change = GAP21Q33_j_W82)
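
The separate rename() calls above could also be driven by a single lookup vector. The sketch below uses a toy tibble with a few of the same column names; the values are placeholders, and passing a named character vector through all_of() is standard dplyr usage.

```r
library(dplyr)

# Toy stand-in for atp_selected; the cell values are placeholders
toy <- tibble(
  GAP21Q3_W82  = c("a", "b"),
  GAP21Q6_W82  = c("c", "d"),
  F_EDUCCAT    = c("x", "y")
)

# Named vector: names are the new column names, values are the old ones
lookup <- c(
  US_Democracy = "GAP21Q3_W82",
  US_example   = "GAP21Q6_W82"
)

# all_of() errors if an old name is missing, which catches typos early
toy_renamed <- toy %>% rename(all_of(lookup))

print(names(toy_renamed))
```

Keeping the mapping in one place makes it easier to reuse the same names in a codebook or plot labels.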

Wrangling Data:

Q19 - Pivot Longer; create a new data frame “Leadership” where Confidence_in_world_leaders is a new column name

Leadership <- atp_selected %>%
  select(US_Biden_Confidence, Chinese_Jinping_Confidence, Russian_Putin_Confidence,
         German_Merkel_Confidence, French_Macron_Confidence)

Leadership <- pivot_longer(Leadership,
                           cols = c("US_Biden_Confidence", "Chinese_Jinping_Confidence",
                                    "Russian_Putin_Confidence", "German_Merkel_Confidence",
                                    "French_Macron_Confidence"),
                           names_to = "Confidence_in_world_leaders",
                           values_to = "Level")

Leadership_table <- as.data.frame(table(Leadership))

head(Leadership_table)

  Confidence_in_world_leaders               Level Freq
1  Chinese_Jinping_Confidence A lot of confidence   29
2    French_Macron_Confidence A lot of confidence  259
3    German_Merkel_Confidence A lot of confidence  592
4    Russian_Putin_Confidence A lot of confidence   38
5         US_Biden_Confidence A lot of confidence  859
6  Chinese_Jinping_Confidence     Some confidence  296

Leadership <- Leadership %>% 
  group_by(`Confidence_in_world_leaders`, Level) %>% 
  summarise(Freq = n()) %>% 
  # summarise() drops the Level grouping, so sum(Freq) is the total per leader
  mutate(percentage = formattable::percent(Freq / sum(Freq)))
head(Leadership)

# A tibble: 6 × 4
# Groups:   Confidence_in_world_leaders [2]
  Confidence_in_world_leaders Level                    Freq percentage
  <chr>                       <fct>                   <int> <formttbl>
1 Chinese_Jinping_Confidence  A lot of confidence        29 1.12%     
2 Chinese_Jinping_Confidence  Some confidence           296 11.40%    
3 Chinese_Jinping_Confidence  Not too much confidence  1043 40.18%    
4 Chinese_Jinping_Confidence  No confidence at all     1170 45.07%    
5 Chinese_Jinping_Confidence  Refused                    58 2.23%     
6 French_Macron_Confidence    A lot of confidence       259 9.98%
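
The pivot-then-summarise pattern above can be checked by hand on a tiny example. This is a sketch with made-up answers for two leaders, using count() as a shorthand for group_by() plus summarise(n()); the column names leader, level and share are illustrative only.

```r
library(dplyr)
library(tidyr)

# Four hypothetical respondents, one column per leader
toy <- tibble(
  Biden = c("A lot", "Some", "A lot", "None"),
  Putin = c("None", "None", "Some", "None")
)

# One row per respondent-leader pair
toy_long <- toy %>%
  pivot_longer(everything(), names_to = "leader", values_to = "level")

# Count each answer per leader, then divide by that leader's total
toy_pct <- toy_long %>%
  count(leader, level, name = "Freq") %>%
  group_by(leader) %>%
  mutate(share = Freq / sum(Freq)) %>%
  ungroup()

print(toy_pct)
```

Because the denominator is the per-leader total, the shares for each leader sum to 1, just as the percentage column does for each leader in the table above.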

TODO: pull out the “Refused” responses eventually.
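
One possible way to do that, sketched on toy data: filter out the “Refused” rows before counting and drop the now-empty factor level so it does not reappear in tables. The data below are invented; whether “Refused” should be excluded from the denominator is an analysis choice the source leaves open.

```r
library(dplyr)

# Hypothetical long-format answers for a single leader
toy <- tibble(
  Confidence_in_world_leaders = rep("Russian_Putin_Confidence", 5),
  Level = factor(c("Some confidence", "No confidence at all", "Refused",
                   "No confidence at all", "Refused"))
)

toy_clean <- toy %>%
  filter(Level != "Refused") %>%          # remove the refusals
  mutate(Level = droplevels(Level))       # drop the unused "Refused" level

# Percentages are now computed over substantive answers only
toy_pct <- toy_clean %>%
  count(Confidence_in_world_leaders, Level, name = "Freq") %>%
  group_by(Confidence_in_world_leaders) %>%
  mutate(share = Freq / sum(Freq)) %>%
  ungroup()

print(toy_pct)
```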

Leadership Confidence by Political Orientation

polit_lead <- atp_selected %>% 
  select(US_Biden_Confidence, Chinese_Jinping_Confidence, Russian_Putin_Confidence, German_Merkel_Confidence, French_Macron_Confidence, F_PARTYSUMIDEO)

polit_lead <- pivot_longer(polit_lead, cols = c(US_Biden_Confidence, Chinese_Jinping_Confidence, Russian_Putin_Confidence, German_Merkel_Confidence, French_Macron_Confidence), names_to = "confidence_in_world_leaders", values_to = "level")

head(polit_lead)

# A tibble: 6 × 3
  F_PARTYSUMIDEO        confidence_in_world_leaders level             
  <fct>                 <chr>                       <fct>             
1 Conservative Rep/Lean US_Biden_Confidence         Not too much conf…
2 Conservative Rep/Lean Chinese_Jinping_Confidence  No confidence at …
3 Conservative Rep/Lean Russian_Putin_Confidence    No confidence at …
4 Conservative Rep/Lean German_Merkel_Confidence    Not too much conf…
5 Conservative Rep/Lean French_Macron_Confidence    Not too much conf…
6 Conservative Rep/Lean US_Biden_Confidence         No confidence at …

Now, we group variables by world leader and political orientation. New dataframe:

polit_lead_group <- polit_lead %>% 
  group_by(confidence_in_world_leaders, F_PARTYSUMIDEO, level) %>% 
  summarise(Freq = n()) %>% 
  mutate(percentage = formattable::percent(Freq / sum(Freq)))
head(polit_lead_group)

# A tibble: 6 × 5
# Groups:   confidence_in_world_leaders, F_PARTYSUMIDEO [2]
  confidence_in_world_leaders F_PARTYSUMIDEO    level  Freq percentage
  <chr>                       <fct>             <fct> <int> <formttbl>
1 Chinese_Jinping_Confidence  Conservative Rep… A lo…     4 0.57%     
2 Chinese_Jinping_Confidence  Conservative Rep… Some…    39 5.60%     
3 Chinese_Jinping_Confidence  Conservative Rep… Not …   178 25.57%    
4 Chinese_Jinping_Confidence  Conservative Rep… No c…   462 66.38%    
5 Chinese_Jinping_Confidence  Conservative Rep… Refu…    13 1.87%     
6 Chinese_Jinping_Confidence  Moderate/Liberal… A lo…     3 0.76%

Question 33 - Political Priorities by Party

US_Priorities <- atp_selected %>% 
  select(Reduce_WeaponsMD,American_Jobs,Strengthen_UN,Reduce_USMil_Overseas, Limit_Russia_Power, Promote_Democracy_Overseas, Reduce_Illegal_Immigr, Limit_China_Power, Maintain_USMil_Advantage, Global_Climate_Change)

US_Priorities <- pivot_longer(US_Priorities, cols = c(Reduce_WeaponsMD,American_Jobs,Strengthen_UN,Reduce_USMil_Overseas, Limit_Russia_Power, Promote_Democracy_Overseas, Reduce_Illegal_Immigr, Limit_China_Power, Maintain_USMil_Advantage, Global_Climate_Change), names_to = "FX_Policy_goals", values_to ="Level")

head(US_Priorities)

# A tibble: 6 × 2
  FX_Policy_goals            Level
  <chr>                      <fct>
1 Reduce_WeaponsMD           <NA> 
2 American_Jobs              <NA> 
3 Strengthen_UN              <NA> 
4 Reduce_USMil_Overseas      <NA> 
5 Limit_Russia_Power         <NA> 
6 Promote_Democracy_Overseas <NA>
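
The <NA> values above mean the first respondent has no recorded answer for these items. The source does not say why (a split-sample questionnaire is one possibility, but that is an assumption). A quick way to see how many answers are missing per item, sketched on toy data with two of the Q33 columns:

```r
library(dplyr)
library(tidyr)

# Hypothetical wide-format answers; NA marks a missing response
toy <- tibble(
  Reduce_WeaponsMD = factor(c(NA, "Top priority", "Some priority", NA)),
  American_Jobs    = factor(c(NA, "Top priority", "Top priority", "Refused"))
)

# Count NAs in every column, then reshape to one row per item
na_counts <- toy %>%
  summarise(across(everything(), ~ sum(is.na(.x)))) %>%
  pivot_longer(everything(), names_to = "FX_Policy_goals", values_to = "n_missing")

print(na_counts)
```

Knowing the missing counts up front helps decide whether percentages should be based on all respondents or only on those who answered the item.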

US Priorities by Political Leaning

Prior_polit_lean <- atp_selected %>% 
  select(Reduce_WeaponsMD,American_Jobs,Strengthen_UN,Reduce_USMil_Overseas, Limit_Russia_Power, Promote_Democracy_Overseas, Reduce_Illegal_Immigr, Limit_China_Power, Maintain_USMil_Advantage, Global_Climate_Change, F_PARTYSUMIDEO)

Prior_polit_lean <- pivot_longer(Prior_polit_lean, cols = c(Reduce_WeaponsMD,American_Jobs,Strengthen_UN,Reduce_USMil_Overseas, Limit_Russia_Power, Promote_Democracy_Overseas, Reduce_Illegal_Immigr, Limit_China_Power, Maintain_USMil_Advantage, Global_Climate_Change), names_to = "FX_Policy-goals", values_to = "level")

head(Prior_polit_lean)

# A tibble: 6 × 3
  F_PARTYSUMIDEO        `FX_Policy-goals`          level
  <fct>                 <chr>                      <fct>
1 Conservative Rep/Lean Reduce_WeaponsMD           <NA> 
2 Conservative Rep/Lean American_Jobs              <NA> 
3 Conservative Rep/Lean Strengthen_UN              <NA> 
4 Conservative Rep/Lean Reduce_USMil_Overseas      <NA> 
5 Conservative Rep/Lean Limit_Russia_Power         <NA> 
6 Conservative Rep/Lean Promote_Democracy_Overseas <NA>

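Add a new column that calculates the percentage within every group. After summarise(), the result is still grouped by policy goal and political leaning (only the last grouping variable, level, is dropped), so mutate(percentage = freq / sum(freq)) computes each level's share within its group.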

Prior_polit_lean_sum <- Prior_polit_lean %>% 
  group_by(`FX_Policy-goals`, F_PARTYSUMIDEO, level) %>% 
  summarise(freq = n()) %>% 
  filter(!is.na(level)) %>% 
  mutate(percentage = formattable::percent(freq / sum(freq)))
head(Prior_polit_lean_sum)

# A tibble: 6 × 5
# Groups:   FX_Policy-goals, F_PARTYSUMIDEO [2]
  `FX_Policy-goals` F_PARTYSUMIDEO            level    freq percentage
  <chr>             <fct>                     <fct>   <int> <formttbl>
1 American_Jobs     Conservative Rep/Lean     Top pr…   326 90.30%    
2 American_Jobs     Conservative Rep/Lean     Some p…    29 8.03%     
3 American_Jobs     Conservative Rep/Lean     No pri…     4 1.11%     
4 American_Jobs     Conservative Rep/Lean     Refused     2 0.55%     
5 American_Jobs     Moderate/Liberal Rep/Lean Top pr…   160 83.77%    
6 American_Jobs     Moderate/Liberal Rep/Lean Some p…    26 13.61%

NB: to compare only two of the groups, a filter step can be added to the pipeline: filter(F_PARTYSUMIDEO %in% c("Conservative Rep/Lean", "Liberal Dem/Lean")) %>%
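
As a rough illustration of that filter, the sketch below keeps two political groups from a toy summary table and spreads them into side-by-side columns with pivot_wider(). The percentages are invented for the example and are not Pew results.

```r
library(dplyr)
library(tidyr)

# Toy summary in the same shape as Prior_polit_lean_sum (numbers are made up)
toy_sum <- tibble(
  goal = rep("American_Jobs", 6),
  F_PARTYSUMIDEO = rep(c("Conservative Rep/Lean",
                         "Moderate/Liberal Rep/Lean",
                         "Liberal Dem/Lean"), each = 2),
  level = rep(c("Top priority", "Some priority"), times = 3),
  percentage = c(0.90, 0.10, 0.85, 0.15, 0.70, 0.30)
)

two_ends <- toy_sum %>%
  # keep only the two groups being compared
  filter(F_PARTYSUMIDEO %in% c("Conservative Rep/Lean", "Liberal Dem/Lean")) %>%
  # one column per political group for easy comparison
  pivot_wider(names_from = F_PARTYSUMIDEO, values_from = percentage)

print(two_ends)
```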

Shorthand: a new column can also be added using $, e.g. Prior_polit_lean_sum$Percentage <-
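
A minimal base R sketch of the $ shorthand, on a toy data frame with invented counts. Note that this form divides by the total of the whole column, so unlike the grouped mutate() above it does not compute percentages within groups.

```r
# Toy frequency table; the counts are made up
toy <- data.frame(
  level = c("Top priority", "Some priority", "No priority"),
  freq  = c(6, 3, 1)
)

# Assigning to a new name after $ creates the column
toy$Percentage <- toy$freq / sum(toy$freq)

print(toy)
```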

Question 35 (Intl_Collaboration by Political Lean)

This data looks good but still needs to be visualized. First, we want to show the counts as percentages:

US_cooperation_polit_lean <- atp_selected %>% 
  select(Intl_Collaboration, F_PARTYSUMIDEO) %>% 
  filter(Intl_Collaboration %in% c("MANY of the problems facing our country can be solved by working with other countries","FEW of the problems facing our country can be solved by working with other countries")) %>% 
  group_by(`Intl_Collaboration`, F_PARTYSUMIDEO) %>% 
  summarise(freq = n()) %>% 
  mutate(percentage = formattable::percent(freq / sum(freq)))
head(US_cooperation_polit_lean)

# A tibble: 6 × 4
# Groups:   Intl_Collaboration [2]
  Intl_Collaboration                   F_PARTYSUMIDEO  freq percentage
  <fct>                                <fct>          <int> <formttbl>
1 MANY of the problems facing our cou… Conservative …   188 13.19%    
2 MANY of the problems facing our cou… Moderate/Libe…   156 10.95%    
3 MANY of the problems facing our cou… Moderate/Cons…   483 33.89%    
4 MANY of the problems facing our cou… Liberal Dem/L…   543 38.11%    
5 MANY of the problems facing our cou… Refused eithe…    55 3.86%     
6 FEW of the problems facing our coun… Conservative …   501 43.95%
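
In the table above the percentages sum to 100% within each Q35 answer, because F_PARTYSUMIDEO is the last grouping variable and is the one summarise() drops. If the goal is instead the share of each answer within a political group, the grouping order can be swapped. The sketch below does that on toy data (shortened labels, invented responses) and draws a simple stacked bar chart with ggplot2; the object names by_lean and p are illustrative.

```r
library(dplyr)
library(ggplot2)

# Hypothetical respondents: political leaning and Q35 answer
toy <- tibble(
  lean   = c(rep("Conservative", 4), rep("Liberal", 4)),
  answer = c("MANY", "FEW", "FEW", "FEW", "MANY", "MANY", "MANY", "FEW")
)

# Group by leaning first so the denominator is the size of each political group
by_lean <- toy %>%
  group_by(lean, answer) %>%
  summarise(freq = n(), .groups = "drop_last") %>%
  mutate(share = freq / sum(freq)) %>%
  ungroup()

# Stacked bars: one bar per political group, split by answer
p <- ggplot(by_lean, aes(x = lean, y = share, fill = answer)) +
  geom_col() +
  scale_y_continuous(labels = scales::label_percent()) +
  labs(x = "Political leaning", y = "Share of respondents", fill = "Q35 answer")

print(by_lean)
```

Printing p renders the chart. Which denominator is appropriate depends on the question being asked, so both versions may be worth keeping.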

Questions 33 and 35 - The proportion of people who rate a FX_Policy-goal a top priority (Q33), based on how they answered Intl-Collaboration (Q35)

intl <- atp_selected %>%
  select(c(9:19))

intl <- pivot_longer(intl,
                     cols = c(Reduce_WeaponsMD, American_Jobs, Strengthen_UN,
                              Reduce_USMil_Overseas, Limit_Russia_Power, Promote_Democracy_Overseas,
                              Reduce_Illegal_Immigr, Limit_China_Power, Maintain_USMil_Advantage,
                              Global_Climate_Change),
                     names_to = "FX_Policy-goals", values_to = "Level") %>%
  group_by(`FX_Policy-goals`, `Intl_Collaboration`) %>%
  filter(Intl_Collaboration %in% c("MANY of the problems facing our country can be solved by working with other countries","FEW of the problems facing our country can be solved by working with other countries")) %>%
  summarise(freq = n()) %>% 
  mutate(percentage = formattable::percent(freq / sum(freq)))
head(intl)

# A tibble: 6 × 4
# Groups:   FX_Policy-goals [3]
  `FX_Policy-goals`     Intl_Collaboration             freq percentage
  <chr>                 <fct>                         <int> <formttbl>
1 American_Jobs         MANY of the problems facing …  1425 55.56%    
2 American_Jobs         FEW of the problems facing o…  1140 44.44%    
3 Global_Climate_Change MANY of the problems facing …  1425 55.56%    
4 Global_Climate_Change FEW of the problems facing o…  1140 44.44%    
5 Limit_China_Power     MANY of the problems facing …  1425 55.56%    
6 Limit_China_Power     FEW of the problems facing o…  1140 44.44%

Potential Research Questions

With these and possibly a few more questions, I want to address the following research questions:

What kind of role should the US have abroad?

What do Americans expect from their government in terms of world leadership?

Should we be actively involving ourselves in world affairs, or more focused on problems at home?

What should US policy priorities be focused on - more outward facing or inward facing issues?

Finally, in light of Russian aggression toward Ukraine, what are Americans’ opinions of key world leaders (including US and Russian leaders)? Are there notable differences by political persuasion?
