abc_poll
More data wrangling: pivoting
Author

Matt Eckstein

Published

March 22, 2023

Code
library(tidyverse)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Challenge Overview

Today’s challenge is to:

  1. read in a data set, and describe the data set using both words and any supporting information (e.g., tables)
  2. tidy data (as needed, including sanity checks)
  3. identify variables that need to be mutated
  4. mutate variables and sanity check all mutations

Read in data

Code
abcpoll <- read.csv("_data/abc_poll_2021.csv")

Briefly describe the data

These data come from a 2021 ABC News poll conducted across the United States. The dataset covers 527 adult respondents and contains an ID number, demographic information, answers to five questions (one of which has six components), and metadata about the conduct of the survey itself.

Tidy Data (as needed)

These data are already relatively tidy, and for most purposes they are more legible in a wide format than in a long one.
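If a long layout were ever needed (for example, to plot the six components of the multi-part question side by side), the reshaping could be sketched roughly as below. This is an illustration on a toy data frame, not the poll file itself: the column names `id`, `Q1_a`, `Q1_b`, and `Q2`, and all of the answer values, are assumptions made for the example.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the poll data; column names and values are assumed
toy <- tibble(
  id   = c(1, 2, 3),
  Q1_a = c("Approve", "Disapprove", "Approve"),
  Q1_b = c("Disapprove", "Disapprove", "Skipped"),
  Q2   = c("Not concerned at all", "Very concerned", "Somewhat concerned")
)

# Sanity check: in the wide layout each respondent should appear exactly once
n_duplicate_ids <- sum(duplicated(toy$id))

# Long layout: one row per respondent per Q1 component
toy_long <- toy %>%
  pivot_longer(
    cols = starts_with("Q1_"),
    names_to = "q1_item",
    values_to = "q1_answer"
  )
```

In the wide layout one row is one respondent, which is why it stays easier to read; the long layout repeats `id` and `Q2` once per Q1 component.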

Identify variables that need to be mutated

Are there any variables that require mutation to be usable in your analysis stream? For example, are all time variables correctly coded as dates? Are all string variables reduced and cleaned to sensible categories? Do you need to turn any variables into factors and reorder for ease of graphics and visualization?

It might be helpful, to facilitate calculations and visualizations, to code certain textual answers as numeric. Here, I have created two numeric versions of Q4: one that maps the existing scale to numbers, and another that combines “Good” and “Excellent” as 2 and “Not so good” and “Poor” as 1.

Additionally, though these modifications do not use mutate, I removed some irrelevant or redundant variables and dropped the unnecessary “A” or “An” in front of the party names in the original dataset.

Code
abcpoll2 <- abcpoll %>%
  select(-c(complete_status, ppemploy, Contact, weights_pid))
Code
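abcpoll3 <- abcpoll2 %>% 
  # Split off the leading article ("A " or "An ") from the party name
  separate(QPID, into = c("delete", "Party ID"), sep = "A |An ") %>%
  # Answers with no leading article (such as "Something else") have nothing
  # to split, so `Party ID` is NA for them; relabel those as "Something else"
  replace_na(list("Party ID" = "Something else")) %>%
  # Drop the leftover piece from the split
  select(-delete)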

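# Numeric version of the Q4 scale (4 = best, 1 = worst);
# any answer not listed here becomes NA
abcpoll3 <- abcpoll3 %>%
  mutate(Q4_num = recode(Q4, "Excellent" = 4, "Good" = 3, "Not so good" = 2, "Poor" = 1))

# Two-level version: positive answers = 2, negative answers = 1;
# any other answer becomes NA
abcpoll3 <- abcpoll3 %>%
  mutate(Q4_combined = case_when(
    Q4 == "Excellent" | Q4 == "Good" ~ 2, 
    Q4 == "Not so good" | Q4 == "Poor" ~ 1)
  )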
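
The challenge also asks for a sanity check of each mutation. One way to do that, sketched here on a toy vector rather than the real data, is to cross-tabulate the original answers against the new numeric columns and confirm that every answer maps where intended. The "Skipped" value is an assumption about how non-answers might be recorded, and `.default = NA_real_` is added only to make the handling of unmatched answers explicit.

```r
library(dplyr)

# Toy stand-in for the Q4 column; "Skipped" is an assumed non-answer value
toy <- tibble(Q4 = c("Excellent", "Good", "Not so good", "Poor", "Skipped"))

toy <- toy %>%
  mutate(
    # Same mapping as above, with unmatched answers set to NA explicitly
    Q4_num = recode(Q4, "Excellent" = 4, "Good" = 3, "Not so good" = 2,
                    "Poor" = 1, .default = NA_real_),
    Q4_combined = case_when(
      Q4 %in% c("Excellent", "Good") ~ 2,
      Q4 %in% c("Not so good", "Poor") ~ 1
    )
  )

# One row per distinct combination of original and recoded values
check <- toy %>% count(Q4, Q4_num, Q4_combined)
print(check)
```

Each original answer should appear in exactly one row of `check`; an answer that showed up with more than one numeric value, or an unexpected NA, would point to a problem in the recoding.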