Describing Cross-sectional Survey Data 2021

hw1
challenge1
Dirichi Umunna
dataset (abc_poll_2021)
ggplot2
Author

Dirichi Umunna

Published

February 12, 2023

Code
library(tidyverse)

knitr::opts_chunk$set(echo = TRUE)
Code
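# readr is already attached by library(tidyverse); the explicit call is harmless.
library(readr)
# Absolute path on the author's machine; adjust it to run this elsewhere.
abc_poll_2021 <- read_csv("/Users/dirichiumunna/Documents/DACS601/GITHUB/dacss601_spring2023/posts/_data/abc_poll_2021.csv")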
Rows: 527 Columns: 31
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (28): xspanish, complete_status, ppeduc5, ppeducat, ppgender, ppethm, pp...
dbl  (3): id, ppage, weights_pid

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
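The message above suggests declaring the column types up front. A minimal sketch of that idea follows, using a tiny made-up CSV written to a temporary file so it runs on its own. The names `toy_path` and `toy_poll` and all the values are hypothetical stand-ins, not the real poll data.

```r
library(readr)

# Write a small invented CSV to a temporary file so the example is self-contained.
toy_path <- tempfile(fileext = ".csv")
writeLines(c("id,ppage,ppgender", "1,34,Female", "2,58,Male"), toy_path)

# Declare the numeric columns explicitly and read everything else as text.
toy_poll <- read_csv(
  toy_path,
  col_types = cols(
    id = col_double(),
    ppage = col_double(),
    .default = col_character()
  )
)

spec(toy_poll)  # prints the full column specification
```

Supplying `col_types` also quiets the column specification message, since readr no longer has to guess the types.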
Code
# View() needs an interactive session, so skip it when the document is rendered.
if (interactive()) View(abc_poll_2021)
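`View()` opens an interactive viewer window, which is not available when a document is rendered non-interactively. The sketch below shows a few text-based alternatives from base R. It uses a small invented data frame (`toy_poll` is a hypothetical stand-in for `abc_poll_2021`), so the printed values are not poll results.

```r
# Hypothetical stand-in for abc_poll_2021; the values are invented for illustration.
toy_poll <- data.frame(
  id = c(1, 2, 3),
  ppage = c(34, 58, 71),
  ppgender = c("Female", "Male", "Female"),
  stringsAsFactors = FALSE
)

dim(toy_poll)      # number of rows and columns
head(toy_poll, 2)  # first two rows, printed as text
str(toy_poll)      # column types plus a preview of the values
```

The same three calls should work on the full data set, since none of them needs a graphical viewer.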

Getting descriptive statistics of the data set

Code
summary(abc_poll_2021)
       id            xspanish         complete_status        ppage      
 Min.   :7230001   Length:527         Length:527         Min.   :18.00  
 1st Qu.:7230132   Class :character   Class :character   1st Qu.:40.00  
 Median :7230264   Mode  :character   Mode  :character   Median :55.00  
 Mean   :7230264                                         Mean   :53.39  
 3rd Qu.:7230396                                         3rd Qu.:67.00  
 Max.   :7230527                                         Max.   :91.00  
   ppeduc5            ppeducat           ppgender            ppethm         
 Length:527         Length:527         Length:527         Length:527        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
   pphhsize            ppinc7            ppmarit5           ppmsacat        
 Length:527         Length:527         Length:527         Length:527        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
    ppreg4             pprent            ppstaten           PPWORKA         
 Length:527         Length:527         Length:527         Length:527        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
   ppemploy             Q1_a               Q1_b               Q1_c          
 Length:527         Length:527         Length:527         Length:527        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
     Q1_d               Q1_e               Q1_f                Q2           
 Length:527         Length:527         Length:527         Length:527        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
      Q3                 Q4                 Q5                QPID          
 Length:527         Length:527         Length:527         Length:527        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
    ABCAGE            Contact           weights_pid    
 Length:527         Length:527         Min.   :0.3240  
 Class :character   Class :character   1st Qu.:0.6332  
 Mode  :character   Mode  :character   Median :0.8451  
                                       Mean   :1.0000  
                                       3rd Qu.:1.1516  
                                       Max.   :6.2553  
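For the character columns, `summary()` reports only length, class, and mode, which says little about the responses. One possible way to get more out of them is a frequency table, or converting the column to a factor first. This is a sketch on invented toy data; `toy_poll` and its category values are assumptions, not the real response options.

```r
# Hypothetical stand-in data; the categories are invented for illustration.
toy_poll <- data.frame(
  ppgender = c("Female", "Male", "Female", "Female"),
  ppage = c(34, 58, 71, 45),
  stringsAsFactors = FALSE
)

# Count how often each category occurs, including missing values if any.
gender_counts <- table(toy_poll$ppgender, useNA = "ifany")
gender_counts

# The same counts expressed as proportions.
prop.table(gender_counts)

# After conversion to a factor, summary() reports counts per level.
toy_poll$ppgender <- factor(toy_poll$ppgender)
summary(toy_poll)
```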

Identifying variable names

Code
names(abc_poll_2021)
 [1] "id"              "xspanish"        "complete_status" "ppage"          
 [5] "ppeduc5"         "ppeducat"        "ppgender"        "ppethm"         
 [9] "pphhsize"        "ppinc7"          "ppmarit5"        "ppmsacat"       
[13] "ppreg4"          "pprent"          "ppstaten"        "PPWORKA"        
[17] "ppemploy"        "Q1_a"            "Q1_b"            "Q1_c"           
[21] "Q1_d"            "Q1_e"            "Q1_f"            "Q2"             
[25] "Q3"              "Q4"              "Q5"              "QPID"           
[29] "ABCAGE"          "Contact"         "weights_pid"    
Code
# I tried several ways to get a description of each variable, but the output only shows the short column names, not what they stand for. That is beyond what I can work out today.
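`names()` can only return the short column names stored in the file, so fuller descriptions would have to come from a codebook, which is not part of this post. As a partial substitute, listing the distinct values in each column can hint at what it records. This is a rough sketch on invented toy data; `toy_poll` and its values are hypothetical.

```r
# Hypothetical stand-in data; the values are invented for illustration.
toy_poll <- data.frame(
  ppgender = c("Female", "Male", "Female"),
  Q2 = c("Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

# Number of distinct values in each column.
n_distinct_values <- sapply(toy_poll, function(x) length(unique(x)))
n_distinct_values

# The distinct values themselves, one list element per column.
lapply(toy_poll, unique)
```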

Attempted scatter plot of two variables

Code
plot(abc_poll_2021$ppgender, abc_poll_2021$ppmarit5)
Warning in xy.coords(x, y, xlabel, ylabel, log): NAs introduced by coercion

Warning in xy.coords(x, y, xlabel, ylabel, log): NAs introduced by coercion
Warning in min(x): no non-missing arguments to min; returning Inf
Warning in max(x): no non-missing arguments to max; returning -Inf
Warning in min(x): no non-missing arguments to min; returning Inf
Warning in max(x): no non-missing arguments to max; returning -Inf
Error in plot.window(...): need finite 'xlim' values
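The error comes from the column types: `ppgender` and `ppmarit5` are both character columns, and the default `plot()` method tries to coerce them to numbers, which yields only `NA` values and therefore no finite axis limits. A scatter plot does not suit two categorical variables in any case. One alternative is a cross-tabulation drawn as a grouped bar chart, sketched here on invented toy data (`toy_poll` and its category labels are assumptions, not the real response options).

```r
# Hypothetical stand-in data; the categories are invented for illustration.
toy_poll <- data.frame(
  ppgender = c("Female", "Male", "Female", "Male", "Female"),
  ppmarit5 = c("Married", "Married", "Never married", "Divorced", "Married"),
  stringsAsFactors = FALSE
)

# Cross-tabulate the two categorical variables: rows are gender, columns are marital status.
cross_tab <- table(toy_poll$ppgender, toy_poll$ppmarit5)
cross_tab

# Grouped bar chart: one cluster per marital status, one bar per gender.
barplot(
  cross_tab,
  beside = TRUE,
  legend.text = TRUE,
  xlab = "Marital status (toy categories)",
  ylab = "Count"
)
```

The same pattern should carry over to the real columns, for example `table(abc_poll_2021$ppgender, abc_poll_2021$ppmarit5)`, though the resulting counts are unweighted.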

Code
# Both variables are stored as character strings, so the data needs to be cleaned before it can be plotted. To be continued...