DACSS-601
         educ          age.f          polviews
 12th grade:540   30     :  47   Moderate :713
 4 years   :307   32     :  47   Conserv  :292
 2 years   :261   55     :  47   SlghtCons:268
 1 yr coll :163   42     :  43   Liberal  :244
 11th grade:101   49     :  43   SlghtLib :208
 (Other)   :600   (Other):1742   (Other)  :149
 NA's      :  2   NA's   :   5   NA's     :100
The table above is a summary of the data I’ll be working with in this project.
For my final project, I will be using the ‘gss’ dataset from the ‘poliscidata’ package in R. In Homework 5 I used ‘polviews’, ‘degree’, and ‘sex’ to examine how sex and education relate to a person’s political views. Below I show the best visualization I was able to create for those variables, followed by some visualizations in which I swap ‘sex’ and ‘degree’ for ‘age.f’ (age) and ‘educ’ (highest education level).
Here is the best visualization I was able to construct for ‘degree’, ‘sex’, and ‘polviews’.
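library(poliscidata)  # provides the gss data frame
library(dplyr)
library(ggplot2)
# Keep only the three variables used in this plot
gss_refined <- gss %>%
  select(sex, polviews, degree)
# Jittered points: degree vs. political views, colored by sex
ggplot(gss_refined) +
  geom_jitter(aes(x = degree, y = polviews, color = sex)) +
  labs(x = "Highest Degree Awarded", y = "Political Views")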
Because both variables are categorical, the plot cannot show a linear relationship, but it is noteworthy that, judging by eye alone, the spread of political views looks similar for male and female respondents.
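One way to check this visual impression numerically is a cross-tabulation with row proportions. The sketch below uses a small made-up data frame (`toy` is hypothetical and is not the `gss` data) so that it runs on its own. With the real data, the same calls would presumably take `gss$sex` and `gss$polviews`.

```r
# Toy data standing in for gss (hypothetical values, for illustration only)
toy <- data.frame(
  sex      = factor(c("Male", "Male", "Male", "Male",
                      "Female", "Female", "Female", "Female")),
  polviews = factor(c("Liberal", "Moderate", "Moderate", "Conserv",
                      "Liberal", "Moderate", "Moderate", "Conserv"),
                    levels = c("Liberal", "Moderate", "Conserv"))
)

# Counts of each political view within each sex
counts <- table(toy$sex, toy$polviews)

# Row proportions: the share of each view among men and among women
props <- prop.table(counts, margin = 1)
print(round(props, 2))
```

If the two rows of proportions look alike, that supports the reading of the jitter plot. A chi-squared test (`chisq.test(counts)`) would be a more formal check.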
Now I’m going to present some visualizations with the swapped variables (‘age.f’ and ‘educ’ rather than ‘sex’ and ‘degree’).
      polviews        age.f             educ
 Moderate :713   30     :  47   12th grade:540
 Conserv  :292   32     :  47   4 years   :307
 SlghtCons:268   55     :  47   2 years   :261
 Liberal  :244   42     :  43   1 yr coll :163
 SlghtLib :208   49     :  43   11th grade:101
 (Other)  :149   (Other):1742   (Other)   :600
 NA's     :100   NA's   :   5   NA's      :  2
The visualization below is, in my opinion, the clearest. Each categorical variable is shown (in a large enough space to be seen clearly!) in relation to age (‘age.f’, which is stored as a factor rather than as a number). Respondents from a wide range of education levels cluster noticeably in the politically moderate category. However, based on the colors corresponding to ‘Highest Year of School’, most of these individuals have between an 11th grade education and 2 years of college.
I would also like to emphasize the respondents’ education in different political view categories. Many dots in the “Liberal”, “SlghtLib”, “SlghtCons”, and “Conserv” categories, interestingly, correspond to the highest levels of education (4 years of college plus graduate education).
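# Assumption: gss_refined2 is never defined in the text; this selection matches the summary columns above
gss_refined2 <- gss %>% select(polviews, age.f, educ)
ggplot(gss_refined2) +
  geom_jitter(aes(x = age.f, y = polviews, color = educ), size = 1.5) +
  labs(x = "Respondent Age", y = "Political Views", color = "Highest Year of School") +
  coord_flip()
ggplot(gss_refined2) +
  geom_col(aes(x = polviews, y = age.f, fill = educ)) + coord_flip()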
There is a LOT of information in each of these variables, so I am going to filter them down to make the graphs more readable. I am also simply not a fan of how the visualization above looks (it is messy and not precise enough to read any measurements from). I am going to filter ‘age.f’ so that the range is restricted to 26-45 years old (roughly the millennial age range).
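# Keep ages 26-45; %in% tests membership, whereas == would recycle 26:45 row by row
gss_refined.age <- gss_refined2 %>%
  filter(as.character(age.f) %in% as.character(26:45))
ggplot(gss_refined.age) +
  geom_jitter(aes(x = polviews, y = educ, color = age.f)) +
  coord_flip() + labs(x = "Political Views", y = "Highest Year of School", color = "Age")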
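I also wanted to filter ‘educ’ down to 12th grade through 4 years of college, but that attempt left only one point on the graph. A likely cause is comparing a column against a whole vector of values with ‘==’, which matches element by element instead of testing membership. So, for explanatory purposes, I will be leaving the graph as it appears above.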
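The difference between element-wise comparison and membership testing can be seen on a tiny example. This is a minimal sketch with made-up ages (the vector here is hypothetical and only mimics a factor like ‘age.f’). It is meant to illustrate why a filter written with `==` against a range can silently drop most rows.

```r
# Toy ages stored as a factor, like age.f (hypothetical values)
age.f <- factor(c(26, 30, 27, 50, 45, 26))

# `==` compares element by element, recycling 26:28 along the vector,
# so a row is kept only when its age lines up with the recycled value
# at the same position
keep_eq <- as.character(age.f) == as.character(26:28)

# `%in%` asks, for every row, whether its age appears anywhere in the set
keep_in <- as.character(age.f) %in% as.character(26:28)

sum(keep_eq)  # 1 row kept by ==
sum(keep_in)  # 3 rows kept by %in%
```

Three of the six toy ages fall in the 26-28 range, but `==` keeps only one of them. When the vector lengths do not divide evenly, R also emits a "longer object length is not a multiple of shorter object length" warning.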
An interesting point about this visualization, though, is that the dots representing respondents aged 43-45 do not appear on the graph until the 12th grade marker. Additionally, most of the dots representing individuals younger than 35 are skewed toward the left side of the graph (less than a high school education).
HW Questions
1. I don’t think anything is missing per se, but I think I can either filter things down a bit more or create a couple more visualizations that represent different aspects of my topic (e.g. the average age of respondents who identify as ‘Moderate’).
2. I hope to finish my analysis and get everything into place by submission time.
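The average-age idea from question 1 could be sketched as follows. This is only an illustration on made-up data (`toy` and `age_num` are hypothetical names, not part of the `gss` data). It assumes that every level of ‘age.f’ is a plain number stored as a factor label.

```r
# Toy data standing in for gss (hypothetical values)
toy <- data.frame(
  age.f    = factor(c(30, 40, 50, 20, 60)),
  polviews = factor(c("Moderate", "Moderate", "Moderate", "Liberal", "Conserv"))
)

# age.f is a factor, so convert through character to recover the ages;
# as.numeric() on the factor directly would return level codes instead
toy$age_num <- as.numeric(as.character(toy$age.f))

# Mean age within each political-view category
mean_age <- tapply(toy$age_num, toy$polviews, mean, na.rm = TRUE)
print(mean_age)
```

If any level of ‘age.f’ is a non-numeric label, the conversion would yield `NA` for those rows with a warning, so it is worth inspecting `levels()` of the factor first.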
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Popiela (2022, May 4). Data Analytics and Computational Social Science: HW6. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httprpubscomkpopiela895757/
BibTeX citation
@misc{popiela2022hw6,
  author = {Popiela, Katie},
  title = {Data Analytics and Computational Social Science: HW6},
  url = {https://github.com/DACSS/dacss_course_website/posts/httprpubscomkpopiela895757/},
  year = {2022}
}