Reading in and Wrangling Data
str(australian_marriage_data)
spec_tbl_df [16 × 4] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ territory: chr [1:16] "New South Wales" "New South Wales" "Victoria" "Victoria" ...
$ resp : chr [1:16] "yes" "no" "yes" "no" ...
$ count : num [1:16] 2374362 1736838 2145629 1161098 1487060 ...
$ percent : num [1:16] 57.8 42.2 64.9 35.1 60.7 39.3 62.5 37.5 63.7 36.3 ...
- attr(*, "spec")=
.. cols(
.. territory = col_character(),
.. resp = col_character(),
.. count = col_double(),
.. percent = col_double()
.. )
- attr(*, "problems")=<externalptr>
“australian_marriage_data” has 16 rows and 4 variables. The first variable (territory) is a character giving the respondent's territory. The second variable (resp) is the response, either “yes” or “no”; it is also a character. The third variable (count) is numeric and gives the number of “yes” or “no” responses in that territory. The fourth variable (percent) is also numeric and gives the percentage of that territory's respondents who answered “yes” or “no.”
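The str() output above carries a readr column specification, which suggests the data were read with read_csv(). The read call itself is not shown, so what follows is only a sketch under that assumption: it parses a four-row inline excerpt (values copied from the output above) instead of the real file, and the names csv_text and marriage_excerpt are hypothetical.

```r
library(readr)

# Inline excerpt standing in for the real file, whose path is not shown in the post.
csv_text <- "territory,resp,count,percent
New South Wales,yes,2374362,57.8
New South Wales,no,1736838,42.2
Victoria,yes,2145629,64.9
Victoria,no,1161098,35.1"

# I() marks the string as literal data rather than a file path (readr 2.0 or later).
marriage_excerpt <- read_csv(
  I(csv_text),
  col_types = cols(
    territory = col_character(),
    resp = col_character(),
    count = col_double(),
    percent = col_double()
  )
)

# Inspect the column types, as done for the full data set above.
str(marriage_excerpt)
```

Spelling out col_types keeps the column classes fixed instead of relying on readr's type guessing.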
Filter the “yes” responses from “australian_marriage_data”:
filter(australian_marriage_data, `resp` == "yes")
# A tibble: 8 × 4
territory resp count percent
<chr> <chr> <dbl> <dbl>
1 New South Wales yes 2374362 57.8
2 Victoria yes 2145629 64.9
3 Queensland yes 1487060 60.7
4 South Australia yes 592528 62.5
5 Western Australia yes 801575 63.7
6 Tasmania yes 191948 63.6
7 Northern Territory(b) yes 48686 60.6
8 Australian Capital Territory(c) yes 175459 74
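The next table lists the same eight “yes” rows ordered by count, largest first, but the call that produced it is not shown. A minimal sketch that gives that ordering, assuming dplyr's arrange() with desc() was used; yes_rows and by_count are hypothetical names, and the values are copied from the table above.

```r
library(dplyr)

# The eight "yes" rows, copied from the filtered output above.
yes_rows <- tibble::tribble(
  ~territory,                        ~resp, ~count,  ~percent,
  "New South Wales",                 "yes", 2374362, 57.8,
  "Victoria",                        "yes", 2145629, 64.9,
  "Queensland",                      "yes", 1487060, 60.7,
  "South Australia",                 "yes", 592528,  62.5,
  "Western Australia",               "yes", 801575,  63.7,
  "Tasmania",                        "yes", 191948,  63.6,
  "Northern Territory(b)",           "yes", 48686,   60.6,
  "Australian Capital Territory(c)", "yes", 175459,  74
)

# Sort by the number of "yes" responses, largest first.
by_count <- yes_rows %>%
  arrange(desc(count))

by_count
```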
# A tibble: 8 × 4
territory resp count percent
<chr> <chr> <dbl> <dbl>
1 New South Wales yes 2374362 57.8
2 Victoria yes 2145629 64.9
3 Queensland yes 1487060 60.7
4 Western Australia yes 801575 63.7
5 South Australia yes 592528 62.5
6 Tasmania yes 191948 63.6
7 Australian Capital Territory(c) yes 175459 74
8 Northern Territory(b) yes 48686 60.6
filter(australian_marriage_data, `resp` == "yes") %>%
  select(territory, resp, percent) %>%
  arrange(desc(percent))
# A tibble: 8 × 3
territory resp percent
<chr> <chr> <dbl>
1 Australian Capital Territory(c) yes 74
2 Victoria yes 64.9
3 Western Australia yes 63.7
4 Tasmania yes 63.6
5 South Australia yes 62.5
6 Queensland yes 60.7
7 Northern Territory(b) yes 60.6
8 New South Wales yes 57.8
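The six-row table that follows omits Australian Capital Territory (74) and New South Wales (57.8), the territories with the highest and lowest “yes” percentages. The code that produced it is not shown; one way to get the same rows, assuming the extremes were dropped by comparing against max() and min(), is sketched here. The names yes_rows and trimmed are hypothetical.

```r
library(dplyr)

# The eight "yes" rows, copied from the filtered output earlier in the post.
yes_rows <- tibble::tribble(
  ~territory,                        ~resp, ~count,  ~percent,
  "New South Wales",                 "yes", 2374362, 57.8,
  "Victoria",                        "yes", 2145629, 64.9,
  "Queensland",                      "yes", 1487060, 60.7,
  "South Australia",                 "yes", 592528,  62.5,
  "Western Australia",               "yes", 801575,  63.7,
  "Tasmania",                        "yes", 191948,  63.6,
  "Northern Territory(b)",           "yes", 48686,   60.6,
  "Australian Capital Territory(c)", "yes", 175459,  74
)

# Drop the rows holding the largest and smallest percent, then sort descending.
trimmed <- yes_rows %>%
  filter(percent != max(percent), percent != min(percent)) %>%
  arrange(desc(percent))

trimmed
```

Both conditions are evaluated against the full column before any row is removed, so exactly the two extreme rows are dropped here.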
# A tibble: 6 × 4
territory resp count percent
<chr> <chr> <dbl> <dbl>
1 Victoria yes 2145629 64.9
2 Western Australia yes 801575 63.7
3 Tasmania yes 191948 63.6
4 South Australia yes 592528 62.5
5 Queensland yes 1487060 60.7
6 Northern Territory(b) yes 48686 60.6
# percent_married is not defined in this section; it is assumed to hold the "yes" percentages.
boxplot(percent_married,
        horizontal = TRUE,
        main = "Boxplot: Australian Marriage Data: 'yes'",
        ylab = "Territories",
        xlab = "Percent of respondents who answered 'yes'",
        col = "blue")
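The boxplot call depends on an object whose definition is not shown. A minimal sketch, assuming it is simply the vector of the eight “yes” percentages from the tables above (that definition is a guess, not the author's code); boxplot.stats() is added to show which values the plot would draw as outliers.

```r
# Assumed definition: the "yes" percentages, in the order of the filtered table above.
percent_married <- c(57.8, 64.9, 60.7, 62.5, 63.7, 63.6, 60.6, 74)

# Horizontal boxplot of the eight territory-level percentages.
boxplot(percent_married,
        horizontal = TRUE,
        main = "Boxplot: Australian Marriage Data: 'yes'",
        xlab = "Percent of respondents who answered 'yes'",
        col = "blue")

# Five-number summary and any points beyond 1.5 times the hinge spread.
box_summary <- boxplot.stats(percent_married)
box_summary$stats
box_summary$out
```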
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Farrell (2022, Feb. 9). Data Analytics and Computational Social Science: Homework 2. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomjnfarrell211864086/
BibTeX citation
@misc{farrell2022homework,
  author = {Farrell, Joseph},
  title = {Data Analytics and Computational Social Science: Homework 2},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomjnfarrell211864086/},
  year = {2022}
}