Reading in the railroad dataset
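# readxl provides read_excel(); dplyr provides select(), used in the cleaning step
library(readxl)
library(dplyr)
# Skip the first three rows of the sheet so that the fourth row is read as the column headers
State_county <- read_excel("../../_data/StateCounty2012.xls", skip=3)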
The data has been read in. Let's see what it looks like:
head(State_county)
# A tibble: 6 × 5
STATE ...2 COUNTY ...4 TOTAL
<chr> <lgl> <chr> <lgl> <dbl>
1 AE NA APO NA 2
2 AE Total1 NA <NA> NA 2
3 AK NA ANCHORAGE NA 7
4 AK NA FAIRBANKS NORTH STAR NA 2
5 AK NA JUNEAU NA 3
6 AK NA MATANUSKA-SUSITNA NA 2
tail(State_county)
# A tibble: 6 × 5
STATE ...2 COUNTY ...4 TOTAL
<chr> <lgl> <chr> <lgl> <dbl>
1 <NA> NA <NA> NA NA
2 CANADA NA <NA> NA 662
3 <NA> NA <NA> NA NA
4 1 Military designation. NA <NA> NA NA
5 <NA> NA <NA> NA NA
6 NOTE: Excludes 2,896 employees without an… NA <NA> NA NA
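One way to quantify the gaps visible in this output is to count the missing values in each column. The following is a minimal sketch on a toy data frame, not the full dataset: the rows are copied from the output above, and the column names `X2` and `X4` are stand-ins (an assumption) for the unnamed columns `...2` and `...4`.

```r
# Toy rows copied from the head() and tail() output above
raw <- data.frame(
  STATE  = c("AE", "AK", "CANADA", NA),
  X2     = NA,   # hypothetical name standing in for the empty column `...2`
  COUNTY = c("APO", "ANCHORAGE", NA, NA),
  X4     = NA,   # hypothetical name standing in for the empty column `...4`
  TOTAL  = c(2, 7, 662, NA),
  stringsAsFactors = FALSE
)

# Number of missing values in each column
na_counts <- colSums(is.na(raw))
print(na_counts)

# Columns that are missing in every row carry no information
empty_cols <- names(raw)[na_counts == nrow(raw)]
print(empty_cols)
```

Columns that are entirely `NA` can be dropped outright, while partially missing columns point to rows (totals, blank lines, footnotes) that are not county-level records.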
The data needs cleaning before it can be analyzed: columns 2 and 4 are empty, and the state total rows, blank rows, and trailing footnotes all contain missing values.
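# Drop the two empty columns (...2 and ...4) by position
State_county <- select(State_county, -c(2, 4))
# Keep only rows with no missing values, which removes totals, blank rows, and footnotes
State_county <- State_county[complete.cases(State_county),]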
head(State_county)
# A tibble: 6 × 3
STATE COUNTY TOTAL
<chr> <chr> <dbl>
1 AE APO 2
2 AK ANCHORAGE 7
3 AK FAIRBANKS NORTH STAR 2
4 AK JUNEAU 3
5 AK MATANUSKA-SUSITNA 2
6 AK SITKA 1
tail(State_county)
# A tibble: 6 × 3
STATE COUNTY TOTAL
<chr> <chr> <dbl>
1 WY SHERIDAN 252
2 WY SUBLETTE 3
3 WY SWEETWATER 196
4 WY UINTA 49
5 WY WASHAKIE 10
6 WY WESTON 37
This looks much better!
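The two cleaning steps above can be sketched on a small toy data frame to show why their order matters. This is an illustration under stated assumptions, not the original analysis: the county rows are copied from the output above, while the column names `X2` and `X4` and the shortened footnote text are hypothetical stand-ins.

```r
library(dplyr)

# Toy data mimicking the raw import: two empty columns, a state-total row,
# and a footnote row
raw <- data.frame(
  STATE  = c("AE", "AE Total", "AK", "AK", "NOTE: footnote text"),
  X2     = NA,   # hypothetical name standing in for `...2`
  COUNTY = c("APO", NA, "ANCHORAGE", "JUNEAU", NA),
  X4     = NA,   # hypothetical name standing in for `...4`
  TOTAL  = c(2, 2, 7, 3, NA),
  stringsAsFactors = FALSE
)

# Applied to the raw data, complete.cases() rejects every row,
# because the two empty columns are NA everywhere
rows_kept_if_filtered_first <- sum(complete.cases(raw))

# Step 1: drop the empty columns by position
trimmed <- select(raw, -c(2, 4))

# Step 2: keep only rows with no missing values
clean <- trimmed[complete.cases(trimmed), ]

print(rows_kept_if_filtered_first)
print(clean)
```

Dropping the empty columns first is the important design choice here: filtering on complete cases before removing them would leave an empty data frame.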
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Mohit-Arora (2021, Aug. 17). DACSS 601 August 2021: Railroad Employment data. Retrieved from https://mrolfe.github.io/DACSS601August2021/posts/2021-08-16-railroad-employment-data/
BibTeX citation
@misc{mohit-arora2021railroad,
  author = {Mohit-Arora, },
  title = {DACSS 601 August 2021: Railroad Employment data},
  url = {https://mrolfe.github.io/DACSS601August2021/posts/2021-08-16-railroad-employment-data/},
  year = {2021}
}