challenge_1
railroad_2012_clean_county.csv
Author

Shoshana Buck

Published

August 16, 2022

Code
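library(tidyverse)  # attaches readr, dplyr, and the other core tidyverse packages
library(readr)      # already attached by tidyverse; kept here to be explicit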

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Challenge Overview

Read in the Data

Code
railroad <- read_csv("_data/railroad_2012_clean_county.csv")
railroad
# A tibble: 2,930 × 3
   state county               total_employees
   <chr> <chr>                          <dbl>
 1 AE    APO                                2
 2 AK    ANCHORAGE                          7
 3 AK    FAIRBANKS NORTH STAR               2
 4 AK    JUNEAU                             3
 5 AK    MATANUSKA-SUSITNA                  2
 6 AK    SITKA                              1
 7 AK    SKAGWAY MUNICIPALITY              88
 8 AL    AUTAUGA                          102
 9 AL    BALDWIN                          143
10 AL    BARBOUR                            1
# … with 2,920 more rows
# ℹ Use `print(n = ...)` to see more rows

The data has 2,930 rows and three columns: state, county, and total_employees.
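The row and column counts can also be checked directly instead of being read off the printed tibble. The following is a minimal sketch, assuming a hand-typed stand-in called railroad_sample (a hypothetical name, not part of the original post) built from the first ten rows printed above, so that it runs without the CSV file. On the full railroad object the same calls should report 2,930 rows and 3 columns.

```r
library(tibble)

# Hypothetical stand-in: the first ten rows of the printed data, typed by hand
railroad_sample <- tribble(
  ~state, ~county,                ~total_employees,
  "AE",   "APO",                    2,
  "AK",   "ANCHORAGE",              7,
  "AK",   "FAIRBANKS NORTH STAR",   2,
  "AK",   "JUNEAU",                 3,
  "AK",   "MATANUSKA-SUSITNA",      2,
  "AK",   "SITKA",                  1,
  "AK",   "SKAGWAY MUNICIPALITY",  88,
  "AL",   "AUTAUGA",              102,
  "AL",   "BALDWIN",              143,
  "AL",   "BARBOUR",                1
)

dim(railroad_sample)    # number of rows, then number of columns
nrow(railroad_sample)   # rows only
ncol(railroad_sample)   # columns only
names(railroad_sample)  # column names
```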

Describe the data

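I imported railroad_2012_clean_county.csv and stored it as railroad. I then used colnames() to list the three column names: “state”, “county”, and “total_employees”. From there I used spec() to check the column types: state and county are character columns and total_employees is a double. I used head() to preview the first six rows. Finally, I piped the data into group_by() and summarise() to get the total number of employees in each state. The result has 53 rows because the state column includes codes such as AE, AP, and DC in addition to state abbreviations.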

Code
colnames(railroad)
[1] "state"           "county"          "total_employees"
Code
spec(railroad)
cols(
  state = col_character(),
  county = col_character(),
  total_employees = col_double()
)
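The specification that spec() reports can also be supplied up front through the col_types argument of read_csv(), so the column types are stated rather than guessed. This is only a sketch under assumptions: it reads three rows typed inline with I() (literal data, which needs readr 2.0 or later) instead of the real file, and the object names csv_text and typed are hypothetical.

```r
library(readr)

# Hypothetical inline data: three rows copied from the printed output
csv_text <- "state,county,total_employees
AE,APO,2
AK,ANCHORAGE,7
AL,AUTAUGA,102
"

# State the expected type of each column explicitly
typed <- read_csv(
  I(csv_text),
  col_types = cols(
    state = col_character(),
    county = col_character(),
    total_employees = col_double()
  )
)

spec(typed)  # reports the specification that was supplied above
```

With the real file, the same col_types argument could be passed alongside the path used in the post.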
Code
head(railroad)
# A tibble: 6 × 3
  state county               total_employees
  <chr> <chr>                          <dbl>
1 AE    APO                                2
2 AK    ANCHORAGE                          7
3 AK    FAIRBANKS NORTH STAR               2
4 AK    JUNEAU                             3
5 AK    MATANUSKA-SUSITNA                  2
6 AK    SITKA                              1
Code
railroad %>%
  group_by(state) %>%
  summarise(total_employees2 = sum(total_employees))
# A tibble: 53 × 2
   state total_employees2
   <chr>            <dbl>
 1 AE                   2
 2 AK                 103
 3 AL                4257
 4 AP                   1
 5 AR                3871
 6 AZ                3153
 7 CA               13137
 8 CO                3650
 9 CT                2592
10 DC                 279
# … with 43 more rows
# ℹ Use `print(n = ...)` to see more rows
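A natural follow-up to the per-state totals is to sort them so the largest values come first. The sketch below assumes a hand-typed stand-in (railroad_sample, a hypothetical name) containing only the first ten rows printed earlier, so the AL total it produces is partial; AE and AK are complete in those rows. The state_totals name is also an assumption.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in: the first ten rows of the printed data, typed by hand
railroad_sample <- tribble(
  ~state, ~county,                ~total_employees,
  "AE",   "APO",                    2,
  "AK",   "ANCHORAGE",              7,
  "AK",   "FAIRBANKS NORTH STAR",   2,
  "AK",   "JUNEAU",                 3,
  "AK",   "MATANUSKA-SUSITNA",      2,
  "AK",   "SITKA",                  1,
  "AK",   "SKAGWAY MUNICIPALITY",  88,
  "AL",   "AUTAUGA",              102,
  "AL",   "BALDWIN",              143,
  "AL",   "BARBOUR",                1
)

# Sum employees per state, then sort from largest to smallest total
state_totals <- railroad_sample %>%
  group_by(state) %>%
  summarise(total_employees2 = sum(total_employees)) %>%
  arrange(desc(total_employees2))

state_totals
```

Applied to the full railroad data, the same pipeline would rank all 53 state codes; print(n = 53) would show every row.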