Challenge 1

challenge_1

Reading in data and creating a post

Author

Jerin Jacob

Published

December 12, 2022

library(tidyverse)
library(readxl)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Reading Railroad Employees Dataset

# read_csv() comes from readr; haven is not needed to read a CSV file
library(readr)
railroad <- read_csv("_data/railroad_2012_clean_county.csv")

This dataset records railroad employees across 2930 counties in the US in 2012. It has three variables: state, county, and total number of employees.
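The row and column counts quoted above can be checked directly. The sketch below uses a small made-up tibble in place of the real file, so the values are illustrative only, and the column name `total_employees` is an assumption about how the third variable is named.

```r
library(dplyr)

# Toy stand-in for the railroad data (hypothetical values, assumed column names)
toy_railroad <- tibble(
  state = c("AK", "AL", "CA"),
  county = c("ANCHORAGE", "AUTAUGA", "ALAMEDA"),
  total_employees = c(7, 102, 346)
)

# Number of rows (one per county) and number of variables
n_rows <- nrow(toy_railroad)
n_cols <- ncol(toy_railroad)

# Names of the variables
var_names <- colnames(toy_railroad)
```

Running `nrow()`, `ncol()`, and `colnames()` on `railroad` itself would confirm the figures for the real data.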

Describing Railroad Data

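# Open the data in the viewer (useful in interactive sessions only)
view(railroad)

# Count the distinct values in the state column
railroad %>%
  select(state) %>%
  n_distinct()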

[1] 53

railroad %>%
  select(state) %>%
  distinct()

# A tibble: 53 × 1
   state
   <chr>
 1 AE   
 2 AK   
 3 AL   
 4 AP   
 5 AR   
 6 AZ   
 7 CA   
 8 CO   
 9 CT   
10 DC   
# … with 43 more rows

The state column has 53 distinct values, more than the 50 US states, so it must contain codes other than states. Listing the distinct values shows what they are: alongside the states, the column includes DC and armed forces codes such as AE and AP.
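One way to isolate the non-state codes, rather than reading them off the printed tibble, is to compare the column against base R's built-in `state.abb` vector of the 50 state abbreviations. This is a minimal sketch on a made-up tibble, not the original analysis; the rows and the `total_employees` column name are assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the railroad data (hypothetical values, assumed column names)
toy_railroad <- tibble(
  state = c("AE", "AK", "AL", "AP", "DC", "CA"),
  county = c("APO", "ANCHORAGE", "AUTAUGA", "APO", "WASHINGTON DC", "ALAMEDA"),
  total_employees = c(2, 7, 102, 1, 279, 346)
)

# Keep only the codes that are not among the 50 state abbreviations
non_states <- toy_railroad %>%
  distinct(state) %>%
  filter(!state %in% state.abb) %>%
  pull(state)

non_states
```

Applied to `railroad`, the same pipeline would return every value of `state` that is not one of the 50 states.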