Kruzlic HW2 Try 1

DACSS 601 Homework 2 Example

Bryn Kruzlic
2022-02-12

Functions: The libraries needed to work with the dataset are readr (to read the CSV file) and dplyr (to wrangle the data)

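Variables: These are the variables in the railroad dataset

  1. state: the two-letter postal code for each of the 53 rows; these cover U.S. states as well as the District of Columbia (DC) and military postal codes such as AE and AP
  2. total_employees: the total number of employees recorded for each state code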
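
Not every value of state is one of the 50 U.S. states. As a minimal sketch, the codes can be compared with R's built-in state.abb vector. The codes vector below is copied from the first ten rows printed later in this post, and the names codes and non_states are only illustrative.

```r
# First ten state codes, copied from the tibble output in this post.
codes <- c("AE", "AK", "AL", "AP", "AR", "AZ", "CA", "CO", "CT", "DC")

# state.abb ships with base R and holds the 50 state abbreviations
# (it does not include DC or military postal codes).
non_states <- setdiff(codes, state.abb)

non_states
# "AE" "AP" "DC"
```

Running the same comparison on the full state column would list every code in the data that is not one of the 50 states.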

The dataset is read in with read_csv() from the readr package

A ‘tibble’ is the tidyverse version of a data frame; read_csv() returns one

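library(readr)
library(tibble)  # provides as_tibble()
# Read the CSV file; read_csv() already returns a tibble
Data <- read_csv("C:/Users/Bryn Kruzlic/OneDrive/Desktop/DACSS601/railroad_2012_clean_state.csv")
View(Data)  # open the data in the RStudio viewer
as_tibble(Data)  # print the first rows with column types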
# A tibble: 53 x 2
   state total_employees
   <chr>           <dbl>
 1 AE                  2
 2 AK                103
 3 AL               4257
 4 AP                  1
 5 AR               3871
 6 AZ               3153
 7 CA              13137
 8 CO               3650
 9 CT               2592
10 DC                279
# ... with 43 more rows
summary(Data)
    state           total_employees
 Length:53          Min.   :    1  
 Class :character   1st Qu.: 1917  
 Mode  :character   Median : 3379  
                    Mean   : 4819  
                    3rd Qu.: 6092  
                    Max.   :19839  
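
The column types reported above (<chr> and <dbl>) can also be declared explicitly when the file is read, through the col_types argument of read_csv(). The sketch below is an illustration rather than the code used for this homework: it reads three rows copied from the output above as inline text so that it runs without the original CSV file, and the names csv_text and sample_data are hypothetical. Passing literal text with I() assumes readr 2.0 or later.

```r
library(readr)

# Three rows copied from the tibble output above, supplied inline.
csv_text <- "state,total_employees
AE,2
AK,103
AL,4257"

# I() marks the string as literal data rather than a file path.
sample_data <- read_csv(
  I(csv_text),
  col_types = cols(
    state = col_character(),           # two-letter code
    total_employees = col_double()     # employee count
  )
)

sample_data
```

Declaring the types up front makes read_csv() report a parsing problem if a column does not match, instead of silently guessing a different type.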

Perform at least 2 basic data-wrangling operations

library(dplyr)
filter(Data, total_employees > 1000) %>%
  arrange(desc(total_employees))
# A tibble: 42 x 2
   state total_employees
   <chr>           <dbl>
 1 TX              19839
 2 IL              19131
 3 NY              17050
 4 NE              13176
 5 CA              13137
 6 PA              12769
 7 OH               9056
 8 GA               8605
 9 IN               8537
10 MO               8419
# ... with 32 more rows

We use the functions filter() and arrange() to keep only the states with more than 1000 total employees and then sort them in descending order of total_employees.
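
The same two steps can be tried on a handful of rows to see exactly which ones survive the filter. This is a small sketch using five rows copied from the outputs in this post; the names sample_data and result are illustrative only.

```r
library(dplyr)

# Five rows copied from the outputs shown in this post.
sample_data <- tibble(
  state = c("AE", "AK", "CA", "DC", "TX"),
  total_employees = c(2, 103, 13137, 279, 19839)
)

result <- sample_data %>%
  filter(total_employees > 1000) %>%  # keep rows above the threshold
  arrange(desc(total_employees))      # sort largest first

result$state
# "TX" "CA"
```

Only CA and TX exceed 1000 employees, and arrange(desc()) places TX first because it has the larger count.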


Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Kruzlic (2022, Feb. 13). Data Analytics and Computational Social Science: Kruzlic HW2 Try 1. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscombkruzlichw2attempt1/

BibTeX citation

@misc{kruzlic2022kruzlic,
  author = {Kruzlic, Bryn},
  title = {Data Analytics and Computational Social Science: Kruzlic HW2 Try 1},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscombkruzlichw2attempt1/},
  year = {2022}
}