Erin Liu HW2

HW2 created using the Distill format.

Erin Liu
2021-12-29

Read in dataset

I used the dataset “railroad_2012_clean_state - Sheet1.csv” from the “Sample Datasets” section on Google Classroom.

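library(dplyr)  # provides select(), filter(), arrange(), desc(), slice(), and %>%
HW2_data <- read.csv('/Users/erinliu/Downloads/railroad_2012_clean_state - Sheet1.csv', header = TRUE, sep = ',')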
dim(HW2_data)
[1] 53  2

Data types of variables

state: string, the abbreviation of each state

total_employees: numeric, the total number of employees in that state

shape: 53 rows by 2 columns
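
The types and shape listed above can be checked directly in R. The following is a minimal sketch, assuming a small stand-in data frame (`sample_data`, a hypothetical name) built from three rows of the output shown in this post rather than the full CSV file.

```r
# Illustrative stand-in for HW2_data; values are copied from the output in this post.
sample_data <- data.frame(
  state = c("AL", "AR", "AZ"),
  total_employees = c(4257L, 3871L, 3153L),
  stringsAsFactors = FALSE
)

# Class of each column
col_types <- sapply(sample_data, class)
print(col_types)

# Number of rows and columns, in that order
print(dim(sample_data))

# Compact overview of column names, types, and first values
str(sample_data)
```

Whole-number columns are typically stored with class "integer", which R still treats as numeric, so `is.numeric()` returns TRUE for them.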

Data wrangling

1. Select the states where total_employees is greater than 1000

filter(select(HW2_data, state, total_employees), total_employees > 1000)
   state total_employees
1     AL            4257
2     AR            3871
3     AZ            3153
4     CA           13137
5     CO            3650
6     CT            2592
7     DE            1495
8     FL            7419
9     GA            8605
10    IA            4019
11    ID            1563
12    IL           19131
13    IN            8537
14    KS            6092
15    KY            4811
16    LA            3915
17    MA            3379
18    MD            4709
19    MI            3932
20    MN            5467
21    MO            8419
22    MS            2111
23    MT            3327
24    NC            3143
25    ND            2204
26    NE           13176
27    NJ            8329
28    NM            1958
29    NY           17050
30    OH            9056
31    OK            2318
32    OR            2322
33    PA           12769
34    SC            2296
35    TN            4952
36    TX           19839
37    UT            1917
38    VA            7551
39    WA            5222
40    WI            3773
41    WV            3213
42    WY            2876
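
As a cross-check that does not depend on any package, the same row filter can be written with base R indexing. This is only a sketch: `sample_data` and the helper `states_above()` are hypothetical names introduced here, and the four rows are copied from the output tables in this post.

```r
# Illustrative rows; values are copied from the output tables in this post.
sample_data <- data.frame(
  state = c("AL", "CA", "DE", "TX"),
  total_employees = c(4257L, 13137L, 1495L, 19839L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows whose total_employees exceeds `threshold`.
states_above <- function(data, threshold) {
  data[data$total_employees > threshold, c("state", "total_employees"), drop = FALSE]
}

above_1000 <- states_above(sample_data, 1000)
above_5000 <- states_above(sample_data, 5000)
print(above_1000)
print(above_5000)
```

Passing the threshold as an argument makes it easy to try other cutoffs without rewriting the filter condition.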

2. Arrange the data in descending order of total_employees and show the top ten states

arrange(select(HW2_data, state, total_employees), desc(total_employees)) %>%
  slice(1:10)
   state total_employees
1     TX           19839
2     IL           19131
3     NY           17050
4     NE           13176
5     CA           13137
6     PA           12769
7     OH            9056
8     GA            8605
9     IN            8537
10    MO            8419
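
The sort-then-slice pattern above can also be sketched in base R with `order()` and `head()`. The names `sample_data` and `top_n_states()` are hypothetical, and the five rows are copied from the output tables in this post, so this illustrates the technique rather than reproducing the full result.

```r
# Illustrative rows; values are copied from the output tables in this post.
sample_data <- data.frame(
  state = c("AL", "DE", "IL", "NY", "TX"),
  total_employees = c(4257L, 1495L, 19131L, 17050L, 19839L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the n rows with the largest total_employees.
top_n_states <- function(data, n) {
  ranked <- data[order(data$total_employees, decreasing = TRUE), ]
  head(ranked, n)
}

top3 <- top_n_states(sample_data, 3)
print(top3)
```

Within dplyr, `slice_max(HW2_data, total_employees, n = 10)` is a one-step alternative. By default it keeps tied rows, so it can return more than ten rows when values repeat.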

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Liu (2021, Dec. 30). Data Analytics and Computational Social Science: Erin Liu HW2. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomerinliuhw2/

BibTeX citation

@misc{liu2021erin,
  author = {Liu, Erin},
  title = {Data Analytics and Computational Social Science: Erin Liu HW2},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomerinliuhw2/},
  year = {2021}
}