HW2
strikes
Reading in Data
Author

Danny Holt

Published

June 13, 2023

Code
library(tidyverse)
library(readxl)

knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message=FALSE)

Read in a dataset

First, we’ll read in the data from the Excel sheet monthly-listing.xlsx.

Source: https://www.bls.gov/web/wkstp/monthly-listing.xlsx

Code
  #Read in data
  strike <- read_excel("_data/monthly-listing.xlsx", skip=1)
  strike

Clean the data

Next, we’ll clean the data. First, we’ll remove redundant and unnecessary variables. Then, we’ll rename some variables with cumbersome or confusing titles.

Code
  # remove redundant and unnecessary variables
  strike <- strike %>%
    select(
      "Organizations involved",
      "States",
      "Areas",
      "Ownership",
      "Union acronym",
      "Work stoppage beginning date",
      "Work stoppage ending date",
      "Number of workers[2]",
      "Days idle, cumulative for this work stoppage[3]"
    )
  # rename variables
  strike <- strike %>%
    rename(
      "Employer"="Organizations involved",
      "Union"="Union acronym",
      "Start date"="Work stoppage beginning date",
      "End date"="Work stoppage ending date",
      "Workers"="Number of workers[2]",
      "Days struck"="Days idle, cumulative for this work stoppage[3]"
    )
  # view cleaned data
  strike

Provide a narrative about the data set

This data set, published by the Bureau of Labor Statistics, records work stoppages (strikes) from 1993 to the present. Each row describes one work stoppage.

Variables

Employer shows the employer of the striking workers. This variable is categorical.

States shows the state or states in which the strike took place. This variable is categorical.

Areas shows a more specific location of the strike, below the state level. This variable is categorical.

Ownership shows the type of employer: private industry, or local and/or state government. This variable is categorical.

Union shows the acronym of the union to which the striking workers belonged. This variable is categorical.

Start date shows the date on which the strike began. This column contains numerical, discrete, interval data.

End date shows the date on which the strike ended. This column contains numerical, discrete, interval data.

Workers shows the number of workers who went on strike. This column contains numerical, discrete, ratio data.

Days struck shows the total number of days of work lost during the strike, summed across all workers (worker-days). This is distinct from the difference between Start date and End date because, in some strikes, some workers go back to work before the strike ends. This column contains numerical, discrete, ratio data.
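To illustrate the distinction between Days struck and the span between Start date and End date, here is a minimal sketch. The rows in toy_strike are hypothetical values made up for illustration, not real BLS figures, and the helper columns calendar_days and days_per_worker are names introduced here. It assumes the date columns are stored as dates; in the real sheet, check how read_excel parsed them first.

```r
library(dplyr)
library(tibble)

# Hypothetical toy rows that mimic the cleaned column names (not real BLS data)
toy_strike <- tibble(
  Employer      = c("Employer A", "Employer B"),
  `Start date`  = as.Date(c("2020-01-06", "2021-03-01")),
  `End date`    = as.Date(c("2020-01-10", "2021-03-05")),
  Workers       = c(1000, 2000),
  `Days struck` = c(5000, 6000)
)

toy_strike <- toy_strike %>%
  mutate(
    # Calendar length of the stoppage, counting both the first and the last day
    calendar_days = as.numeric(
      difftime(`End date`, `Start date`, units = "days")
    ) + 1,
    # Average number of days each worker was idle
    days_per_worker = `Days struck` / Workers
  )

toy_strike
```

In the second toy row, days_per_worker is smaller than calendar_days, which is the pattern we would expect when some workers return before the strike ends. Note that calendar_days also counts weekends, so for longer strikes the two measures may differ for that reason as well.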

Some more potential research questions

Have the average strike size and length changed over time?

How have strikes changed in length, frequency, and magnitude (number of workers and days struck) within the private sector and within the public sector, viewed separately?

Are there significant geographic trends?

Do large strikes in an area or industry tend to be followed by more strikes in that area or industry?
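As a starting point for the first three questions, the summaries could be sketched roughly as follows. This is only an outline under stated assumptions: toy_strike holds hypothetical rows rather than real BLS records, the Ownership labels are placeholders, and the code assumes that multi-state strikes list their states separated by commas, which should be verified against the actual States column.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical toy rows that mimic the cleaned column names (not real BLS data)
toy_strike <- tibble(
  States        = c("CA", "CA, OR", "NY", "NY"),
  Ownership     = c("Private industry", "Private industry",
                    "Local government", "Private industry"),
  `Start date`  = as.Date(c("2019-02-04", "2019-09-16",
                            "2019-10-21", "2020-01-13")),
  Workers       = c(1000, 3000, 2000, 4000),
  `Days struck` = c(5000, 9000, 4000, 8000)
)

# Strike count, average size, and average days struck per year and ownership type
yearly_summary <- toy_strike %>%
  mutate(year = as.integer(format(`Start date`, "%Y"))) %>%
  group_by(year, Ownership) %>%
  summarise(
    strikes          = n(),
    mean_workers     = mean(Workers),
    mean_days_struck = mean(`Days struck`),
    .groups = "drop"
  )

# Strikes per state; a multi-state strike is counted once in each state
# (assumes states are separated by commas, which is a guess about the format)
state_counts <- toy_strike %>%
  separate_rows(States, sep = ",\\s*") %>%
  count(States, name = "strikes")

yearly_summary
state_counts
```

Grouping by Ownership keeps the private and public sectors separate, as the second question asks. Splitting States into one row per state is a design choice: it inflates the total strike count, so state_counts should be read per state rather than summed.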