Homework 3, DACSS 601
#1. Identify the dataset you will be using for the final project.
I will use speed data from INRIX, a roadway analytics company. INRIX collects data from connected vehicles and delivers anonymized speed data aggregated to specific time intervals at predetermined locations.
This data is commonly used to monitor a transportation network and identify locations where congestion is occurring in real time or is a recurring problem.
This data can be combined with TMC (Traffic Message Channel) locations for mapping.
I have an archived copy of the TMC shapefile for mapping and determining roadway types (Interstate, US Route, State Route, Local). I also have access to a year of hourly speed data for each location.
library(readr)  # provides read_csv()
tbl_IA_1HR_Speed_AVG_INRIX <- read_csv(unz('G:/My Drive/School/UMASS/DACSS/DACSS_601/Final Project/Data/Bottlenecks_05062020.zip', 'Bottlenecks_05062020.csv'))
head(tbl_IA_1HR_Speed_AVG_INRIX)
# A tibble: 6 x 8
  tmc_code  measurement_tstamp  speed average_speed reference_speed
  <chr>     <dttm>              <dbl>         <dbl>           <dbl>
1 118+11507 2018-01-01 02:00:00  53.5            55              55
2 118+11507 2018-01-01 04:00:00  48.2            55              55
3 118+11507 2018-01-01 05:00:00  30              53              55
4 118+11507 2018-01-01 07:00:00  55.4            55              55
5 118+11507 2018-01-01 08:00:00  49.5            56              55
6 118+11507 2018-01-01 09:00:00  47.7            55              55
# ... with 3 more variables: travel_time_minutes <dbl>,
#   confidence_score <dbl>, cvalue <dbl>
tmc_code: the TMC location where the speed is recorded
measurement_tstamp: the date and time that the measurement was recorded
speed: average 1-hour speed
average_speed: average speed within a 24 hr period
reference_speed: recorded speeds on TMC between 10PM and 6AM (uncongested)
travel_time_minutes: average time a vehicle spent traversing the TMC during the specified time period
confidence_score: indicates the source of a data record: 20 (historical) or 30 (real-time)
cvalue: the confidence of the reading
#2. Read in/clean the dataset.
A TMC can be said to be ‘bottlenecked’ when the travel speed (speed) falls to 60 percent of the free-flow speed (reference_speed) or lower.
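The bottleneck rule above can be sketched in a few lines of base R. This is a minimal illustration rather than the final cleaning step: it reuses the six rows shown in the head() output, treats reference_speed as the free-flow speed, and assumes the threshold is "at or below 60 percent". The names bottleneck_ratio, speed_ratio, and bottleneck are introduced here for illustration only.

```r
# First six rows from the head() output above: observed hourly speed
# and the free-flow (reference) speed for one TMC.
speeds <- data.frame(
  tmc_code        = rep("118+11507", 6),
  speed           = c(53.5, 48.2, 30, 55.4, 49.5, 47.7),
  reference_speed = rep(55, 6)
)

# Assumed threshold: at or below 60 percent of free-flow speed.
bottleneck_ratio <- 0.60

# Ratio of observed speed to free-flow speed, then the logical flag.
speeds$speed_ratio <- speeds$speed / speeds$reference_speed
speeds$bottleneck  <- speeds$speed_ratio <= bottleneck_ratio

# Show only the hours flagged as bottlenecked.
speeds[speeds$bottleneck, ]
```

In this sample only the 05:00 record (30 mph against a 55 mph reference) falls under the threshold. The same two assignments should apply unchanged to the full tbl_IA_1HR_Speed_AVG_INRIX table, since they only use the speed and reference_speed columns.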
We also must import the TMC shapefile:
library(sf)  # provides st_read()
shp_IA_TMC <- st_read('G:/My Drive/School/UMASS/DACSS/DACSS_601/Final Project/Data/Iowa_2018_TMC_shapefile/Iowa.shp')
Reading layer `Iowa' from data source
`G:\My Drive\School\UMASS\DACSS\DACSS_601\Final Project\Data\Iowa_2018_TMC_shapefile\Iowa.shp'
using driver `ESRI Shapefile'
Simple feature collection with 4862 features and 36 fields
Geometry type: MULTILINESTRING
Dimension: XY
Bounding box: xmin: -96.61288 ymin: 40.38535 xmax: -90.17986 ymax: 43.50076
Geodetic CRS: WGS 84
plot(shp_IA_TMC, col = "blue", main = "TMC Locations in Iowa", max.plot = 1)
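With both sources loaded, the speed records could be linked to the TMC attributes by TMC code. The sketch below uses a plain data frame as a stand-in for the shapefile's attribute table, because the actual field names among the 36 fields in Iowa.shp are not shown here. The tmc_attrs table, its column names, the second TMC code, and the road type values are all hypothetical placeholders.

```r
# Hypothetical stand-in for the shapefile's attribute table. The real
# field names in Iowa.shp may differ, and the road types below are
# placeholders, not actual classifications.
tmc_attrs <- data.frame(
  tmc_code  = c("118+11507", "118+11508"),
  road_type = c("Interstate", "US Route")
)

# A few hourly speed records keyed by tmc_code (from the head() output).
speeds <- data.frame(
  tmc_code        = rep("118+11507", 3),
  speed           = c(53.5, 48.2, 30),
  reference_speed = rep(55, 3)
)

# all.x = TRUE keeps every speed record, even when a TMC code has no
# match in the attribute table (road_type would then be NA).
joined <- merge(speeds, tmc_attrs, by = "tmc_code", all.x = TRUE)
joined
```

If the key column has a different name in the shapefile, merge() accepts by.x and by.y so the two sides can be matched without renaming.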
#3. Identify potential research questions that your dataset can help answer.
Which routes in Iowa have significantly worse congestion than similar routes of the same type (Interstate, US, IA, local)?
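One possible first pass at this question is to summarize each TMC by its share of bottlenecked hours and then standardize that share within its road type. The sketch below is only a screening step, not a significance test, and every value in it is invented for illustration: the tmc_summary table, its TMC identifiers, and the bottleneck_share column are hypothetical.

```r
# Hypothetical per-TMC summary: share of hours flagged as bottlenecked.
# All identifiers and values are invented for illustration.
tmc_summary <- data.frame(
  tmc_code         = c("A1", "A2", "A3", "B1", "B2", "B3"),
  road_type        = c(rep("Interstate", 3), rep("US Route", 3)),
  bottleneck_share = c(0.02, 0.03, 0.10, 0.05, 0.06, 0.07)
)

# Mean and standard deviation within each road type; ave() returns a
# vector aligned with the original rows.
type_mean <- ave(tmc_summary$bottleneck_share, tmc_summary$road_type, FUN = mean)
type_sd   <- ave(tmc_summary$bottleneck_share, tmc_summary$road_type, FUN = sd)

# Standardized distance from the road-type average.
tmc_summary$z_within_type <- (tmc_summary$bottleneck_share - type_mean) / type_sd

# Sort so the TMCs that stand out most within their type come first.
tmc_summary[order(-tmc_summary$z_within_type), ]
```

Standardizing within road type keeps an Interstate segment from being compared against local roads, which matches the "similar routes of the same type" framing of the question. A formal test would still be needed to decide whether a difference is statistically significant.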
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Meade (2022, Feb. 16). Data Analytics and Computational Social Science: Homework 3. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscommeade68867027/
BibTeX citation
@misc{meade2022homework,
  author = {Meade, Justin},
  title = {Data Analytics and Computational Social Science: Homework 3},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscommeade68867027/},
  year = {2022}
}