HW3

DACSS 601 Data Science Fundamentals - Homework 3

Apoorva Hungund
2022-02-14
My research focuses on human factors in driving and on driver safety. The dataset most relevant to my study is the 'Seatbelts' 
data, which provides monthly counts for measuring differences in driving deaths over the years. The main aim was to understand the 
change in behavior after seatbelt legislation was introduced on January 31st 1983. The dataset was officially commissioned by the 
Department of Transport in 1984 and covers a period of 16 years (January 1969 to December 1984). Since it's almost 40 years old, 
we'll have to clean up the data and make it a bit easier to read.

My analysis will address three questions:
1) Is there a change in driving deaths once the legislation was established?

2) Is there a change in front seat passengers injured or killed once the legislation was established?

3) Is there a change in rear seat passengers injured or killed once the legislation was established?
data(Seatbelts)
# Add calendar year and month columns derived from the time-series index
Seatbelts <- data.frame(years = floor(time(Seatbelts)), months = factor(cycle(Seatbelts), labels = month.abb), Seatbelts)
head(Seatbelts)
  years months DriversKilled drivers front rear   kms PetrolPrice
1  1969    Jan           107    1687   867  269  9059   0.1029718
2  1969    Feb            97    1508   825  265  7685   0.1023630
3  1969    Mar           102    1507   806  319  9963   0.1020625
4  1969    Apr            87    1385   814  407 10955   0.1008733
5  1969    May           119    1632   991  454 11823   0.1010197
6  1969    Jun           106    1511   945  427 12391   0.1005812
  VanKilled law
1        12   0
2         6   0
3        12   0
4         8   0
5        10   0
6        13   0
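
The `years` and `months` columns come from two base R helpers in the `stats` package: `time()` returns each observation's position as a fractional year, and `cycle()` returns its position within the year (1 to 12 for monthly data). A minimal sketch of what they return, using a separate copy named `sb_ts` (a name introduced here for illustration) so the data frame built above is left untouched:

```r
# Work on a fresh copy of the built-in monthly time series
sb_ts <- datasets::Seatbelts

# time() gives fractional years, one value per month
head(as.numeric(time(sb_ts)), 3)

# floor() drops the fractional part, leaving the calendar year
yrs <- as.integer(floor(time(sb_ts)))

# cycle() gives the month index 1..12; month.abb supplies "Jan".."Dec"
mon <- factor(as.integer(cycle(sb_ts)), levels = 1:12, labels = month.abb)

head(data.frame(years = yrs, months = mon), 3)
```

Converting to plain integers before building the factor is a design choice made here to keep the result free of time-series attributes; the original one-liner works without it.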
dim(Seatbelts)
[1] 192  10
summary(Seatbelts)
     years          months   DriversKilled      drivers    
 Min.   :1969   Jan    :16   Min.   : 60.0   Min.   :1057  
 1st Qu.:1973   Feb    :16   1st Qu.:104.8   1st Qu.:1462  
 Median :1976   Mar    :16   Median :118.5   Median :1631  
 Mean   :1976   Apr    :16   Mean   :122.8   Mean   :1670  
 3rd Qu.:1980   May    :16   3rd Qu.:138.0   3rd Qu.:1851  
 Max.   :1984   Jun    :16   Max.   :198.0   Max.   :2654  
                (Other):96                                 
     front             rear            kms         PetrolPrice     
 Min.   : 426.0   Min.   :224.0   Min.   : 7685   Min.   :0.08118  
 1st Qu.: 715.5   1st Qu.:344.8   1st Qu.:12685   1st Qu.:0.09258  
 Median : 828.5   Median :401.5   Median :14987   Median :0.10448  
 Mean   : 837.2   Mean   :401.2   Mean   :14994   Mean   :0.10362  
 3rd Qu.: 950.8   3rd Qu.:456.2   3rd Qu.:17202   3rd Qu.:0.11406  
 Max.   :1299.0   Max.   :646.0   Max.   :21626   Max.   :0.13303  
                                                                   
   VanKilled           law        
 Min.   : 2.000   Min.   :0.0000  
 1st Qu.: 6.000   1st Qu.:0.0000  
 Median : 8.000   Median :0.0000  
 Mean   : 9.057   Mean   :0.1198  
 3rd Qu.:12.000   3rd Qu.:0.0000  
 Max.   :17.000   Max.   :1.0000  
                                  
## Next, we'll separate the dataset into two data frames: before and after the legislation was established.
## The law column is left out of the selection because it is constant within each subset.
Law_Active <- subset(Seatbelts, law == 1, select = c(years:VanKilled))
Law_Inactive <- subset(Seatbelts, law == 0, select = c(years:VanKilled))
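
As a cross-check on the two `subset()` calls, base R's `split()` can produce both groups in one step and confirm how many months fall on each side of the law. This is only a sketch: `sb` and `by_law` are names introduced here, and the data are re-read from `datasets::Seatbelts` so the example runs on its own.

```r
sb <- as.data.frame(datasets::Seatbelts)

# split() returns a list with one data frame per value of law ("0" and "1")
by_law <- split(sb, sb$law)

# Number of monthly observations before (0) and after (1) the legislation
sapply(by_law, nrow)
```

The two counts should agree with the `dim()` results for `Law_Inactive` and `Law_Active`.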

summary(Law_Inactive)
     years          months   DriversKilled      drivers    
 Min.   :1969   Jan    :15   Min.   : 79.0   Min.   :1309  
 1st Qu.:1972   Feb    :14   1st Qu.:108.0   1st Qu.:1511  
 Median :1976   Mar    :14   Median :121.0   Median :1653  
 Mean   :1976   Apr    :14   Mean   :125.9   Mean   :1718  
 3rd Qu.:1979   May    :14   3rd Qu.:140.0   3rd Qu.:1926  
 Max.   :1983   Jun    :14   Max.   :198.0   Max.   :2654  
                (Other):84                                 
     front             rear            kms         PetrolPrice     
 Min.   : 567.0   Min.   :224.0   Min.   : 7685   Min.   :0.08118  
 1st Qu.: 767.0   1st Qu.:344.0   1st Qu.:12387   1st Qu.:0.09078  
 Median : 860.0   Median :401.0   Median :14455   Median :0.10273  
 Mean   : 873.5   Mean   :400.3   Mean   :14463   Mean   :0.10187  
 3rd Qu.: 986.0   3rd Qu.:454.0   3rd Qu.:16585   3rd Qu.:0.11132  
 Max.   :1299.0   Max.   :646.0   Max.   :21040   Max.   :0.13303  
                                                                   
   VanKilled     
 Min.   : 2.000  
 1st Qu.: 7.000  
 Median :10.000  
 Mean   : 9.586  
 3rd Qu.:13.000  
 Max.   :17.000  
                 
dim(Law_Inactive)
[1] 169   9
## Only 23 monthly observations (February 1983 to December 1984) fall after the legislation was established.
dim(Law_Active)
[1] 23  9
summary(Law_Active)
     years          months   DriversKilled      drivers    
 Min.   :1983   Feb    : 2   Min.   : 60.0   Min.   :1057  
 1st Qu.:1983   Mar    : 2   1st Qu.: 85.0   1st Qu.:1171  
 Median :1984   Apr    : 2   Median : 92.0   Median :1282  
 Mean   :1984   May    : 2   Mean   :100.3   Mean   :1322  
 3rd Qu.:1984   Jun    : 2   3rd Qu.:119.0   3rd Qu.:1464  
 Max.   :1984   Jul    : 2   Max.   :154.0   Max.   :1763  
                (Other):11                                 
     front            rear            kms         PetrolPrice    
 Min.   :426.0   Min.   :296.0   Min.   :15511   Min.   :0.1131  
 1st Qu.:516.0   1st Qu.:347.0   1st Qu.:17971   1st Qu.:0.1148  
 Median :585.0   Median :408.0   Median :19162   Median :0.1161  
 Mean   :571.0   Mean   :407.7   Mean   :18890   Mean   :0.1165  
 3rd Qu.:629.5   3rd Qu.:471.5   3rd Qu.:19952   3rd Qu.:0.1180  
 Max.   :721.0   Max.   :521.0   Max.   :21626   Max.   :0.1201  
                                                                 
   VanKilled    
 Min.   :2.000  
 1st Qu.:3.500  
 Median :5.000  
 Mean   :5.174  
 3rd Qu.:7.000  
 Max.   :8.000  
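
The three research questions map onto the `DriversKilled`, `front`, and `rear` columns. One way to put the two summaries side by side is to compute the monthly mean of each column grouped by `law`. A minimal sketch using `aggregate()`, with `sb` and `means_by_law` as illustrative names; it is a descriptive comparison only and does not adjust for trend, seasonality, or distance driven:

```r
sb <- as.data.frame(datasets::Seatbelts)

# Mean monthly count per column, grouped by law (0 = before, 1 = after)
means_by_law <- aggregate(cbind(DriversKilled, front, rear) ~ law,
                          data = sb, FUN = mean)
round(means_by_law, 1)
```

These means correspond to the `Mean` rows of the two summaries: roughly 125.9 against 100.3 for drivers killed, 873.5 against 571.0 for front-seat passengers, and 400.3 against 407.7 for rear-seat passengers.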
                
## We'll remove the variables we don't need (kms, PetrolPrice, VanKilled) to create the final datasets for before and after the legislation was established.

Law_Active <- Law_Active[ -c(7:9) ]
head(Law_Active)
    years months DriversKilled drivers front rear
170  1983    Feb            95    1057   426  300
171  1983    Mar           100    1218   475  318
172  1983    Apr            89    1168   556  391
173  1983    May            82    1236   559  398
174  1983    Jun            89    1076   483  337
175  1983    Jul            60    1174   587  477
Law_Inactive <- Law_Inactive[ -c(7:9) ]
head(Law_Inactive)
  years months DriversKilled drivers front rear
1  1969    Jan           107    1687   867  269
2  1969    Feb            97    1508   825  265
3  1969    Mar           102    1507   806  319
4  1969    Apr            87    1385   814  407
5  1969    May           119    1632   991  454
6  1969    Jun           106    1511   945  427
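
Dropping columns by position, as in `-c(7:9)`, depends on the column order staying fixed and is easy to apply to the wrong object. A sketch of a name-based alternative, assuming the same three columns (`kms`, `PetrolPrice`, `VanKilled`) are the ones to drop; `sb`, `drop_cols`, and `trimmed` are names introduced here for illustration:

```r
sb <- as.data.frame(datasets::Seatbelts)

# Columns to remove, identified by name rather than position
drop_cols <- c("kms", "PetrolPrice", "VanKilled")

# Keep every column whose name is not in drop_cols
trimmed <- sb[, !(names(sb) %in% drop_cols)]
names(trimmed)
```

Applied to `Law_Active` or `Law_Inactive`, the same expression would keep `years`, `months`, `DriversKilled`, `drivers`, `front`, and `rear`, and running it a second time changes nothing.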
References:

1. https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/time
2. https://rdrr.io/r/datasets/UKDriverDeaths.html
3. https://rpubs.com/Vikki_Grist/299410

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Hungund (2022, Feb. 16). Data Analytics and Computational Social Science: HW3. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomahungundaphhw3/

BibTeX citation

@misc{hungund2022hw3,
  author = {Hungund, Apoorva},
  title = {Data Analytics and Computational Social Science: HW3},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomahungundaphhw3/},
  year = {2022}
}