DACSS 601 Data Science Fundamentals - Homework 3
My research focuses on human factors in driving and on driving safety. The dataset most relevant to my study is R's built-in 'Seatbelts'
data, which provides monthly figures for measuring differences in driving deaths over the years. The main aim is to understand the change in
behavior after the introduction of seatbelt legislation on January 31st, 1983. The dataset was officially commissioned by the Department
of Transport in 1984 and covers 16 years of monthly observations (January 1969 to December 1984, 192 rows in total). Since it's almost
40 years old, we'll have to clean up the data and make it a bit easier to read.
My analysis will address three questions:
1) Did driving deaths change once the legislation was established?
2) Did the number of front-seat passengers injured or killed change once the legislation was established?
3) Did the number of rear-seat passengers injured or killed change once the legislation was established?
data(Seatbelts)
Seatbelts <- data.frame(years=floor(time(Seatbelts)),months=factor(cycle(Seatbelts),labels=month.abb), Seatbelts)
head(Seatbelts)
  years months DriversKilled drivers front rear   kms PetrolPrice
1  1969    Jan           107    1687   867  269  9059   0.1029718
2  1969    Feb            97    1508   825  265  7685   0.1023630
3  1969    Mar           102    1507   806  319  9963   0.1020625
4  1969    Apr            87    1385   814  407 10955   0.1008733
5  1969    May           119    1632   991  454 11823   0.1010197
6  1969    Jun           106    1511   945  427 12391   0.1005812
  VanKilled law
1        12   0
2         6   0
3        12   0
4         8   0
5        10   0
6        13   0
dim(Seatbelts)
[1] 192 10
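To make the data.frame() call above easier to follow, here is a minimal sketch of what time() and cycle() return for the first three months. It reads the series directly from the datasets package so that it doesn't overwrite the Seatbelts data frame created above; the names sb_ts, first_years and first_months are my own and are only for illustration.

```r
# Illustrative names only: sb_ts is the untouched monthly time series
sb_ts <- datasets::Seatbelts

time(sb_ts)[1:3]    # fractional years; the first value is exactly 1969
cycle(sb_ts)[1:3]   # position within the year: 1, 2, 3

# floor() keeps the calendar year; month.abb turns 1..12 into "Jan".."Dec"
first_years  <- floor(time(sb_ts))[1:3]
first_months <- month.abb[cycle(sb_ts)[1:3]]

print(data.frame(years = first_years, months = first_months))
```

This is the same idea as the years and months columns built above, just on three rows.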
summary(Seatbelts)
years months DriversKilled drivers
Min. :1969 Jan :16 Min. : 60.0 Min. :1057
1st Qu.:1973 Feb :16 1st Qu.:104.8 1st Qu.:1462
Median :1976 Mar :16 Median :118.5 Median :1631
Mean :1976 Apr :16 Mean :122.8 Mean :1670
3rd Qu.:1980 May :16 3rd Qu.:138.0 3rd Qu.:1851
Max. :1984 Jun :16 Max. :198.0 Max. :2654
(Other):96
front rear kms PetrolPrice
Min. : 426.0 Min. :224.0 Min. : 7685 Min. :0.08118
1st Qu.: 715.5 1st Qu.:344.8 1st Qu.:12685 1st Qu.:0.09258
Median : 828.5 Median :401.5 Median :14987 Median :0.10448
Mean : 837.2 Mean :401.2 Mean :14994 Mean :0.10362
3rd Qu.: 950.8 3rd Qu.:456.2 3rd Qu.:17202 3rd Qu.:0.11406
Max. :1299.0 Max. :646.0 Max. :21626 Max. :0.13303
VanKilled law
Min. : 2.000 Min. :0.0000
1st Qu.: 6.000 1st Qu.:0.0000
Median : 8.000 Median :0.0000
Mean : 9.057 Mean :0.1198
3rd Qu.:12.000 3rd Qu.:0.0000
Max. :17.000 Max. :1.0000
## Next, we'll separate the dataset into two data frames: before and after the legislation was established.
Law_Active<-subset(Seatbelts, law == 1, select = c(years:VanKilled))
Law_Inactive<-subset(Seatbelts, law == 0, select = c(years:VanKilled))
summary(Law_Inactive)
years months DriversKilled drivers
Min. :1969 Jan :15 Min. : 79.0 Min. :1309
1st Qu.:1972 Feb :14 1st Qu.:108.0 1st Qu.:1511
Median :1976 Mar :14 Median :121.0 Median :1653
Mean :1976 Apr :14 Mean :125.9 Mean :1718
3rd Qu.:1979 May :14 3rd Qu.:140.0 3rd Qu.:1926
Max. :1983 Jun :14 Max. :198.0 Max. :2654
(Other):84
front rear kms PetrolPrice
Min. : 567.0 Min. :224.0 Min. : 7685 Min. :0.08118
1st Qu.: 767.0 1st Qu.:344.0 1st Qu.:12387 1st Qu.:0.09078
Median : 860.0 Median :401.0 Median :14455 Median :0.10273
Mean : 873.5 Mean :400.3 Mean :14463 Mean :0.10187
3rd Qu.: 986.0 3rd Qu.:454.0 3rd Qu.:16585 3rd Qu.:0.11132
Max. :1299.0 Max. :646.0 Max. :21040 Max. :0.13303
VanKilled
Min. : 2.000
1st Qu.: 7.000
Median :10.000
Mean : 9.586
3rd Qu.:13.000
Max. :17.000
dim(Law_Inactive)
[1] 169 9
##There are only 23 observations made after the legislation was established.
dim(Law_Active)
[1] 23 9
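As a quick sanity check on the split, the two groups should add up to the 192 rows of the full dataset. The sketch below is an alternative way to split by the law indicator using split(); it is not the method used above, and the names sb and parts are assumptions made for this example.

```r
# Illustrative names only: sb is a plain data frame copy of the series
sb <- as.data.frame(datasets::Seatbelts)

# split() returns a list with one data frame per value of law ("0" and "1")
parts <- split(sb, sb$law)
group_sizes <- sapply(parts, nrow)
print(group_sizes)

# The before and after groups together should cover every month
stopifnot(sum(group_sizes) == nrow(sb))
```

The counts should match the dim() results above: 169 rows before the law and 23 after.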
summary(Law_Active)
years months DriversKilled drivers
Min. :1983 Feb : 2 Min. : 60.0 Min. :1057
1st Qu.:1983 Mar : 2 1st Qu.: 85.0 1st Qu.:1171
Median :1984 Apr : 2 Median : 92.0 Median :1282
Mean :1984 May : 2 Mean :100.3 Mean :1322
3rd Qu.:1984 Jun : 2 3rd Qu.:119.0 3rd Qu.:1464
Max. :1984 Jul : 2 Max. :154.0 Max. :1763
(Other):11
front rear kms PetrolPrice
Min. :426.0 Min. :296.0 Min. :15511 Min. :0.1131
1st Qu.:516.0 1st Qu.:347.0 1st Qu.:17971 1st Qu.:0.1148
Median :585.0 Median :408.0 Median :19162 Median :0.1161
Mean :571.0 Mean :407.7 Mean :18890 Mean :0.1165
3rd Qu.:629.5 3rd Qu.:471.5 3rd Qu.:19952 3rd Qu.:0.1180
Max. :721.0 Max. :521.0 Max. :21626 Max. :0.1201
VanKilled
Min. :2.000
1st Qu.:3.500
Median :5.000
Mean :5.174
3rd Qu.:7.000
Max. :8.000
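The two summaries above can be hard to compare side by side. As a rough sketch (not a formal test of the three questions), the monthly means for the variables of interest can be put in one small table with aggregate(). The names sb and means_by_law are my own and are only for illustration.

```r
# Illustrative names only: sb is a plain data frame copy of the series
sb <- as.data.frame(datasets::Seatbelts)

# Mean monthly count per group: law == 0 (before) and law == 1 (after)
means_by_law <- aggregate(cbind(DriversKilled, front, rear) ~ law,
                          data = sb, FUN = mean)
print(round(means_by_law, 1))
```

The values should agree with the Mean rows in the two summaries, for example about 125.9 drivers killed per month before the law and about 100.3 after. With only 23 months after the legislation, these differences are descriptive only.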
## Next, we'll remove the variables we don't need (kms, PetrolPrice and VanKilled). Note that only Law_Inactive is trimmed here; Law_Active still has all nine columns.
Law_Inactive <- Law_Inactive[ -c(7:9) ]
head(Law_Active)
    years months DriversKilled drivers front rear   kms PetrolPrice
170  1983    Feb            95    1057   426  300 15511   0.1136570
171  1983    Mar           100    1218   475  318 18308   0.1131444
172  1983    Apr            89    1168   556  391 17793   0.1184955
173  1983    May            82    1236   559  398 19205   0.1179694
174  1983    Jun            89    1076   483  337 19162   0.1176866
175  1983    Jul            60    1174   587  477 20997   0.1200592
    VanKilled
170         3
171         2
172         6
173         3
174         7
175         6
head(Law_Inactive)
  years months DriversKilled drivers front rear
1  1969    Jan           107    1687   867  269
2  1969    Feb            97    1508   825  265
3  1969    Mar           102    1507   806  319
4  1969    Apr            87    1385   814  407
5  1969    May           119    1632   991  454
6  1969    Jun           106    1511   945  427
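Dropping columns by position (7:9) is easy to get wrong if the column order changes. A minimal sketch of an alternative is to keep columns by name and apply the same selection to both periods, so the two data frames end up with identical layouts. The names sb, keep, law_inactive and law_active are assumptions for this example and don't replace the objects created above.

```r
# Rebuild the full data frame the same way as at the top of the analysis
sb_ts <- datasets::Seatbelts
sb <- data.frame(years  = floor(time(sb_ts)),
                 months = factor(cycle(sb_ts), labels = month.abb),
                 sb_ts)

# Columns needed for the three questions, selected by name
keep <- c("years", "months", "DriversKilled", "drivers", "front", "rear")

law_inactive <- sb[sb$law == 0, keep]  # before the legislation
law_active   <- sb[sb$law == 1, keep]  # after the legislation

# Both data frames should now share the same six columns
stopifnot(identical(names(law_inactive), names(law_active)))
print(dim(law_inactive))
print(dim(law_active))
```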
References:
1. https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/time
2. https://rdrr.io/r/datasets/UKDriverDeaths.html
3. https://rpubs.com/Vikki_Grist/299410
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Hungund (2022, Feb. 16). Data Analytics and Computational Social Science: HW3. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomahungundaphhw3/
BibTeX citation
@misc{hungund2022hw3, author = {Hungund, Apoorva}, title = {Data Analytics and Computational Social Science: HW3}, url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomahungundaphhw3/}, year = {2022} }