Final Project Draft
I chose the Emergency - 911 Calls dataset from Kaggle (https://www.kaggle.com/mchirico/montcoalert/version/32) for my final project. The dataset contains emergency 911 calls in Montgomery County, Pennsylvania, from 2015 to 2020. Below is a preview of the first six rows.
lat lng
1 40.29788 -75.58129
2 40.25806 -75.26468
3 40.12118 -75.35198
4 40.11615 -75.34351
5 40.25149 -75.60335
6 40.25347 -75.28324
desc
1 REINDEER CT & DEAD END; NEW HANOVER; Station 332; 2015-12-10 @ 17:10:52;
2 BRIAR PATH & WHITEMARSH LN; HATFIELD TOWNSHIP; Station 345; 2015-12-10 @ 17:29:21;
3 HAWS AVE; NORRISTOWN; 2015-12-10 @ 14:39:21-Station:STA27;
4 AIRY ST & SWEDE ST; NORRISTOWN; Station 308A; 2015-12-10 @ 16:47:36;
5 CHERRYWOOD CT & DEAD END; LOWER POTTSGROVE; Station 329; 2015-12-10 @ 16:56:52;
6 CANNON AVE & W 9TH ST; LANSDALE; Station 345; 2015-12-10 @ 15:39:04;
zip title timeStamp twp
1 19525 EMS: BACK PAINS/INJURY 2015-12-10 17:10:52 NEW HANOVER
2 19446 EMS: DIABETIC EMERGENCY 2015-12-10 17:29:21 HATFIELD TOWNSHIP
3 19401 Fire: GAS-ODOR/LEAK 2015-12-10 14:39:21 NORRISTOWN
4 19401 EMS: CARDIAC EMERGENCY 2015-12-10 16:47:36 NORRISTOWN
5 NA EMS: DIZZINESS 2015-12-10 16:56:52 LOWER POTTSGROVE
6 19446 EMS: HEAD INJURY 2015-12-10 15:39:04 LANSDALE
addr e
1 REINDEER CT & DEAD END 1
2 BRIAR PATH & WHITEMARSH LN 1
3 HAWS AVE 1
4 AIRY ST & SWEDE ST 1
5 CHERRYWOOD CT & DEAD END 1
6 CANNON AVE & W 9TH ST 1
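The code that produced this preview is not shown in the draft. The following is a minimal sketch of how it could be done, not necessarily the exact code used. To keep it runnable without the download, it reads two rows copied from the preview out of an inline string; with the real file, the `text` argument would be replaced by the path to the Kaggle CSV (the path is an assumption on the reader's side).

```r
# Two rows copied from the preview above, inlined so the sketch runs without the file.
# With the downloaded CSV, replace `text = csv_text` with the file path (an assumption).
csv_text <- 'lat,lng,desc,zip,title,timeStamp,twp,addr,e
40.29788,-75.58129,"REINDEER CT & DEAD END; NEW HANOVER; Station 332; 2015-12-10 @ 17:10:52;",19525,EMS: BACK PAINS/INJURY,2015-12-10 17:10:52,NEW HANOVER,REINDEER CT & DEAD END,1
40.25149,-75.60335,"CHERRYWOOD CT & DEAD END; LOWER POTTSGROVE; Station 329; 2015-12-10 @ 16:56:52;",,EMS: DIZZINESS,2015-12-10 16:56:52,LOWER POTTSGROVE,CHERRYWOOD CT & DEAD END,1'

emergency_calls_data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

head(emergency_calls_data)  # first rows (at most six)
str(emergency_calls_data)   # column names and types
```

The empty zip field in the second row is read as NA, which matches the NA shown for that call in the preview.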
Below are the variables in the dataset: lat, lng, desc, zip, title, timeStamp, twp, addr, and e.
Below are the means of latitude and longitude columns.
mean_latitude mean_longitude
1 40.15816 -75.3001
Below are the medians of latitude and longitude columns.
median_latitude median_longitude
1 40.14393 -75.30514
Below are the standard deviations of latitude and longitude columns.
sd_latitude sd_longitude
1 0.2206414 1.672884
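The summary tables above could be computed in one pass with dplyr's `summarise()`. This is a small sketch under stated assumptions: `coords` is a hypothetical stand-in holding three coordinate pairs copied from the preview, so its results will not match the full-dataset values shown above.

```r
library(dplyr)

# Hypothetical stand-in: three coordinate pairs copied from the preview rows.
coords <- data.frame(
  lat = c(40.29788, 40.25806, 40.12118),
  lng = c(-75.58129, -75.26468, -75.35198)
)

# One row with the mean, median, and standard deviation of each column.
# na.rm = TRUE skips missing coordinates, if there are any.
coord_summary <- coords %>%
  summarise(
    mean_latitude    = mean(lat, na.rm = TRUE),
    mean_longitude   = mean(lng, na.rm = TRUE),
    median_latitude  = median(lat, na.rm = TRUE),
    median_longitude = median(lng, na.rm = TRUE),
    sd_latitude      = sd(lat, na.rm = TRUE),
    sd_longitude     = sd(lng, na.rm = TRUE)
  )
coord_summary
```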
Below are the frequencies of zip codes (zip), emergency sub-categories (title), and townships (twp).
zip n
1 1104 1
2 3103 2
3 3366 1
4 7081 1
5 7203 1
6 7726 1
7 8033 1
8 8065 1
9 8077 1
10 8361 4
11 8502 1
12 8628 3
13 8832 4
14 15090 1
15 15301 2
16 17331 4
17 17506 1
18 17507 1
19 17545 3
20 17555 1
21 17566 1
22 17603 2
23 17752 3
24 17810 1
25 17901 1
26 18011 1
27 18036 15
28 18040 1
29 18041 2678
30 18042 1
31 18049 1
32 18051 1
33 18054 2282
34 18056 50
35 18070 453
36 18073 4849
37 18074 2996
38 18076 2028
39 18080 1
40 18092 69
41 18101 7
42 18102 1
43 18103 4
44 18104 2
45 18901 4
46 18902 2
47 18911 1
48 18914 286
49 18915 768
50 18927 5
51 18932 67
52 18936 1684
53 18938 1
54 18940 3
55 18944 16
56 18951 66
57 18958 9
58 18960 183
59 18964 8569
60 18966 227
61 18969 4853
62 18974 1472
63 18976 297
64 19001 10113
65 19002 21070
66 19003 7283
67 19004 8114
68 19006 14794
69 19008 2
70 19009 371
71 19010 8624
72 19012 3941
73 19018 1
74 19020 11
75 19021 4
76 19023 1
77 19025 3475
78 19026 2
79 19027 12288
80 19030 2
81 19031 4196
82 19034 6302
83 19035 3212
84 19038 17318
85 19040 13568
86 19041 3244
87 19044 9869
88 19046 17886
89 19047 1
90 19050 2
91 19053 46
92 19054 1
93 19057 3
94 19063 3
95 19064 1
96 19066 3049
97 19070 1
98 19072 4673
99 19073 1
100 19075 2331
101 19082 15
102 19083 169
103 19085 1832
104 19087 2373
105 19090 17377
106 19095 7118
107 19096 7456
108 19102 1
109 19103 1
110 19104 1
111 19106 6
112 19107 12
113 19111 164
114 19115 36
115 19116 22
116 19118 314
117 19119 11
118 19120 162
119 19121 2
120 19122 1
121 19124 1
122 19126 197
123 19127 4
124 19128 929
125 19129 5
126 19131 955
127 19134 1
128 19135 1
129 19138 27
130 19139 2
131 19140 1
132 19144 4
133 19147 2
134 19150 2024
135 19151 1875
136 19153 2
137 19301 7
138 19310 1
139 19312 2
140 19320 1
141 19333 5
142 19341 5
143 19348 8
144 19355 22
145 19365 1
146 19380 14
147 19382 8
148 19390 3
149 19401 45606
150 19403 34888
151 19404 5
152 19405 3054
153 19406 22464
154 19422 12785
155 19423 25
156 19425 6
157 19426 16436
158 19428 14574
159 19435 277
160 19437 2
161 19438 14425
162 19440 8377
163 19443 4
164 19444 7177
165 19445 1
166 19446 32270
167 19450 1
168 19453 680
169 19454 17661
170 19456 1
171 19457 1
172 19460 3006
173 19462 13264
174 19464 43910
175 19465 3133
176 19468 18939
177 19472 49
178 19473 7412
179 19474 24
180 19475 402
181 19477 60
182 19486 5
183 19490 15
184 19492 639
185 19503 23
186 19504 628
187 19505 158
188 19512 1426
189 19518 486
190 19520 5
191 19525 5997
192 19543 1
193 19545 1
194 19601 13
195 19602 1
196 19604 1
197 19605 2
198 19607 1
199 19609 9
200 19610 8
201 21701 2
202 23005 2
203 36107 1
204 77316 1
205 NA 80199
title n
1 EMS: ABDOMINAL PAINS 9005
2 EMS: ACTIVE SHOOTER 3
3 EMS: ALLERGIC REACTION 2878
4 EMS: ALTERED MENTAL STATUS 10088
5 EMS: AMPUTATION 99
6 EMS: ANIMAL BITE 583
7 EMS: APPLIANCE FIRE 46
8 EMS: ARMED SUBJECT 2
9 EMS: ASSAULT VICTIM 4199
10 EMS: BACK PAINS/INJURY 4880
11 EMS: BARRICADED SUBJECT 2
12 EMS: BOMB DEVICE FOUND 10
13 EMS: BOMB THREAT 2
14 EMS: BUILDING FIRE 1323
15 EMS: BURN VICTIM 272
16 EMS: CARBON MONOXIDE DETECTOR 458
17 EMS: CARDIAC ARREST 5443
18 EMS: CARDIAC EMERGENCY 32332
19 EMS: CHOKING 1220
20 EMS: CVA/STROKE 8277
21 EMS: DEBRIS/FLUIDS ON HIGHWAY 5
22 EMS: DEHYDRATION 1559
23 EMS: DIABETIC EMERGENCY 5742
24 EMS: DISABLED VEHICLE 1
25 EMS: DIZZINESS 5154
26 EMS: DROWNING 32
27 EMS: ELECTRICAL FIRE OUTSIDE 27
28 EMS: ELECTROCUTION 32
29 EMS: ELEVATOR EMERGENCY 29
30 EMS: EMS SPECIAL SERVICE 1448
31 EMS: EYE INJURY 286
32 EMS: FALL VICTIM 34676
33 EMS: FEVER 3364
34 EMS: FIRE ALARM 116
35 EMS: FIRE INVESTIGATION 111
36 EMS: FIRE POLICE NEEDED 5
37 EMS: FIRE SPECIAL SERVICE 274
38 EMS: FRACTURE 4094
39 EMS: GAS-ODOR/LEAK 262
40 EMS: GENERAL WEAKNESS 11867
41 EMS: HAZARDOUS MATERIALS INCIDENT 50
42 EMS: HEAD INJURY 18301
43 EMS: HEAT EXHAUSTION 332
44 EMS: HEMORRHAGING 8256
45 EMS: HIT + RUN 1
46 EMS: INDUSTRIAL ACCIDENT 38
47 EMS: LACERATIONS 2871
48 EMS: MATERNITY 982
49 EMS: MEDICAL ALERT ALARM 10394
50 EMS: NAUSEA/VOMITING 7808
51 EMS: OVERDOSE 8361
52 EMS: PLANE CRASH 6
53 EMS: POISONING 224
54 EMS: POLICE INFORMATION 1
55 EMS: PUBLIC SERVICE 5
56 EMS: RESCUE - ELEVATOR 22
57 EMS: RESCUE - GENERAL 320
58 EMS: RESCUE - TECHNICAL 37
59 EMS: RESCUE - WATER 282
60 EMS: RESPIRATORY EMERGENCY 34248
61 EMS: S/B AT HELICOPTER LANDING 74
62 EMS: SEIZURES 10823
63 EMS: SHOOTING 237
64 EMS: STABBING 213
65 EMS: STANDBY FOR ANOTHER CO 10
66 EMS: SUBJECT IN PAIN 19646
67 EMS: SUICIDE THREAT 2
68 EMS: SUSPICIOUS 5
69 EMS: SYNCOPAL EPISODE 10806
70 EMS: TRAIN CRASH 10
71 EMS: TRANSFERRED CALL 43
72 EMS: TRASH/DUMPSTER FIRE 8
73 EMS: UNCONSCIOUS SUBJECT 8791
74 EMS: UNKNOWN MEDICAL EMERGENCY 10698
75 EMS: UNKNOWN TYPE FIRE 16
76 EMS: UNRESPONSIVE SUBJECT 2798
77 EMS: VEHICLE ACCIDENT 25513
78 EMS: VEHICLE FIRE 226
79 EMS: VEHICLE LEAKING FUEL 1
80 EMS: WARRANT SERVICE 8
81 EMS: WOODS/FIELD FIRE 19
82 Fire: ANIMAL COMPLAINT 1
83 Fire: APPLIANCE FIRE 1217
84 Fire: BARRICADED SUBJECT 1
85 Fire: BUILDING FIRE 4754
86 Fire: BURN VICTIM 227
87 Fire: CARBON MONOXIDE DETECTOR 3990
88 Fire: CARDIAC ARREST 1364
89 Fire: CARDIAC EMERGENCY 7
90 Fire: CVA/STROKE 1
91 Fire: DEBRIS/FLUIDS ON HIGHWAY 256
92 Fire: DIABETIC EMERGENCY 1
93 Fire: DISABLED VEHICLE 7
94 Fire: DIZZINESS 1
95 Fire: ELECTRICAL FIRE OUTSIDE 5111
96 Fire: ELEVATOR EMERGENCY 920
97 Fire: EMS SPECIAL SERVICE 8
98 Fire: FALL VICTIM 7
99 Fire: FIRE ALARM 38336
100 Fire: FIRE INVESTIGATION 9444
101 Fire: FIRE POLICE NEEDED 1587
102 Fire: FIRE SPECIAL SERVICE 4050
103 Fire: FOOT PATROL 1
104 Fire: GAS-ODOR/LEAK 6740
105 Fire: GENERAL WEAKNESS 1
106 Fire: HAZARDOUS MATERIALS INCIDENT 51
107 Fire: HAZARDOUS ROAD CONDITIONS 2
108 Fire: HEAD INJURY 3
109 Fire: HEMORRHAGING 1
110 Fire: MEDICAL ALERT ALARM 5
111 Fire: NAUSEA/VOMITING 2
112 Fire: OVERDOSE 6
113 Fire: PLANE CRASH 5
114 Fire: POISONING 1
115 Fire: POLICE INFORMATION 3
116 Fire: PRISONER IN CUSTODY 1
117 Fire: PUBLIC SERVICE 1
118 Fire: PUMP DETAIL 171
119 Fire: RESCUE - ELEVATOR 736
120 Fire: RESCUE - GENERAL 376
121 Fire: RESCUE - TECHNICAL 49
122 Fire: RESCUE - WATER 295
123 Fire: RESPIRATORY EMERGENCY 2
124 Fire: ROAD OBSTRUCTION 2
125 Fire: S/B AT HELICOPTER LANDING 658
126 Fire: STANDBY FOR ANOTHER CO 12
127 Fire: SUBJECT IN PAIN 4
128 Fire: SUICIDE ATTEMPT 2
129 Fire: SUSPICIOUS 1
130 Fire: SYNCOPAL EPISODE 3
131 Fire: TRAIN CRASH 14
132 Fire: TRANSFERRED CALL 102
133 Fire: TRASH/DUMPSTER FIRE 1190
134 Fire: UNCONSCIOUS SUBJECT 4
135 Fire: UNKNOWN MEDICAL EMERGENCY 2
136 Fire: UNKNOWN TYPE FIRE 1964
137 Fire: UNRESPONSIVE SUBJECT 3
138 Fire: VEHICLE ACCIDENT 10864
139 Fire: VEHICLE FIRE 3232
140 Fire: VEHICLE LEAKING FUEL 337
141 Fire: WOODS/FIELD FIRE 2486
142 Traffic: DEBRIS/FLUIDS ON HIGHWAY - 201
143 Traffic: DISABLED VEHICLE - 47909
144 Traffic: HAZARDOUS ROAD CONDITIONS - 6833
145 Traffic: ROAD OBSTRUCTION - 23235
146 Traffic: VEHICLE ACCIDENT - 148372
147 Traffic: VEHICLE FIRE - 3366
148 Traffic: VEHICLE LEAKING FUEL - 292
twp n
1 293
2 ABINGTON 39947
3 AMBLER 4454
4 BERKS COUNTY 1930
5 BRIDGEPORT 3695
6 BRYN ATHYN 1254
7 BUCKS COUNTY 1982
8 CHELTENHAM 30574
9 CHESTER COUNTY 7362
10 COLLEGEVILLE 2916
11 CONSHOHOCKEN 5655
12 DELAWARE COUNTY 1802
13 DOUGLASS 5550
14 EAST GREENVILLE 1316
15 EAST NORRITON 13963
16 FRANCONIA 9297
17 GREEN LANE 385
18 HATBORO 5448
19 HATFIELD BORO 1370
20 HATFIELD TOWNSHIP 11641
21 HORSHAM 18380
22 JENKINTOWN 4150
23 LANSDALE 11963
24 LEHIGH COUNTY 190
25 LIMERICK 14338
26 LOWER FREDERICK 2081
27 LOWER GWYNEDD 11139
28 LOWER MERION 55490
29 LOWER MORELAND 10988
30 LOWER POTTSGROVE 10775
31 LOWER PROVIDENCE 22476
32 LOWER SALFORD 9218
33 MARLBOROUGH 2144
34 MONTGOMERY 17315
35 NARBERTH 1751
36 NEW HANOVER 5207
37 NORRISTOWN 37633
38 NORTH WALES 2182
39 PENNSBURG 2615
40 PERKIOMEN 3141
41 PHILA COUNTY 267
42 PLYMOUTH 20116
43 POTTSTOWN 27387
44 RED HILL 1987
45 ROCKLEDGE 1569
46 ROYERSFORD 3545
47 SALFORD 1488
48 SCHWENKSVILLE 1337
49 SKIPPACK 5513
50 SOUDERTON 3012
51 SPRINGFIELD 15504
52 TELFORD 3376
53 TOWAMENCIN 11407
54 TRAPPE 1736
55 UPPER DUBLIN 18862
56 UPPER FREDERICK 2357
57 UPPER GWYNEDD 8860
58 UPPER HANOVER 3878
59 UPPER MERION 36010
60 UPPER MORELAND 22932
61 UPPER POTTSGROVE 3557
62 UPPER PROVIDENCE 16122
63 UPPER SALFORD 1913
64 WEST CONSHOHOCKEN 5216
65 WEST NORRITON 11187
66 WEST POTTSGROVE 3103
67 WHITEMARSH 17754
68 WHITPAIN 13480
69 WORCESTER 6037
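The frequency tables above are long, and sorting them by count makes the dominant values easier to spot. Here is a minimal sketch using dplyr's `count()` with `sort = TRUE`; `preview_calls` is a hypothetical stand-in built from the six preview rows, not the full dataset.

```r
library(dplyr)

# Hypothetical stand-in: township values from the six preview rows.
preview_calls <- data.frame(
  twp = c("NEW HANOVER", "HATFIELD TOWNSHIP", "NORRISTOWN",
          "NORRISTOWN", "LOWER POTTSGROVE", "LANSDALE")
)

# count() tallies rows per distinct value; sort = TRUE puts the largest counts first.
twp_counts <- count(preview_calls, twp, sort = TRUE)
twp_counts
```

The same call with `zip` or `title` in place of `twp` would give the other two tables in sorted order.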
set.seed(1)
emergency_calls_data[sample(nrow(emergency_calls_data), 50), ] %>%
ggplot(aes(x = twp,
fill = title)) +
geom_bar(position = "stack") +
theme(axis.text.x=element_text(angle=90, size = 5), legend.text=element_text(size=5)) +
labs(x="Township", y="Frequency") +
ggtitle("Plot of township vs frequency grouped by sub-category")
Below is a plot of call counts by township for each broad emergency category. The dataset has three predefined categories: EMS, Traffic, and Fire. EMS includes serious illnesses or injuries such as weakness, head injuries, and seizures. Traffic covers vehicle accidents, disabled vehicles, and similar incidents. Fire includes incidents resulting from any kind of fire, in a building or outside.
We can see that Fire emergencies are less frequent than EMS or Traffic emergencies.
library(stringr)
emergency_calls_data %>%
mutate(emergency_category=word(title, sep = fixed(":"))) %>%
ggplot(aes(x = twp)) +
geom_bar() +
theme(axis.text.x=element_text(angle=90, size = 2.25), legend.text=element_text(size=3)) +
facet_wrap(vars(emergency_category)) +
labs(x="Township", y="Emergency call count") +
ggtitle("Plot of call count vs township for each emergency category")
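The `emergency_category` column in the chunk above is the part of `title` before the first colon. As an illustration of that step, here is an equivalent base R sketch using `sub()` instead of `stringr::word()`; the titles are copied from the preview and frequency tables.

```r
# Titles copied from the preview and frequency tables above.
titles <- c("EMS: BACK PAINS/INJURY",
            "Fire: GAS-ODOR/LEAK",
            "Traffic: VEHICLE ACCIDENT -")

# Drop the first colon and everything after it, leaving the broad category.
emergency_category <- sub(":.*$", "", titles)
emergency_category

# Tally calls per broad category.
table(emergency_category)
```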
Below is a bar plot of emergency call counts by township, grouped by broad emergency category (EMS, Fire, and Traffic).
Lower Merion accounts for the most emergency calls in the county, followed by Abington and Norristown.
emergency_calls_data %>%
mutate(emergency_category=word(title, sep = fixed(":"))) %>%
ggplot(aes(x = twp,
fill = emergency_category)) +
geom_bar(position = "stack") +
theme(axis.text.x=element_text(angle=90, size = 3), legend.text=element_text(size=5)) +
labs(x="Township", y="Emergency call count") +
ggtitle("Plot of call count vs township grouped by emergency category")
Since vehicle accidents contribute the most to emergency calls in Montgomery County, below is a plot of vehicle accident calls vs year. Vehicle accident calls are highest from 2016 to 2019.
library(stringr)
calls_data <- mutate(emergency_calls_data, year=format(as.POSIXct(timeStamp, format="%Y-%m-%d %H:%M:%S"), format="%Y"))
ggplot(subset(calls_data, title %in% c("Traffic: VEHICLE ACCIDENT -")), aes(x = year)) +
geom_bar() +
labs(x="Year", y="Vehicle accident calls") +
ggtitle("Plot of vehicle accident calls vs year")
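The filter above matches the title exactly, including the trailing " -" that Traffic titles carry in the frequency table. The sketch below compares that exact match with two more forgiving alternatives. It is only an illustration of the matching options, and the choice between them depends on which calls should count as vehicle accidents.

```r
# Titles copied from the frequency table above.
titles <- c("Traffic: VEHICLE ACCIDENT -",
            "EMS: VEHICLE ACCIDENT",
            "Fire: VEHICLE ACCIDENT",
            "Traffic: DISABLED VEHICLE -")

# Exact match: requires the trailing " -" to be present.
exact_match <- titles %in% c("Traffic: VEHICLE ACCIDENT -")

# Prefix match: ignores whatever follows the sub-category name.
prefix_match <- startsWith(titles, "Traffic: VEHICLE ACCIDENT")

# Substring match: also picks up vehicle accidents logged under EMS and Fire.
any_category <- grepl("VEHICLE ACCIDENT", titles, fixed = TRUE)

data.frame(titles, exact_match, prefix_match, any_category)
```

The frequency table lists vehicle accidents under EMS and Fire as well, so the Traffic-only filter used in the plot covers only part of the vehicle accident calls.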
Below is the same breakdown by township: a plot of vehicle accident calls vs township. Lower Merion has the most vehicle accident calls, followed by Upper Merion.
library(stringr)
calls_data <- mutate(emergency_calls_data, year=format(as.POSIXct(timeStamp, format="%Y-%m-%d %H:%M:%S"), format="%Y"))
ggplot(subset(calls_data, title %in% c("Traffic: VEHICLE ACCIDENT -")), aes(x = twp)) +
geom_bar() +
theme(axis.text.x=element_text(angle=90, size = 3), legend.text=element_text(size=5)) +
labs(x="Township", y="Vehicle accident calls") +
ggtitle("Plot of vehicle accident calls vs township")
Below is a plot of emergency call count vs month. The counts look broadly similar across months, with no month standing out.
library(stringr)
emergency_calls_data %>%
mutate(month=ordered(month.abb[strtoi(format(as.POSIXct(timeStamp, format="%Y-%m-%d %H:%M:%S"), format="%m"))], levels=c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"))) %>%
filter(!is.na(month)) %>%
ggplot(aes(x = month)) +
geom_bar() +
labs(x="Month", y="Emergency call count") +
ggtitle("Plot of call count vs month")
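The month column in the chunk above is built by parsing `timeStamp`, extracting the month number, and mapping it to an ordered abbreviation. A slightly shorter sketch of the same idea follows; it uses base R's built-in `month.abb` vector as the level order instead of typing the twelve names out. Only the first timestamp comes from the preview; the other two are made up for illustration.

```r
# Timestamps in the dataset's format; the first is from the preview, the others are made up.
timestamps <- c("2015-12-10 17:10:52", "2016-01-05 08:00:00", "2016-07-19 23:15:00")

# Parse, then pull the month number (1-12) from the date-time.
parsed <- as.POSIXct(timestamps, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
month_number <- as.integer(format(parsed, "%m"))

# month.abb is base R's built-in vector "Jan", ..., "Dec", so it can serve as the level order.
month <- factor(month.abb[month_number], levels = month.abb, ordered = TRUE)
table(month)
```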
Below is a bar plot of emergency call counts by year, grouped by broad emergency category (EMS, Fire, and Traffic).
emergency_calls_data %>%
mutate(emergency_category=word(title, sep = fixed(":")), year=format(as.POSIXct(timeStamp, format="%Y-%m-%d %H:%M:%S"), format="%Y")) %>%
ggplot(aes(x = year,
fill = emergency_category)) +
geom_bar(position = "stack") +
labs(x="Year", y="Emergency call count") +
ggtitle("Plot of call count vs year grouped by emergency category")
Below is a heatmap of shooting calls in the various townships from 2015 to 2020. This can give some insight into where shootings are reported across the county.
calls_data <- mutate(emergency_calls_data, year=format(as.POSIXct(timeStamp, format="%Y-%m-%d %H:%M:%S"), format="%Y"))
calls_data <- subset(calls_data, title %in% c("EMS: SHOOTING"))
call_count <- count(calls_data, year, twp, .drop=FALSE)
ggplot(call_count, aes(x = year, y = twp,
fill = n)) +
labs(x="Year", y="Township") +
theme(axis.text.y=element_text(size = 3), legend.text=element_text(size=5)) +
ggtitle("Heatmap of shootings") +
geom_tile()
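One detail about the heatmap chunk: as far as I understand, `.drop = FALSE` in `count()` only keeps empty groups for factor columns, so if `year` and `twp` are character vectors, year and township pairs with no shootings are simply absent and show up as gaps in the heatmap. The sketch below shows one possible way to get an explicit zero for every pair using base R's `table()`. The `shootings` data frame is a hypothetical stand-in with made-up rows.

```r
# Hypothetical stand-in for the filtered shooting calls: one row per call (made-up values).
shootings <- data.frame(
  year = c("2016", "2016", "2017"),
  twp  = c("NORRISTOWN", "POTTSTOWN", "NORRISTOWN")
)

# table() cross-tabulates every year with every township, including zero counts;
# as.data.frame() turns the result into long format with a Freq column.
call_grid <- as.data.frame(table(year = shootings$year, twp = shootings$twp))
call_grid
```

The `Freq` column would then take the place of `n` in the `fill` aesthetic.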
library(stringr)
calls_data <- mutate(emergency_calls_data, day=ordered(weekdays(as.Date(timeStamp)), levels=c("Monday", "Tuesday", "Wednesday", "Thursday",
"Friday", "Saturday", "Sunday")))
calls_data <- filter(calls_data, !is.na(day))
ggplot(subset(calls_data, grepl('Traffic:', title)), aes(x = day)) +
geom_bar() +
labs(x="Weekday", y="Traffic emergency call count") +
ggtitle("Plot of traffic emergency call count vs weekday")
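The weekday chunk above relies on `weekdays()`, which returns day names in the current locale, so the English level names would produce NA values on a non-English system. A locale-independent sketch, assuming it is acceptable to map ISO weekday numbers to English names by hand, is shown below. The first date is from the preview and the second is made up.

```r
# The first date is from the preview; the second is made up.
dates <- as.Date(c("2015-12-10", "2015-12-13"))

# "%u" gives the ISO weekday number as text: "1" = Monday, ..., "7" = Sunday.
weekday_number <- as.integer(format(dates, "%u"))

day_names <- c("Monday", "Tuesday", "Wednesday", "Thursday",
               "Friday", "Saturday", "Sunday")
day <- factor(day_names[weekday_number], levels = day_names, ordered = TRUE)
day
```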
day_time <- as.POSIXct(strptime(c("000000","040000","114500","170000","193000","235959"),
"%H%M%S"),"UTC")
labels = c("night","morning","afternoon","evening","night")
calls_data <- mutate(emergency_calls_data, time_of_day=ordered(labels[findInterval(as.POSIXct(strptime(format(strptime(timeStamp, format="%Y-%m-%d %H:%M:%S"), format="%H%M%S"), format="%H%M%S"), "UTC"), day_time)], levels=c("morning", "afternoon", "evening", "night")))
calls_data <- filter(calls_data, !is.na(time_of_day))
ggplot(subset(calls_data, grepl('Traffic:', title)), aes(x = time_of_day)) +
geom_bar() +
labs(x="Time of day", y="Traffic emergency call count") +
ggtitle("Plot of traffic emergency call count vs time of day")
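The time-of-day binning above nests several `strptime()` and `format()` calls, which makes the cut points hard to read. The sketch below expresses the same cut points (04:00, 11:45, 17:00, 19:30) as seconds since midnight. It is an alternative formulation for illustration, not the code used for the plots. The first two timestamps are from the preview and the third is made up.

```r
# The first two timestamps are from the preview; the third is made up.
timestamps <- c("2015-12-10 17:10:52", "2015-12-10 14:39:21", "2015-12-11 02:30:00")

parsed <- as.POSIXlt(timestamps, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Seconds elapsed since midnight for each call.
seconds <- parsed$hour * 3600 + parsed$min * 60 + parsed$sec

# Same cut points as the chunk above: 04:00, 11:45, 17:00, 19:30.
breaks <- c(0, 4 * 3600, 11 * 3600 + 45 * 60, 17 * 3600, 19 * 3600 + 30 * 60)
labels <- c("night", "morning", "afternoon", "evening", "night")

# findInterval() returns the index of the interval each value falls into.
time_of_day <- factor(labels[findInterval(seconds, breaks)],
                      levels = c("morning", "afternoon", "evening", "night"),
                      ordered = TRUE)
time_of_day
```

Because there is no upper break at 23:59:59 here, a call logged at exactly that second still falls in the last "night" interval instead of becoming NA.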
day_time <- as.POSIXct(strptime(c("000000","040000","114500","170000","193000","235959"),
"%H%M%S"),"UTC")
labels = c("night","morning","afternoon","evening","night")
calls_data <- mutate(emergency_calls_data, time_of_day=ordered(labels[findInterval(as.POSIXct(strptime(format(strptime(timeStamp, format="%Y-%m-%d %H:%M:%S"), format="%H%M%S"), format="%H%M%S"), "UTC"), day_time)], levels=c("morning", "afternoon", "evening", "night")))
calls_data <- filter(calls_data, !is.na(time_of_day))
ggplot(subset(calls_data, grepl('EMS: CARDIAC EMERGENCY', title)), aes(x = time_of_day)) +
geom_bar() +
labs(x="Time of day", y="Cardiac emergency call count") +
ggtitle("Plot of cardiac emergency call count vs time of day")
What’s missing? What will you add now to deadline?
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Murulidhara (2022, Jan. 20). Data Analytics and Computational Social Science: HW6 Brinda Murulidhara. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscombrinda855645/
BibTeX citation
@misc{murulidhara2022hw6, author = {Murulidhara, Brinda}, title = {Data Analytics and Computational Social Science: HW6 Brinda Murulidhara}, url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscombrinda855645/}, year = {2022} }