Brinda Murulidhara HW5

Advanced visualizations

Brinda Murulidhara
2022-01-13

Read in a dataset

I have chosen the Emergency - 911 calls dataset from Kaggle (https://www.kaggle.com/mchirico/montcoalert/version/32) for my final project. The dataset contains emergency 911 calls in Montgomery County, Pennsylvania from 2015 to 2020. Below is the code snippet to read and preview the data.

library(dplyr)
emergency_calls_data <- read.csv("911.csv")
head(emergency_calls_data)
       lat       lng
1 40.29788 -75.58129
2 40.25806 -75.26468
3 40.12118 -75.35198
4 40.11615 -75.34351
5 40.25149 -75.60335
6 40.25347 -75.28324
                                                                                 desc
1           REINDEER CT & DEAD END;  NEW HANOVER; Station 332; 2015-12-10 @ 17:10:52;
2 BRIAR PATH & WHITEMARSH LN;  HATFIELD TOWNSHIP; Station 345; 2015-12-10 @ 17:29:21;
3                          HAWS AVE; NORRISTOWN; 2015-12-10 @ 14:39:21-Station:STA27;
4               AIRY ST & SWEDE ST;  NORRISTOWN; Station 308A; 2015-12-10 @ 16:47:36;
5    CHERRYWOOD CT & DEAD END;  LOWER POTTSGROVE; Station 329; 2015-12-10 @ 16:56:52;
6               CANNON AVE & W 9TH ST;  LANSDALE; Station 345; 2015-12-10 @ 15:39:04;
    zip                   title           timeStamp               twp
1 19525  EMS: BACK PAINS/INJURY 2015-12-10 17:10:52       NEW HANOVER
2 19446 EMS: DIABETIC EMERGENCY 2015-12-10 17:29:21 HATFIELD TOWNSHIP
3 19401     Fire: GAS-ODOR/LEAK 2015-12-10 14:39:21        NORRISTOWN
4 19401  EMS: CARDIAC EMERGENCY 2015-12-10 16:47:36        NORRISTOWN
5    NA          EMS: DIZZINESS 2015-12-10 16:56:52  LOWER POTTSGROVE
6 19446        EMS: HEAD INJURY 2015-12-10 15:39:04          LANSDALE
                        addr e
1     REINDEER CT & DEAD END 1
2 BRIAR PATH & WHITEMARSH LN 1
3                   HAWS AVE 1
4         AIRY ST & SWEDE ST 1
5   CHERRYWOOD CT & DEAD END 1
6      CANNON AVE & W 9TH ST 1

Standard error bars

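Below is a bar plot of mean longitude by township (twp). The error bars show one standard deviation of longitude on either side of the mean. Because longitudes in this region are negative, the bars extend downward from zero and the error bars appear at the bottom of each bar. The cyan outline is there to demarcate adjacent bars.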

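library(ggplot2)
# Summarise longitude per township: mean and standard deviation
emergency_calls_data %>%
  group_by(twp) %>%
  summarise(mean_longitude = mean(lng), sd = sd(lng)) %>%
  ggplot(aes(x = twp, y = mean_longitude)) +
  # Cyan outline demarcates adjacent bars
  geom_bar(stat = "identity", colour = "cyan") +
  theme(axis.text.x = element_text(angle = 90, size = 3)) +
  # Error bars span the mean plus/minus one standard deviation
  geom_errorbar(aes(ymin = mean_longitude - sd, ymax = mean_longitude + sd), width = .2)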
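
The error bars above are drawn from the standard deviation. If standard-error bars are wanted instead, the standard error of the mean can be derived as the standard deviation divided by the square root of the group size. The sketch below illustrates this on a small made-up data frame; `toy_calls`, `toy_summary`, `sd_lng`, `n` and `se` are illustrative names, and the longitude values are invented rather than taken from the dataset.

```r
library(dplyr)

# Toy data standing in for emergency_calls_data (values are made up)
toy_calls <- data.frame(
  twp = c("A", "A", "A", "A", "B", "B", "B", "B"),
  lng = c(-75.1, -75.3, -75.2, -75.4, -75.6, -75.6, -75.8, -75.8)
)

# Per-township mean, standard deviation, group size and standard error
toy_summary <- toy_calls %>%
  group_by(twp) %>%
  summarise(
    mean_longitude = mean(lng),
    sd_lng = sd(lng),
    n = n(),
    se = sd_lng / sqrt(n)  # standard error of the mean
  )

print(toy_summary)
```

With a summary like this, passing `mean_longitude - se` and `mean_longitude + se` to `geom_errorbar()` would give standard-error bars. On a dataset with many calls per township these would be much narrower than the standard-deviation bars.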

Faceted plot

Below is a plot of emergency call counts by township, with one panel per broad emergency category. The dataset has three major predefined categories: EMS, Traffic and Fire. EMS covers serious illnesses or injuries such as weakness, head injuries and seizures. Traffic covers vehicle accidents, disabled vehicles and similar incidents. Fire covers accidents resulting from any kind of fire, in a building or outside.

library(stringr)
emergency_calls_data %>%
  # Category is the text before the first ":" in title (e.g. "EMS")
  mutate(emergency_category = word(title, sep = fixed(":"))) %>%
  ggplot(aes(x = twp)) +
  geom_bar() +
  theme(axis.text.x = element_text(angle = 90, size = 2.25), legend.text = element_text(size = 3)) +
  # One panel per emergency category
  facet_wrap(vars(emergency_category)) +
  labs(x = "Township", y = "Emergency call count") +
  ggtitle("Plot of call count vs township for each emergency category")
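
To make the category extraction in the code above concrete, here is a minimal sketch of what `word()` with `sep = fixed(":")` returns for a few titles. The first two titles appear in the data preview; the third is a hypothetical Traffic title added for illustration, and `sample_titles` and `categories` are illustrative names.

```r
library(stringr)

# Two titles from the preview plus one hypothetical Traffic title
sample_titles <- c(
  "EMS: BACK PAINS/INJURY",
  "Fire: GAS-ODOR/LEAK",
  "Traffic: DISABLED VEHICLE"  # hypothetical example
)

# word() splits each string on ":" and returns the first piece by default
categories <- word(sample_titles, 1, sep = fixed(":"))

print(categories)
```

Only the text before the first colon is kept, so every specific emergency type collapses into its broad category.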

Colored barplot

Below is a stacked bar plot of emergency call counts by township, with each bar colored by broad emergency category (EMS, Fire and Traffic).

emergency_calls_data %>%
  mutate(emergency_category = word(title, sep = fixed(":"))) %>%
  ggplot(aes(x = twp, fill = emergency_category)) +
  geom_bar(position = "stack") +
  theme(axis.text.x = element_text(angle = 90, size = 3), legend.text = element_text(size = 5)) +
  labs(x = "Township", y = "Emergency call count") +
  ggtitle("Plot of call count vs township grouped by emergency category")

Questions

What is missing (if anything) in your analysis process so far?

These visualizations do not aid in comparing the trends in emergency calls across years or months.
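
One possible starting point for closing this gap is to derive a year-month key from the `timeStamp` column and count calls per key. The sketch below uses base R on a few toy timestamps in the same format as the preview; the first two values appear in the preview, the third is made up, and `toy_timestamps`, `parsed`, `year_month` and `calls_per_month` are illustrative names. The `tz = "UTC"` setting is an assumption, since the dataset's time zone is not stated here.

```r
# Toy timestamps in the "YYYY-MM-DD HH:MM:SS" format of the timeStamp column
toy_timestamps <- c(
  "2015-12-10 17:10:52",
  "2015-12-10 17:29:21",
  "2016-01-03 08:15:00"  # made-up value for illustration
)

# Parse the strings into date-times (time zone is an assumption)
parsed <- as.POSIXct(toy_timestamps, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Reduce each date-time to a year-month key and count calls per key
year_month <- format(parsed, "%Y-%m")
calls_per_month <- table(year_month)

print(calls_per_month)
```

The same key could be added to the full dataset with `mutate()` and used as the x-axis of a bar or line plot to compare call volumes over time.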

What conclusions can you make about your research questions at this point?

The visualizations help us understand which kinds of emergencies occur most frequently in Montgomery County as a whole as well as in each township. We can compare the number and type of emergencies that occur across townships. With this knowledge, the emergency response team can prepare better and focus its efforts on townships that have higher counts of serious emergencies.

What do you think a naive reader would need to fully understand your graphs?

Is there anything you want to answer with your dataset, but can’t?

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Murulidhara (2022, Jan. 14). Data Analytics and Computational Social Science: Brinda Murulidhara HW5. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscombrinda854205/

BibTeX citation

@misc{murulidhara2022brinda,
  author = {Murulidhara, Brinda},
  title = {Data Analytics and Computational Social Science: Brinda Murulidhara HW5},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscombrinda854205/},
  year = {2022}
}