library(ggplot2)
library(dplyr)
library(lubridate)
library(leaflet)
library(tidyverse)
library(gridExtra)
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
Final Project: Sai Pranav Kurly
Background:
Analyzing crime data serves several important purposes. First, it enables efficient resource allocation: by identifying crime hotspots and patterns, law enforcement agencies can deploy personnel and resources to high-crime areas where they have the greatest impact on public safety. Second, it supports proactive crime prevention. By understanding the factors and dynamics behind criminal activity, authorities can develop targeted interventions and preventive measures, which lets law enforcement act preventatively rather than reactively. Third, crime data analysis helps identify and apprehend offenders. It aids in building offender profiles, linking seemingly unrelated crimes, and narrowing down suspects, which leads to more successful investigations and arrests. Fourth, it informs policy and decision-making. Policymakers can learn about emerging crime trends, assess the effectiveness of current policies, and develop evidence-based strategies for specific crime issues. Finally, analyzing crime data contributes to overall public safety. It raises community awareness of potential risks, empowers individuals to take precautions, and helps law enforcement agencies respond to incidents more effectively, for example by improving emergency response times.
Dataset(s) Introduction:
The Boston Crime Dataset, also known as the Boston Crime Incident Reports, is a dataset that contains information about reported incidents of crime in the city of Boston, Massachusetts, USA. It provides a detailed record of criminal activities and incidents that have occurred within the city. The dataset includes various attributes related to each reported crime, such as the type of offense, location, date and time of occurrence, and other relevant details. The information is collected and maintained by the Boston Police Department, which aims to promote transparency and public awareness regarding crime trends and patterns in the city. Researchers, analysts, and data enthusiasts often utilize the Boston Crime Dataset to study crime patterns, develop predictive models, and gain insights into criminal activities within the city. It can be used for various purposes, such as identifying high-crime areas, evaluating the effectiveness of law enforcement strategies, or understanding the impact of crime on different neighborhoods.
The dataset contains the following columns:
- Incident Number: Internal report number for each incident, non-null value.
- Offense Code: Numerical code representing the offense description.
- Offense Code Group: High-level group name for the offense code.
- Offense Description: Detailed description and internal categorization of the offense.
- District: District where the crime occurred.
- Reporting Area: Number of the reporting area where the crime occurred.
- Shooting: Numerical value indicating if a shooting took place.
- Occurred on Date: Date and time of when the crime occurred.
- Year: Year when the crime occurred.
- Month: Month when the crime occurred.
- Day of Week: Day of the week when the crime occurred.
- Hour: Hour when the crime occurred.
- UCR Part: Uniform Crime Reporting part number.
- Street: Street name where the crime occurred.
- Lat: Latitude coordinate of the crime location.
- Long: Longitude coordinate of the crime location.
- Location: Location where the crime took place.
I also used a second dataset, the Offense_Codes dataset, to map each offense code to its name, because the name field in the original dataset was NA. It contains the following columns:
- CODE: Numerical code representing the offense description.
- Name: High-level group name for the offense code.
The dataset covers crimes from 2019 to 2022.
Read the data and briefly describe it
I want the latest data, which is only available on the Boston PD website, so I first combine all the files that I downloaded from the website.
folder_path <- "SaipranavKurly_FinalProjectData/"
file_list <- list.files(folder_path)
file_list <- sort(file_list)
combined_data <- data.frame()
# Stack every downloaded file, skipping the lookup table and any previous combined output
for (file_name in file_list) {
  if (file_name != 'Offense_Codes.csv' && file_name != 'Combined_Dataset.csv') {
    file_path <- file.path(folder_path, file_name)
    file_data <- read.csv(file_path)
    combined_data <- rbind(combined_data, file_data)
  }
}
combined_file_path <- file.path(folder_path, "Combined_Dataset.csv")
write.csv(combined_data, combined_file_path, row.names = FALSE)
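The same combination step can be written without growing a data frame inside a loop. The following is a minimal sketch, assuming every downloaded file shares the same columns; the helper name `combine_crime_files` is hypothetical and not part of the original project.

```r
# Hypothetical helper: read every CSV in a folder except the excluded ones
# and stack them into one data frame. Assumes all files share the same columns.
combine_crime_files <- function(folder_path,
                                exclude = c("Offense_Codes.csv", "Combined_Dataset.csv")) {
  file_list <- sort(list.files(folder_path, pattern = "\\.csv$"))
  file_list <- setdiff(file_list, exclude)
  if (length(file_list) == 0) {
    return(data.frame())
  }
  # Read each file once, then bind all rows in a single call
  data_list <- lapply(file.path(folder_path, file_list), read.csv)
  do.call(rbind, data_list)
}
```

Calling `combine_crime_files("SaipranavKurly_FinalProjectData/")` would then return the stacked data, ready to be written out with `write.csv`.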
Reading the Dataset and merging:
crime_dataset <- read.csv("SaipranavKurly_FinalProjectData/Combined_Dataset.csv")
offence_codes_dataset <- read.csv("SaipranavKurly_FinalProjectData/Offense_Codes.csv")
names(offence_codes_dataset) <- c("OFFENSE_CODE", "OFFENCE_NAME")
# Left join: keep every incident and attach the offence name where the code is known
crime_dataset <- merge(crime_dataset, offence_codes_dataset, by = "OFFENSE_CODE", all.x = TRUE)
dim(crime_dataset)
[1] 531744 18
length(unique(crime_dataset))
[1] 18
head(crime_dataset)
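The frequency table later in this report lists several pairs of offence names with identical counts (for example "CRIMINAL HARASSMENT" and "CRIMINAL HARRASSMENT", both 3709). That would be consistent with some codes appearing more than once in the offense-code file, in which case the merge duplicates those incidents. This has not been verified against the files. The sketch below uses toy data with made-up codes to show the effect and one possible remedy, keeping the first name listed for each code.

```r
# Toy data standing in for the real files (codes and rows are illustrative only)
crimes <- data.frame(
  INCIDENT_NUMBER = c("I1", "I2", "I3"),
  OFFENSE_CODE = c(101, 101, 202),
  stringsAsFactors = FALSE
)
codes <- data.frame(
  OFFENSE_CODE = c(101, 101, 202),
  OFFENCE_NAME = c("CRIMINAL HARASSMENT", "CRIMINAL HARRASSMENT", "TOWED MOTOR VEHICLE"),
  stringsAsFactors = FALSE
)

# Merging against a lookup with repeated codes duplicates the matching incidents
inflated <- merge(crimes, codes, by = "OFFENSE_CODE", all.x = TRUE)

# Keep only the first name listed for each code, then merge again
codes_unique <- codes[!duplicated(codes$OFFENSE_CODE), ]
deduplicated <- merge(crimes, codes_unique, by = "OFFENSE_CODE", all.x = TRUE)

nrow(inflated)      # 5 rows: each code-101 incident appears twice
nrow(deduplicated)  # 3 rows: one per incident
```

Comparing `nrow()` before and after the merge is a quick way to check whether the lookup table inflated the data.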
Tidy the data
To clean the dataset, I remove all rows where OFFENCE_NAME is NA. I also remove a few categories that I do not consider useful for analyzing crime in Boston. During preprocessing I also plan to add columns derived from the date column, such as the month, to make plotting easier.
Cleaning the dataset:
# Drop the investigation categories
crime_dataset <- crime_dataset[crime_dataset$OFFENCE_NAME != "INVESTIGATE PERSON", ]
crime_dataset <- crime_dataset[crime_dataset$OFFENCE_NAME != "INVESTIGATE PROPERTY", ]
# Drop the OFFENSE_CODE_GROUP column
crime_dataset <- crime_dataset[, !(names(crime_dataset) == "OFFENSE_CODE_GROUP")]
# Drop rows without an offence name
crime_dataset <- crime_dataset[complete.cases(crime_dataset$OFFENCE_NAME), ]
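In base R, comparing `NA` with `!=` yields `NA`, and subsetting with `NA` produces rows filled with `NA`, which is why the final `complete.cases` step matters. As a minimal sketch on toy data (the values and the `excluded` vector are illustrative), the same cleaning can be done with a single logical mask that handles missing names explicitly.

```r
# Toy data (illustrative values only)
toy <- data.frame(
  OFFENCE_NAME = c("VANDALISM", "INVESTIGATE PERSON", NA, "INVESTIGATE PROPERTY", "ARSON"),
  HOUR = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Categories to drop from the analysis
excluded <- c("INVESTIGATE PERSON", "INVESTIGATE PROPERTY")

# TRUE only for rows with a known offence name that is not excluded
keep <- !is.na(toy$OFFENCE_NAME) & !(toy$OFFENCE_NAME %in% excluded)
cleaned <- toy[keep, ]

cleaned$OFFENCE_NAME  # "VANDALISM" "ARSON"
```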
Plan for Visualization
I plan to analyze the following using the dataset:
- Crime Distribution and Frequency:
Create a bar chart to depict the distribution of crime types in Boston.
Additionally, I am also going to determine the most and least common crime categories.
Finally, I plan to also visualize which streets have the highest crime.
- Temporal Patterns and Trends:
Using line graphs or time series plots, plot the number of reported crimes over time.
Identify any notable trends or patterns in crime rates over time. For example, at what hour do most crimes happen?
- Seasonal Variation in Crime:
Aggregate the data by month or season to see if there are seasonal variations in crime rates.
To compare the distribution of crimes across seasons, create box plots or violin plots.
- Geographic Crime Hotspots:
Identify high-crime areas in Boston using geospatial visualization techniques.
To visualize crime density, plot crime incidents on a map with markers or heatmaps.
The distribution of crime types can be represented visually using bar charts or pie charts. They give a clear overview of the most common and least common crime categories, making it simple to identify major crime trends. Line graphs and time series plots are useful for examining how crime rates change over time. In crime data, they reveal trends, patterns, and cyclical behavior, and they help identify long-term trends, seasonal patterns, and unexpected changes in crime rates. Box plots and violin plots allow for the comparison of crime rates across seasons. They provide insights into the distribution of crime incidents during specific periods and help determine whether there are significant seasonal differences in crime rates.
Descriptive Statistics
The Boston Crime dataset contains descriptive information about criminal incidents reported in the city of Boston. It provides a detailed record of each crime, including the type of offense, location, date, and time of occurrence. The offenses include, but are not limited to, assaults, robberies, burglaries, larcenies, drug-related offenses, and homicides. Additional attributes give context for each crime, such as the district where the incident occurred, the reporting area, and the street. Temporal fields, such as the day of the week, month, and hour, allow for the analysis of crime patterns and trends over time. Researchers and analysts can use this temporal granularity to investigate correlations between crime and factors such as seasonality, day of the week, or time of day. The dataset also includes geographic coordinates (latitude and longitude) that enable mapping and spatial analysis of crime incidents. This spatial data makes it easier to identify high-crime areas, investigate spatial clusters, and assess the spatial distribution of criminal activities throughout the city.
summary(crime_dataset)
OFFENSE_CODE INCIDENT_NUMBER OFFENSE_DESCRIPTION DISTRICT
Min. : 111 Length:440108 Length:440108 Length:440108
1st Qu.: 724 Class :character Class :character Class :character
Median :2619 Mode :character Mode :character Mode :character
Mean :2086
3rd Qu.:3201
Max. :3831
REPORTING_AREA SHOOTING OCCURRED_ON_DATE YEAR
Min. : 0.0 Min. :0.000000 Length:440108 Min. :2019
1st Qu.:167.0 1st Qu.:0.000000 Class :character 1st Qu.:2019
Median :338.0 Median :0.000000 Mode :character Median :2020
Mean :372.3 Mean :0.008884 Mean :2020
3rd Qu.:520.0 3rd Qu.:0.000000 3rd Qu.:2021
Max. :962.0 Max. :1.000000 Max. :2022
NA's :106223
MONTH DAY_OF_WEEK HOUR UCR_PART
Min. : 1.000 Length:440108 Min. : 0.0 Mode:logical
1st Qu.: 4.000 Class :character 1st Qu.: 9.0 NA's:440108
Median : 7.000 Mode :character Median :13.0
Mean : 6.516 Mean :12.8
3rd Qu.: 9.000 3rd Qu.:18.0
Max. :12.000 Max. :23.0
STREET Lat Long Location
Length:440108 Min. : 0.00 Min. :-71.35 Length:440108
Class :character 1st Qu.:42.30 1st Qu.:-71.10 Class :character
Mode :character Median :42.33 Median :-71.08 Mode :character
Mean :42.32 Mean :-71.08
3rd Qu.:42.35 3rd Qu.:-71.06
Max. :42.46 Max. : 0.00
NA's :17697 NA's :17697
OFFENCE_NAME
Length:440108
Class :character
Mode :character
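The summary above shows a minimum Lat of 0, a maximum Long of 0, and 17697 missing values in each coordinate column. Treating a coordinate of exactly 0 as a placeholder is an assumption here, since Boston lies near latitude 42.3 and longitude -71.1. Under that assumption, a minimal sketch on toy data for keeping only mappable rows could look like this.

```r
# Toy data (illustrative values only)
toy <- data.frame(
  Lat  = c(42.33, 0, NA, 42.35),
  Long = c(-71.08, 0, NA, -71.06)
)

# Keep rows whose coordinates are present and not the assumed 0 placeholder
valid <- !is.na(toy$Lat) & !is.na(toy$Long) & toy$Lat != 0 & toy$Long != 0
mappable <- toy[valid, ]

nrow(mappable)  # 2
```

Applying the same mask to the full dataset before any geospatial plot would keep placeholder points from distorting the map extent.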
unique(crime_dataset$OFFENCE_NAME)
[1] "MURDER NON-NEGLIGIENT MANSLAUGHTER"
[2] "MURDER, NON-NEGLIGIENT MANSLAUGHTER"
[3] "MANSLAUGHTER - VEHICLE - NEGLIGENCE"
[4] "MANSLAUGHTER - TRAIN ETC. VICTIM NON-NEGLIGENCE"
[5] "ROBBERY - FIREARM - BANK"
[6] "ROBBERY - STREET"
[7] "ROBBERY - COMMERCIAL"
[8] "ROBBERY - KNIFE - CHAIN STORE"
[9] "ROBBERY ATTEMPT - KNIFE - CHAIN STORE"
[10] "ROBBERY - BANK"
[11] "ROBBERY - OTHER"
[12] "ROBBERY ATTEMPT - OTHER WEAPON - MISCELLANEOUS"
[13] "ROBBERY - HOME INVASION"
[14] "ROBBERY - CAR JACKING"
[15] "ASSAULT D/W - OTHER"
[16] "ASSAULT - AGGRAVATED - BATTERY"
[17] "ASSAULT & BATTERY D/W - OTHER ON POLICE OFFICER"
[18] "ASSAULT - AGGRAVATED"
[19] "BURGLARY - RESIDENTIAL - FORCE"
[20] "B&E RESIDENCE DAY - FORCE"
[21] "BURGLARY - RESIDENTIAL - ATTEMPT"
[22] "B&E RESIDENCE DAY - ATTEMPT FORCE"
[23] "BURGLARY - RESIDENTIAL - NO FORCE"
[24] "B&E RESIDENCE DAY - NO FORCE"
[25] "B&E NON-RESIDENCE NIGHT - FORCE"
[26] "B&E NON-RESIDENCE NIGHT - ATTEMPT FORCE"
[27] "B&E NON-RESIDENCE DAY - FORCIBLE"
[28] "BURGLARY - COMMERICAL - FORCE"
[29] "BURGLARY - COMMERICAL - ATTEMPT"
[30] "B&E NON-RESIDENCE DAY - ATTEMPT FORCE"
[31] "BURGLARY - COMMERICAL - NO FORCE"
[32] "B&E NON-RESIDENCE DAY - NO FORCE"
[33] "BURGLARY - OTHER - FORCE"
[34] "BURGLARY - OTHER - ATTEMPT"
[35] "BURGLARY - OTHER - NO FORCE"
[36] "LARCENY PICK-POCKET"
[37] "LARCENY PICK-POCKET $200 & OVER"
[38] "LARCENY PURSE SNATCH INCL.NO FORCE $200 & OVER"
[39] "LARCENY PURSE SNATCH - NO FORCE "
[40] "LARCENY SHOPLIFTING $200 & OVER"
[41] "LARCENY SHOPLIFTING"
[42] "LARCENY THEFT FROM MV - NON-ACCESSORY"
[43] "LARCENY NON-ACCESSORY FROM VEH. $200 & OVER"
[44] "LARCENY VEH. ACCESSORY $200 & OVER"
[45] "LARCENY THEFT OF MV PARTS & ACCESSORIES"
[46] "LARCENY BICYCLE $200 & OVER"
[47] "LARCENY THEFT OF BICYCLE"
[48] "LARCENY IN A BUILDING $200 & OVER"
[49] "LARCENY THEFT FROM BUILDING"
[50] "LARCENY FROM COIN MACHINE $200 AND OVER"
[51] "LARCENY THEFT FROM COIN-OP MACHINE"
[52] "LARCENY ALL OTHERS"
[53] "LARCENY OTHER $200 & OVER"
[54] "RECOVERED STOLEN PLATE"
[55] "AUTO THEFT - MOTORCYCLE"
[56] "AUTO THEFT - MOTORCYCLE / SCOOTER"
[57] "AUTO THEFT"
[58] "AUTO THEFT - LEASED/RENTED VEHICLE"
[59] "AUTO THEFT LEASE/RENT VEHICLE"
[60] "RECOVERED - MV RECOVERED IN BOSTON (STOLEN OUTSIDE BOSTON)"
[61] "AUTO THEFT - OUTSIDE - RECOVERED IN BOSTON"
[62] "ASSAULT - SIMPLE"
[63] "SIMPLE ASSAULT"
[64] "ASSAULT & BATTERY"
[65] "ASSAULT SIMPLE - BATTERY"
[66] "ARSON"
[67] "COUNTERFEITING"
[68] "FORGERY / COUNTERFEITING"
[69] "FRAUD - FALSE PRETENSE"
[70] "FRAUD - FALSE PRETENSE / SCHEME"
[71] "FRAUD - CREDIT CARD / ATM FRAUD"
[72] "FRAUD - LARCENY BY SCHEME"
[73] "FRAUD - IMPERSONATION"
[74] "FRAUD - WELFARE"
[75] "FRAUD - WIRE"
[76] "EMBEZZLEMENT"
[77] "STOLEN PROPERTY - BUYING / RECEIVING / POSSESSING"
[78] "PROPERTY - STOLEN THEN RECOVERED"
[79] "VANDALISM"
[80] "VANDALISM - GRAFFITI"
[81] "GRAFFITI"
[82] "WEAPON - FIREARM - CARRYING / POSSESSING, ETC"
[83] "FIREARM/WEAPON - CARRY - SELL - RENT"
[84] "WEAPON - FIREARM - SALE / TRAFFICKING"
[85] "FIREARM/WEAPON - VIOLATION"
[86] "WEAPON - OTHER - CARRYING / POSSESSING, ETC"
[87] "FIREARM/WEAPON - POSSESSION OF DANGEROUS"
[88] "WEAPON - OTHER - OTHER VIOLATION"
[89] "WEAPON - FIREARM - OTHER VIOLATION"
[90] "PROSTITUTION"
[91] "PROSTITUTION - SOLICITING"
[92] "PROSTITUTION - ASSISTING OR PROMOTING"
[93] "PROSTITUTE - DERIVING SUPPORT"
[94] "DRUGS - CLASS A TRAFFICKING OVER 18 GRAMS"
[95] "DRUGS - CLASS B TRAFFICKING OVER 18 GRAMS"
[96] "DRUGS - CLASS D TRAFFICKING OVER 50 GRAMS"
[97] "DRUGS - SALE / MANUFACTURING"
[98] "DRUGS - POSSESSION"
[99] "DRUGS - POSSESSION OF DRUG PARAPHANALIA"
[100] "DRUGS - SICK ASSIST - HEROIN"
[101] "DRUGS - SICK ASSIST - OTHER NARCOTIC"
[102] "DRUGS - SICK ASSIST - OTHER HARMFUL DRUG"
[103] "DRUGS - POSS CLASS A - INTENT TO MFR DIST DISP"
[104] "DRUGS - POSS CLASS A - HEROIN, ETC. "
[105] "DRUGS - POSS CLASS A - HEROIN, ETC."
[106] "DRUGS - POSS CLASS B - INTENT TO MFR DIST DISP"
[107] "DRUGS - PRESENT AT HEROIN"
[108] "DRUGS - POSS CLASS C"
[109] "DRUGS - POSS CLASS D"
[110] "DRUGS - POSS CLASS E"
[111] "DRUGS - POSS CLASS C - INTENT TO MFR DIST DISP"
[112] "DRUGS - TRAFFICKING IN COCAINE"
[113] "DRUGS - POSS CLASS D - INTENT TO MFR DIST DISP"
[114] "DRUGS - POSS CLASS B - COCAINE, ETC."
[115] "DRUGS - POSS CLASS E - INTENT TO MFR DIST DISP"
[116] "DRUGS - CONSP TO VIOL CONTROLLED SUBSTANCE"
[117] "DRUGS - CONSP TO VIOL CONT SUB ACT"
[118] "DRUGS - OTHER"
[119] "VIOL. OF RESTRAINING ORDER W ARREST"
[120] "VIOLATION - RESTRAINING ORDER"
[121] "VIOL. OF RESTRAINING ORDER W NO ARREST"
[122] "HOME INVASION"
[123] "OPERATING UNDER THE INFLUENCE ALCOHOL"
[124] "OPERATING UNDER INFLUENCE - ALCOHOL"
[125] "OPERATING UNDER INFLUENCE - DRUGS"
[126] "OPERATING UNDER THE INFLUENCE DRUGS"
[127] "LIQUOR LAW VIOLATION"
[128] "LIQUOR - VIOLATION "
[129] "AFFRAY"
[130] "DISTURBING THE PEACE"
[131] "DISORDERLY PERSON"
[132] "DISORDERLY CONDUCT"
[133] "ANNOYING AND ACCOSTIN"
[134] "ANNOYING AND ACCOSTING"
[135] "KIDNAPPING - ENTICING OR ATTEMPTED"
[136] "EXTORTION OR BLACKMAIL"
[137] "CHINS"
[138] "TRESPASSING"
[139] "FIRE REPORT/ALARM - FALSE"
[140] "ANIMAL ABUSE"
[141] "POSSESSION OF BURGLARIOUS TOOLS"
[142] "CONSPIRACY EXCEPT DRUG LAW"
[143] "EXPLOSIVES - POSSESSION OR USE"
[144] "FUGITIVE FROM JUSTICE"
[145] "KIDNAPPING/CUSTODIAL KIDNAPPING"
[146] "KIDNAPPING - FORCE"
[147] "OBSCENE PHONE CALLS"
[148] "HARASSMENT"
[149] "PROPERTY - CONCEALING LEASED"
[150] "EVADING FARE"
[151] "PRISONER ESCAPE / ESCAPE & RECAPTURE"
[152] "PRISONER - ESCAPE"
[153] "VIOLATION - HAWKER AND PEDDLER"
[154] "TRUANCY / RUNAWAY"
[155] "TRUANCY"
[156] "LIQUOR - DRINKING IN PUBLIC"
[157] "THREATS TO DO BODILY HARM"
[158] "BOMB THREAT"
[159] "VIOLATION - CITY ORDINANCE"
[160] "OTHER OFFENSE"
[161] "BALLISTICS EVIDENCE/FOUND"
[162] "CRIMINAL HARASSMENT"
[163] "CRIMINAL HARRASSMENT"
[164] "BIOLOGICAL THREATS"
[165] "VAL - VIOLATION OF AUTO LAW - OTHER"
[166] "VAL - OPERATING WITHOUT LICENSE"
[167] "VAL - OPERATING UNREG/UNINS CAR"
[168] "VAL - OPERATING UNREG/UNINS CAR"
[169] "VAL - OPERATING AFTER REV/SUSP."
[170] "M/V - LEAVING SCENE - PROPERTY DAMAGE"
[171] "VAL - OPERATING W/O AUTHORIZATION LAWFUL"
[172] "DEATH INVESTIGATION"
[173] "ANIMAL CONTROL - DOG BITES - ETC."
[174] "INJURY BICYCLE NO M/V INVOLVED"
[175] "SICK/INJURED/MEDICAL - PERSON"
[176] "SUDDEN DEATH"
[177] "SUICIDE"
[178] "SUICIDE / SUICIDE ATTEMPT"
[179] "FIREARM/WEAPON - ACCIDENTAL INJURY / DEATH"
[180] "FIREARM/WEAPON - ACCIDENTAL INJURY"
[181] "SICK/INJURED/MEDICAL - POLICE"
[182] "PRISONER - SUICIDE / SUICIDE ATTEMPT"
[183] "PRISONER - SUICIDE ATTEMPT"
[184] "INVESTIGATION FOR ANOTHER AGENCY"
[185] "PROPERTY - ACCIDENTAL DAMAGE"
[186] "FIRE REPORT - HOUSE, BUILDING, ETC."
[187] "FIRE REPORT - HOUSE, BUILDING, ETC. "
[188] "SERVICE TO OTHER PD INSIDE OF MA."
[189] "SERVICE TO OTHER PD OUTSIDE OF MA."
[190] "LICENSE PREMISE VIOLATION"
[191] "LANDLORD - TENANT SERVICE"
[192] "HARBOR INCIDENTS"
[193] "HARBOR INCIDENT / VIOLATION"
[194] "FIREARM/WEAPON - FOUND OR CONFISCATED"
[195] "AIRCRAFT INCIDENTS"
[196] "EXPLOSIVES - TURNED IN OR FOUND"
[197] "WARRANT ARREST"
[198] "SEARCH WARRANT"
[199] "FIRE REPORT - CAR, BRUSH, ETC."
[200] "FIRE REPORT - CAR, BRUSH, ETC"
[201] "INTIMIDATING WITNESS"
[202] "PROPERTY - LOST"
[203] "PROPERTY - LOST THEN LOCATED"
[204] "FIREARM/WEAPON - LOST"
[205] "FIREARM/WEAPON - LOST "
[206] "M/V PLATES - LOST"
[207] "PROPERTY - FOUND"
[208] "PROPERTY - MISSING"
[209] "VERBAL DISPUTE"
[210] "NOISY PARTY/RADIO/ETC."
[211] "NOISY PARTY/RADIO-ARREST"
[212] "NOISY PARTY/RADIO-NO ARREST"
[213] "DEMONSTRATIONS / RIOT"
[214] "DEMONSTRATIONS/RIOT"
[215] "ANIMAL INCIDENTS"
[216] "SAFEKEEPING"
[217] "PROTECTIVE CUSTODY / SAFEKEEPING"
[218] "TOWED MOTOR VEHICLE"
[219] "MISSING PERSON"
[220] "MISSING PERSON - LOCATED"
[221] "MISSING PERSON - NOT REPORTED - LOCATED"
[222] "REPORT AFFECTING OTHER DEPTS."
[223] "DANGEROUS OR HAZARDOUS CONDITION"
[224] "M/V ACCIDENT - OTHER"
[225] "M/V ACCIDENT - PROPERTY DAMAGE"
[226] "M/V ACCIDENT - PERSONAL INJURY"
[227] "M/V ACCIDENT - POLICE VEHICLE"
[228] "M/V ACCIDENT - OTHER CITY VEHICLE"
[229] "M/V ACCIDENT - INVOLVING BICYCLE - INJURY"
[230] "M/V ACCIDENT - INVOLVING BICYCLE - NO INJURY"
[231] "M/V ACCIDENT INVOLVING PEDESTRIAN - INJURY"
[232] "M/V ACCIDENT - INVOLVING PEDESTRIAN - NO INJURY"
[233] "M/V - LEAVING SCENE - PERSONAL INJURY"
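Some names in the output above differ only by trailing whitespace, for example "FIREARM/WEAPON - LOST" and "FIREARM/WEAPON - LOST ". A minimal sketch of normalizing such names before counting follows; the helper name `normalize_name` is hypothetical. It does not merge names that differ in spelling, such as "ANNOYING AND ACCOSTIN" and "ANNOYING AND ACCOSTING".

```r
# Offence names copied from the output above; two of them carry a trailing space
names_raw <- c("FIREARM/WEAPON - LOST", "FIREARM/WEAPON - LOST ", "LIQUOR - VIOLATION ")

# Hypothetical helper: trim leading/trailing whitespace and
# collapse internal runs of whitespace into a single space
normalize_name <- function(x) gsub("\\s+", " ", trimws(x))

names_clean <- normalize_name(names_raw)
unique(names_clean)  # "FIREARM/WEAPON - LOST" "LIQUOR - VIOLATION"
```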
table(crime_dataset$OFFENCE_NAME)
AFFRAY
210
AIRCRAFT INCIDENTS
172
ANIMAL ABUSE
132
ANIMAL CONTROL - DOG BITES - ETC.
188
ANIMAL INCIDENTS
1192
ANNOYING AND ACCOSTIN
13
ANNOYING AND ACCOSTING
13
ARSON
194
ASSAULT - AGGRAVATED
6045
ASSAULT - AGGRAVATED - BATTERY
1128
ASSAULT - SIMPLE
9553
ASSAULT & BATTERY
3900
ASSAULT & BATTERY D/W - OTHER ON POLICE OFFICER
6045
ASSAULT D/W - OTHER
1128
ASSAULT SIMPLE - BATTERY
3900
AUTO THEFT
7542
AUTO THEFT - LEASED/RENTED VEHICLE
413
AUTO THEFT - MOTORCYCLE
985
AUTO THEFT - MOTORCYCLE / SCOOTER
985
AUTO THEFT - OUTSIDE - RECOVERED IN BOSTON
912
AUTO THEFT LEASE/RENT VEHICLE
413
B&E NON-RESIDENCE DAY - ATTEMPT FORCE
25
B&E NON-RESIDENCE DAY - FORCIBLE
1420
B&E NON-RESIDENCE DAY - NO FORCE
51
B&E NON-RESIDENCE NIGHT - ATTEMPT FORCE
74
B&E NON-RESIDENCE NIGHT - FORCE
13
B&E RESIDENCE DAY - ATTEMPT FORCE
131
B&E RESIDENCE DAY - FORCE
3068
B&E RESIDENCE DAY - NO FORCE
378
BALLISTICS EVIDENCE/FOUND
2120
BIOLOGICAL THREATS
2
BOMB THREAT
148
BURGLARY - COMMERICAL - ATTEMPT
25
BURGLARY - COMMERICAL - FORCE
1420
BURGLARY - COMMERICAL - NO FORCE
51
BURGLARY - OTHER - ATTEMPT
5
BURGLARY - OTHER - FORCE
27
BURGLARY - OTHER - NO FORCE
30
BURGLARY - RESIDENTIAL - ATTEMPT
131
BURGLARY - RESIDENTIAL - FORCE
3068
BURGLARY - RESIDENTIAL - NO FORCE
378
CHINS
81
CONSPIRACY EXCEPT DRUG LAW
6
COUNTERFEITING
1057
CRIMINAL HARASSMENT
3709
CRIMINAL HARRASSMENT
3709
DANGEROUS OR HAZARDOUS CONDITION
540
DEATH INVESTIGATION
1562
DEMONSTRATIONS / RIOT
63
DEMONSTRATIONS/RIOT
63
DISORDERLY CONDUCT
119
DISORDERLY PERSON
119
DISTURBING THE PEACE
1256
DRUGS - CLASS A TRAFFICKING OVER 18 GRAMS
58
DRUGS - CLASS B TRAFFICKING OVER 18 GRAMS
40
DRUGS - CLASS D TRAFFICKING OVER 50 GRAMS
5
DRUGS - CONSP TO VIOL CONT SUB ACT
11
DRUGS - CONSP TO VIOL CONTROLLED SUBSTANCE
11
DRUGS - OTHER
268
DRUGS - POSS CLASS A - HEROIN, ETC.
243
DRUGS - POSS CLASS A - HEROIN, ETC.
243
DRUGS - POSS CLASS A - INTENT TO MFR DIST DISP
406
DRUGS - POSS CLASS B - COCAINE, ETC.
766
DRUGS - POSS CLASS B - INTENT TO MFR DIST DISP
658
DRUGS - POSS CLASS C
58
DRUGS - POSS CLASS C - INTENT TO MFR DIST DISP
38
DRUGS - POSS CLASS D
50
DRUGS - POSS CLASS D - INTENT TO MFR DIST DISP
145
DRUGS - POSS CLASS E
17
DRUGS - POSS CLASS E - INTENT TO MFR DIST DISP
5
DRUGS - POSSESSION
3
DRUGS - POSSESSION OF DRUG PARAPHANALIA
92
DRUGS - PRESENT AT HEROIN
513
DRUGS - SALE / MANUFACTURING
5489
DRUGS - SICK ASSIST - HEROIN
171
DRUGS - SICK ASSIST - OTHER HARMFUL DRUG
2403
DRUGS - SICK ASSIST - OTHER NARCOTIC
5509
DRUGS - TRAFFICKING IN COCAINE
38
EMBEZZLEMENT
428
EVADING FARE
390
EXPLOSIVES - POSSESSION OR USE
62
EXPLOSIVES - TURNED IN OR FOUND
46
EXTORTION OR BLACKMAIL
724
FIRE REPORT - CAR, BRUSH, ETC
85
FIRE REPORT - CAR, BRUSH, ETC.
85
FIRE REPORT - HOUSE, BUILDING, ETC.
1697
FIRE REPORT - HOUSE, BUILDING, ETC.
1697
FIRE REPORT/ALARM - FALSE
150
FIREARM/WEAPON - ACCIDENTAL INJURY
7
FIREARM/WEAPON - ACCIDENTAL INJURY / DEATH
7
FIREARM/WEAPON - CARRY - SELL - RENT
1181
FIREARM/WEAPON - FOUND OR CONFISCATED
1300
FIREARM/WEAPON - LOST
18
FIREARM/WEAPON - LOST
18
FIREARM/WEAPON - POSSESSION OF DANGEROUS
107
FIREARM/WEAPON - VIOLATION
1
FORGERY / COUNTERFEITING
1057
FRAUD - CREDIT CARD / ATM FRAUD
2104
FRAUD - FALSE PRETENSE
5066
FRAUD - FALSE PRETENSE / SCHEME
5066
FRAUD - IMPERSONATION
2140
FRAUD - LARCENY BY SCHEME
2140
FRAUD - WELFARE
1141
FRAUD - WIRE
709
FUGITIVE FROM JUSTICE
302
GRAFFITI
392
HARASSMENT
1176
HARBOR INCIDENT / VIOLATION
229
HARBOR INCIDENTS
229
HOME INVASION
2
INJURY BICYCLE NO M/V INVOLVED
114
INTIMIDATING WITNESS
198
INVESTIGATION FOR ANOTHER AGENCY
52
KIDNAPPING - ENTICING OR ATTEMPTED
11
KIDNAPPING - FORCE
35
KIDNAPPING/CUSTODIAL KIDNAPPING
35
LANDLORD - TENANT SERVICE
2602
LARCENY ALL OTHERS
6259
LARCENY BICYCLE $200 & OVER
3041
LARCENY FROM COIN MACHINE $200 AND OVER
14
LARCENY IN A BUILDING $200 & OVER
7431
LARCENY NON-ACCESSORY FROM VEH. $200 & OVER
8720
LARCENY OTHER $200 & OVER
6259
LARCENY PICK-POCKET
256
LARCENY PICK-POCKET $200 & OVER
256
LARCENY PURSE SNATCH - NO FORCE
84
LARCENY PURSE SNATCH INCL.NO FORCE $200 & OVER
84
LARCENY SHOPLIFTING
8936
LARCENY SHOPLIFTING $200 & OVER
8936
LARCENY THEFT FROM BUILDING
7431
LARCENY THEFT FROM COIN-OP MACHINE
14
LARCENY THEFT FROM MV - NON-ACCESSORY
8720
LARCENY THEFT OF BICYCLE
3041
LARCENY THEFT OF MV PARTS & ACCESSORIES
2099
LARCENY VEH. ACCESSORY $200 & OVER
2099
LICENSE PREMISE VIOLATION
2998
LIQUOR - DRINKING IN PUBLIC
1278
LIQUOR - VIOLATION
35
LIQUOR LAW VIOLATION
35
M/V - LEAVING SCENE - PERSONAL INJURY
1200
M/V - LEAVING SCENE - PROPERTY DAMAGE
19533
M/V ACCIDENT - INVOLVING BICYCLE - INJURY
775
M/V ACCIDENT - INVOLVING BICYCLE - NO INJURY
327
M/V ACCIDENT - INVOLVING PEDESTRIAN - NO INJURY
432
M/V ACCIDENT - OTHER
4822
M/V ACCIDENT - OTHER CITY VEHICLE
853
M/V ACCIDENT - PERSONAL INJURY
4078
M/V ACCIDENT - POLICE VEHICLE
916
M/V ACCIDENT - PROPERTY DAMAGE
7727
M/V ACCIDENT INVOLVING PEDESTRIAN - INJURY
1742
M/V PLATES - LOST
1258
MANSLAUGHTER - TRAIN ETC. VICTIM NON-NEGLIGENCE
1
MANSLAUGHTER - VEHICLE - NEGLIGENCE
8
MISSING PERSON
2908
MISSING PERSON - LOCATED
8748
MISSING PERSON - NOT REPORTED - LOCATED
1584
MURDER NON-NEGLIGIENT MANSLAUGHTER
148
MURDER, NON-NEGLIGIENT MANSLAUGHTER
148
NOISY PARTY/RADIO-ARREST
3
NOISY PARTY/RADIO-NO ARREST
342
NOISY PARTY/RADIO/ETC.
3
OBSCENE PHONE CALLS
30
OPERATING UNDER INFLUENCE - ALCOHOL
371
OPERATING UNDER INFLUENCE - DRUGS
40
OPERATING UNDER THE INFLUENCE ALCOHOL
371
OPERATING UNDER THE INFLUENCE DRUGS
40
OTHER OFFENSE
310
POSSESSION OF BURGLARIOUS TOOLS
52
PRISONER - ESCAPE
1
PRISONER - SUICIDE / SUICIDE ATTEMPT
32
PRISONER - SUICIDE ATTEMPT
32
PRISONER ESCAPE / ESCAPE & RECAPTURE
1
PROPERTY - ACCIDENTAL DAMAGE
1892
PROPERTY - CONCEALING LEASED
16
PROPERTY - FOUND
7782
PROPERTY - LOST
18412
PROPERTY - LOST THEN LOCATED
400
PROPERTY - MISSING
388
PROPERTY - STOLEN THEN RECOVERED
274
PROSTITUTE - DERIVING SUPPORT
2
PROSTITUTION
26
PROSTITUTION - ASSISTING OR PROMOTING
2
PROSTITUTION - SOLICITING
118
PROTECTIVE CUSTODY / SAFEKEEPING
17
RECOVERED - MV RECOVERED IN BOSTON (STOLEN OUTSIDE BOSTON)
912
RECOVERED STOLEN PLATE
7
REPORT AFFECTING OTHER DEPTS.
78
ROBBERY - BANK
16
ROBBERY - CAR JACKING
8
ROBBERY - COMMERCIAL
65
ROBBERY - FIREARM - BANK
2964
ROBBERY - HOME INVASION
19
ROBBERY - KNIFE - CHAIN STORE
65
ROBBERY - OTHER
83
ROBBERY - STREET
2964
ROBBERY ATTEMPT - KNIFE - CHAIN STORE
16
ROBBERY ATTEMPT - OTHER WEAPON - MISCELLANEOUS
83
SAFEKEEPING
17
SEARCH WARRANT
1162
SERVICE TO OTHER PD INSIDE OF MA.
1470
SERVICE TO OTHER PD OUTSIDE OF MA.
1944
SICK/INJURED/MEDICAL - PERSON
23558
SICK/INJURED/MEDICAL - POLICE
3978
SIMPLE ASSAULT
9553
STOLEN PROPERTY - BUYING / RECEIVING / POSSESSING
527
SUDDEN DEATH
3804
SUICIDE
202
SUICIDE / SUICIDE ATTEMPT
202
THREATS TO DO BODILY HARM
14786
TOWED MOTOR VEHICLE
24204
TRESPASSING
3142
TRUANCY
16
TRUANCY / RUNAWAY
16
VAL - OPERATING AFTER REV/SUSP.
1780
VAL - OPERATING UNREG/UNINS CAR
166
VAL - OPERATING UNREG/UNINS CAR
166
VAL - OPERATING W/O AUTHORIZATION LAWFUL
364
VAL - OPERATING WITHOUT LICENSE
7680
VAL - VIOLATION OF AUTO LAW - OTHER
319
VANDALISM
26022
VANDALISM - GRAFFITI
392
VERBAL DISPUTE
17166
VIOL. OF RESTRAINING ORDER W ARREST
25
VIOL. OF RESTRAINING ORDER W NO ARREST
205
VIOLATION - CITY ORDINANCE
634
VIOLATION - HAWKER AND PEDDLER
16
VIOLATION - RESTRAINING ORDER
25
WARRANT ARREST
1824
WEAPON - FIREARM - CARRYING / POSSESSING, ETC
1181
WEAPON - FIREARM - OTHER VIOLATION
14
WEAPON - FIREARM - SALE / TRAFFICKING
1
WEAPON - OTHER - CARRYING / POSSESSING, ETC
107
WEAPON - OTHER - OTHER VIOLATION
14
Analysis and Visualization
1) What are the various crime categories in Boston, and which crimes are most commonly committed among these categories?
We have previously seen the various types of crimes that are committed in Boston. Now, we will analyze which crime is most commonly committed.
First, we need the frequency of the various crimes:
# Count incidents per offence name, most frequent first
crime_freq <- crime_dataset %>%
  group_by(OFFENCE_NAME) %>%
  summarize(crime_count = n()) %>%
  arrange(desc(crime_count))
crime_freq
Now we can use a bar chart to visualize the top 10 crimes that occur in Boston
ggplot(head(crime_freq,10), aes(x = reorder(OFFENCE_NAME, -crime_count), y = crime_count)) +
geom_bar(stat = "identity", fill = "lightpink") +
labs(x = "Crime Types", y = "Number of crimes", title = "Crime Distribution in Boston") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
Interpretation: Vandalism, Towed Motor Vehicle, Property Damage, Assault, and Larceny Shoplifting are among the most common incident types in Boston, with Vandalism occurring most often. A bar chart makes this comparison straightforward: the length of each bar corresponds directly to the number of occurrences, so the most common crimes are easy to identify.
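Raw counts can be complemented with each category's share of all incidents, which makes the ranking easier to compare across datasets of different sizes. The following is a minimal sketch on toy data; the `share` column and the toy values are illustrative and not results from the report.

```r
library(dplyr)

# Toy data (illustrative values only)
toy <- data.frame(
  OFFENCE_NAME = c("VANDALISM", "VANDALISM", "ARSON", "VANDALISM", "TRESPASSING"),
  stringsAsFactors = FALSE
)

# Count incidents per offence, most frequent first, and add the share of the total
crime_share <- toy %>%
  count(OFFENCE_NAME, name = "crime_count", sort = TRUE) %>%
  mutate(share = crime_count / sum(crime_count))

crime_share
```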
2) What is the trend in crime over the years in Boston? How do the different years compare to each other? Is crime increasing every year?
We can plot a line graph that gives an overview of total crimes in Boston from 2019 to 2022:
# Count incidents per calendar day
crime_dates <- crime_dataset %>%
  mutate(Date = as.Date(OCCURRED_ON_DATE)) %>%
  count(Date) %>%
  mutate(Year = year(Date), Month = month(Date, label = TRUE))
ggplot(crime_dates, aes(x = Date, y = n)) +
  geom_line(color = "steelblue") +
  labs(x = "Date", y = "Number of Crimes", title = "Temporal Patterns of Reported Crimes")
Below, we compare the individual years and look at their trends:
plots <- list()
unique_years <- unique(crime_dates$Year)
# Build one line plot per year
for (year in unique_years) {
  filtered_data <- crime_dates %>% filter(Year == year)
  plot <- ggplot(filtered_data, aes(x = Date, y = n)) +
    geom_line(color = "steelblue") +
    labs(x = "Date", y = "Number of Crimes", title = paste("Temporal Patterns of Reported Crimes -", year))
  plots[[as.character(year)]] <- plot
}
grid.arrange(grobs = plots, nrow = length(plots), ncol = 1)
Finally, we have a graph that shows us the total crime per year and how they differed:
# Total incidents per year
crime_counts <- crime_dataset %>%
  group_by(YEAR) %>%
  summarize(Count = n())
ggplot(crime_counts, aes(x = YEAR, y = Count)) +
  geom_point(color = "steelblue") +
  labs(x = "Year", y = "Number of Crimes", title = "Trend in Crime Over Years")
Interpretation: The number of reported crimes dropped after 2019 and then slowly began to rise again. The year 2020 stands out, with far fewer crimes than the other years, which may be due to the COVID-19 pandemic. Crime has increased slowly since 2020 but remains below the 2019 level, which is a good sign. This may be due to improved law enforcement and to social programs and support.
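Yearly totals are only comparable if every year is fully covered by the data. Whether that holds for this dataset has not been checked here, so the following is a sketch under that caveat: it computes the average number of incidents per recorded day for each year, which stays comparable even for a partial year. The toy rows and the column names `Crimes`, `Days`, and `CrimesPerDay` are illustrative.

```r
library(dplyr)

# Toy data (illustrative values only)
toy <- data.frame(
  OCCURRED_ON_DATE = c("2019-01-01 10:00:00", "2019-01-01 12:00:00",
                       "2019-01-02 09:00:00", "2020-01-01 08:00:00"),
  stringsAsFactors = FALSE
)

# Incidents per recorded day, by year
daily_rate <- toy %>%
  mutate(Date = as.Date(OCCURRED_ON_DATE),
         Year = as.integer(format(Date, "%Y"))) %>%
  group_by(Year) %>%
  summarise(Crimes = n(),
            Days = n_distinct(Date),
            CrimesPerDay = Crimes / Days)

daily_rate
```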
3) Which hours of the day have the highest number of crimes in Boston? Does this pattern change from year to year?
crime_hour_plot <- ggplot(crime_dataset, aes(x = HOUR)) +
  geom_bar(fill = "lightsalmon", color = "black") +
  labs(x = "HOUR", y = "Number of Crimes", title = "Crimes During Different Hours")
crime_hour_plot +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) # Rotate x-axis labels if needed
# Filter data for each year and create separate plots
plots <- list()
unique_years <- unique(crime_dataset$YEAR)
for (year in unique_years) {
  filtered_data <- crime_dataset %>% filter(YEAR == year)
  # Create a bar graph for each year
  plot <- ggplot(filtered_data, aes(x = HOUR)) +
    geom_bar(fill = "lightsalmon", color = "black") +
    labs(x = "Hour", y = "Number of Crimes", title = paste("Crimes During Different Hours -", year)) +
    theme(axis.text.x = element_text(angle = 90, hjust = 1)) # Rotate x-axis labels if needed
  plots[[as.character(year)]] <- plot
}
# Combine and display the plots stacked in a single column
library(gridExtra)
grid.arrange(grobs = plots, nrow = length(plots))
Interpretation: Most crimes appear to happen at 12 am, and this holds for every year. Crimes also decline after 12 am. There are several possible reasons. Fewer people are on the streets during these hours, so there are fewer potential victims and fewer opportunities for crime. The daily routines of most individuals also involve sleeping during these hours.
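Because the yearly totals differ, comparing hourly patterns across years is easier with shares than with raw counts. The following is a minimal sketch on toy data that computes, for each year, the fraction of that year's incidents falling in each hour; the `share` column and the toy values are illustrative.

```r
library(dplyr)

# Toy data (illustrative values only)
toy <- data.frame(
  YEAR = c(2019, 2019, 2019, 2020, 2020),
  HOUR = c(0, 0, 12, 0, 12)
)

# Share of each year's incidents by hour
hour_share <- toy %>%
  count(YEAR, HOUR) %>%
  group_by(YEAR) %>%
  mutate(share = n / sum(n)) %>%
  ungroup()

hour_share
```

Plotting `share` instead of the count on the y-axis would put all years on the same scale.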
4) What is the trend in crime across seasons in Boston? How do the different years compare to each other?
# Assign each incident to a season by month, then count incidents per season
crime_season <- crime_dataset %>%
  mutate(Season = case_when(
    month(OCCURRED_ON_DATE) %in% c(3, 4, 5) ~ "Spring",
    month(OCCURRED_ON_DATE) %in% c(6, 7, 8) ~ "Summer",
    month(OCCURRED_ON_DATE) %in% c(9, 10, 11) ~ "Autumn",
    TRUE ~ "Winter"
  )) %>%
  group_by(Season) %>%
  summarise(Count = n())
Bar plot of crime distribution across seasons:
ggplot(crime_season, aes(x = Season, y = Count)) +
geom_bar(stat = "identity", fill = "skyblue") +
labs(x = "Season", y = "Number of Crimes", title = "Crime Counts by Season")
# Filter data for each year and create separate plots
plots <- list()
unique_years <- unique(crime_dataset$YEAR)
for (year in unique_years) {
  filtered_data <- crime_dataset %>% filter(YEAR == year)
  # Calculate crime counts by season for each year
  crime_season <- filtered_data %>%
    mutate(Season = case_when(
      month(OCCURRED_ON_DATE) %in% c(3, 4, 5) ~ "Spring",
      month(OCCURRED_ON_DATE) %in% c(6, 7, 8) ~ "Summer",
      month(OCCURRED_ON_DATE) %in% c(9, 10, 11) ~ "Autumn",
      TRUE ~ "Winter"
    )) %>%
    group_by(Season) %>%
    summarise(Count = n())
  # Create a bar plot for each year
  plot <- ggplot(crime_season, aes(x = Season, y = Count)) +
    geom_bar(stat = "identity", fill = "skyblue") +
    geom_text(aes(label = Count), vjust = -0.5, color = "black", size = 2) + # Add total count labels on each bar
    labs(x = "Season", y = "Number of Crimes", title = paste("Crime Counts by Season -", year))
  # Store the plot under its year so it can be arranged later
  plots[[as.character(year)]] <- plot
}
# Combine the per-year plots into one figure, one plot per row
# (the gridExtra argument is `nrow`; `nrows` is not recognised)
library(gridExtra)
grid.arrange(grobs = plots, nrow = length(plots))
Interpretation: In the overall data, Summer has the most crimes, and the same holds when we look at the individual years: crimes occur more often during Summer than during Spring, Winter, or Autumn. This can be due to various reasons. One is increased outdoor activity: during summer, people tend to spend more time outdoors, and this higher outdoor presence can create more opportunities for crimes to occur. Another is the vacation season: many people take vacations during the summer months and leave their homes unattended, which can increase the likelihood of burglaries and property-related crimes.
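The month-to-season mapping is written out twice in the code above, once for the overall counts and once inside the per-year loop. As an illustration only, the sketch below moves it into a helper and counts every year and season in a single pass; `season_of`, `toy_crimes`, and `season_by_year` are hypothetical names, and the toy rows are invented rather than taken from the Boston data.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Hypothetical helper: same month-to-season mapping as in the analysis
season_of <- function(dates) {
  m <- month(dates)
  case_when(
    m %in% c(3, 4, 5)   ~ "Spring",
    m %in% c(6, 7, 8)   ~ "Summer",
    m %in% c(9, 10, 11) ~ "Autumn",
    TRUE                ~ "Winter"
  )
}

# Hypothetical toy data standing in for crime_dataset
toy_crimes <- data.frame(
  YEAR = c(2019, 2019, 2019, 2020, 2020),
  OCCURRED_ON_DATE = as.Date(c("2019-01-10", "2019-07-04", "2019-07-20",
                               "2020-04-01", "2020-10-31"))
)

# One pass over the data: counts for every year and season
season_by_year <- toy_crimes %>%
  mutate(Season = season_of(OCCURRED_ON_DATE)) %>%
  count(YEAR, Season, name = "Count")

# One panel per year instead of a list of separately built plots
season_plot <- ggplot(season_by_year, aes(x = Season, y = Count)) +
  geom_col(fill = "skyblue") +
  facet_wrap(~ YEAR) +
  labs(x = "Season", y = "Number of Crimes",
       title = "Crime Counts by Season and Year")
```

Faceting keeps all years on a shared y-axis by default, which makes year-to-year comparison easier than reading separately scaled plots.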
5) Which streets have the highest number of crimes in Boston? Where do most of the Vandalism, Murder, and Robbery crimes happen in Boston?
crimes_by_street <- crime_dataset %>%
  group_by(STREET) %>%
  summarize(TotalCrimes = n()) %>%
  top_n(10, TotalCrimes)
crimes_by_street <- crimes_by_street %>%
  arrange(desc(TotalCrimes))
ggplot(crimes_by_street, aes(x = reorder(STREET, -TotalCrimes), y = TotalCrimes)) +
geom_bar(stat = "identity", fill = "steelblue") +
labs(title = "Top 10 Streets with the Most Crimes in Boston",
x = "Street",
y = "Total Number of Crimes") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
coord_flip()
washington_st_crimes <- crime_dataset %>% filter(STREET == "WASHINGTON ST")
crime_counts <- washington_st_crimes %>% count(OFFENCE_NAME)
top_10_crimes <- crime_counts %>%
  arrange(desc(n)) %>%
  head(10)
top_10_crimes
ggplot(top_10_crimes, aes(x = OFFENCE_NAME, y = n)) +
geom_bar(stat = "identity", fill = "steelblue") +
xlab("Crime Type") +
ylab("Number of Crimes") +
ggtitle("Top 10 Crimes on Washington St") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
crime_map <- crime_dataset %>%
  mutate(offense_group = case_when(
    str_detect(OFFENCE_NAME, regex("VANDALISM", ignore_case = TRUE)) ~ "Vandalism",
    str_detect(OFFENCE_NAME, regex("MURDER", ignore_case = TRUE)) ~ "Murder",
    str_detect(OFFENCE_NAME, regex("ROBBERY", ignore_case = TRUE)) ~ "Robbery",
    TRUE ~ "Other"
  ))
crime_map <- crime_map %>%
  filter(str_detect(OFFENCE_NAME, regex("VANDALISM|MURDER|ROBBERY", ignore_case = TRUE)))
offense_counts <- count(crime_map, offense_group)
ggplot(offense_counts, aes(x = offense_group, y = n)) +
  geom_bar(stat = "identity", fill = "lightgray") +
  labs(title = "Crime Offenses", x = "Offense Group", y = "Count")
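crimes_map <- crime_dataset %>%
  filter(str_detect(OFFENCE_NAME, regex("VANDALISM|MURDER|ROBBERY", ignore_case = TRUE)))
basemap <- addTiles(leaflet())
# Marker colours, in the same order as the offense types in the loop below
colors <- c('Red', 'Green', 'Blue')
i <- 1
crimes <- basemap
for (crime in c('VANDALISM', 'MURDER', 'ROBBERY')) {
  # Rows for the current offense type (renamed from `c` to avoid shadowing base::c)
  crime_points <- crimes_map[grepl(crime, crimes_map$OFFENCE_NAME, ignore.case = TRUE), ]
  # Add one layer of markers per offense type; fillOpacity must lie in [0, 1]
  crimes <- addCircleMarkers(setView(crimes, lng = -71.08, lat = 42.33, zoom = 12), lng = crime_points$Long, lat = crime_points$Lat, radius = 1, fillOpacity = 1, color = colors[i])
  i <- i + 1
}
crimes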
Interpretation: We see that Washington St has the highest number of crimes committed. Digging deeper, the most frequent type of crime on Washington St is PROPERTY - LOST. It is possible that there are a lot of thefts, or that people are simply misplacing things in the metro area. Comparing Vandalism, Murder, and Robbery, we see that Vandalism has the highest number of incidents and Murder the lowest. The map shows hotspots where these crimes occur, and in most areas Vandalism and Robbery happen in the same places. Another interesting finding from the map is that, as we move away from the main areas of Boston, there seems to be more vandalism than robbery. We can also clearly see that murders are rare, since there are hardly any green points compared to the red and blue points on the map.
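Reading the map depends on knowing that red, green, and blue stand for Vandalism, Murder, and Robbery. A possible way to make that explicit is a legend driven by a colour palette function. The sketch below assumes a data frame with an `offense_group` column like the one built for `crime_map`; `toy_points`, `groups`, `pal`, and `crime_leaflet` are hypothetical names, and the coordinates are made up.

```r
library(leaflet)

# Hypothetical toy data; coordinates are made-up points near Boston
toy_points <- data.frame(
  offense_group = c("Vandalism", "Murder", "Robbery", "Vandalism"),
  Lat  = c(42.33, 42.35, 42.31, 42.36),
  Long = c(-71.08, -71.06, -71.09, -71.10)
)

# Fixed mapping from offense group to colour, in the same order as the loop
groups <- c("Vandalism", "Murder", "Robbery")
pal <- colorFactor(palette = c("red", "green", "blue"),
                   domain = NULL, levels = groups)

# Single marker layer coloured by group, plus a legend explaining the colours
crime_leaflet <- leaflet(toy_points) %>%
  addTiles() %>%
  setView(lng = -71.08, lat = 42.33, zoom = 12) %>%
  addCircleMarkers(lng = ~Long, lat = ~Lat, radius = 3,
                   color = ~pal(offense_group), fillOpacity = 0.8) %>%
  addLegend(position = "bottomright", pal = pal,
            values = ~offense_group, title = "Offense group")
```

Passing `levels` explicitly keeps the colour order tied to the group order; otherwise the levels would be sorted alphabetically and the colours would no longer match the description in the text.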
6) Are there any significant differences in crime rates between weekdays and weekends in Boston?
crime_counts <- crime_dataset %>%
  group_by(DAY_OF_WEEK) %>%
  summarize(TotalCrimes = n()) %>%
  mutate(Weekend = ifelse(DAY_OF_WEEK %in% c("Saturday", "Sunday"), "Weekend", "Weekday"))
crime_counts$DAY_OF_WEEK <- factor(crime_counts$DAY_OF_WEEK,
                                   levels = c("Monday", "Tuesday", "Wednesday", "Thursday",
                                              "Friday", "Saturday", "Sunday"))
ggplot(crime_counts, aes(x = DAY_OF_WEEK, y = TotalCrimes, fill = Weekend)) +
geom_bar(stat = "identity") +
labs(title = "Crime Rates: Weekdays vs. Weekends",
x = "Day of the Week",
y = "Total Number of Crimes",
fill = "Weekend") +
scale_fill_manual(values = c("Weekday" = "steelblue", "Weekend" = "darkorange"))
Interpretation: From the graph we see that there is more crime on weekdays than on weekends. This might be due to a number of reasons, such as increased target availability. On weekdays, commercial establishments and busy areas are often more populated and active, making them potential targets for crimes such as thefts or robberies. Weekdays may also see more foot traffic, leading to higher chances of crimes like pickpocketing or street-level theft. In addition, during weekdays people typically follow a more predictable and structured routine, including going to work, school, or other regular activities. Criminals may take advantage of these patterns and target homes or other properties that are predictably unattended during those hours.
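Because there are five weekdays and only two weekend days, a comparison of the two groups is fairest on a per-day basis. The sketch below shows one way to compute that average from per-day totals shaped like `crime_counts`; the numbers in `toy_counts` are invented for illustration and are not results from the Boston data.

```r
library(dplyr)

# Hypothetical per-day totals; the numbers are invented for illustration
toy_counts <- data.frame(
  DAY_OF_WEEK = c("Monday", "Tuesday", "Wednesday", "Thursday",
                  "Friday", "Saturday", "Sunday"),
  TotalCrimes = c(100, 110, 105, 108, 115, 95, 85)
)

# Average number of crimes per day within each day type
day_type_summary <- toy_counts %>%
  mutate(Weekend = ifelse(DAY_OF_WEEK %in% c("Saturday", "Sunday"),
                          "Weekend", "Weekday")) %>%
  group_by(Weekend) %>%
  summarise(Days = n(),
            Total = sum(TotalCrimes),
            CrimesPerDay = Total / Days)

day_type_summary
```

With the real `crime_counts` in place of `toy_counts`, the `CrimesPerDay` column would give a like-for-like comparison that the raw group totals do not.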
Conclusion and Discussion
In conclusion, the Boston Crime Dataset provides useful insights into crime patterns and trends in Boston. Researchers, analysts, and data enthusiasts have widely used the dataset to study crime categories, analyze trends over time, examine crime occurrence by hour and season, identify high-crime areas, and compare crime rates on weekdays versus weekends. According to the data analysis, the most commonly committed types of crimes in Boston are Vandalism, Towed Motor Vehicle, Property Damage, Assault, and Larceny Shoplifting. The trend in crime over the years shows a decrease in crime beginning in 2019, with an anomaly in 2020, most likely influenced by the COVID-19 pandemic. Since 2020, the crime rate has gradually increased, but at a slower rate than in 2019. This may suggest that increased law enforcement and social programs contributed to the slower increase in crime.
When the hourly distribution of crimes is examined, the highest number of crimes is recorded at 12 a.m., and this pattern appears to be consistent across years. Crimes tend to decrease after 12 a.m., which may be attributed to factors such as reduced foot traffic and fewer potential victims. When crime rates are examined by season, Summer has the highest overall number of crimes. This may be due to an increase in outdoor activities and the vacation season, which create more opportunities for crime to occur.
According to the analysis of crime occurrence on different streets, Washington St has the highest number of crimes committed. Among Vandalism, Murder, and Robbery, Vandalism is the most frequent and Murder the least frequent. Vandalism and robberies occur in close proximity in certain areas of Boston, and as one moves away from Boston’s main areas, vandalism tends to outnumber robberies.
When comparing weekdays and weekends, crime counts are higher on weekdays than on weekends. This may be due to increased target availability, predictable routines, and higher foot traffic during the week; criminals may exploit these patterns to target individuals or properties that are predictably unattended. Overall, the Boston Crime Dataset provides researchers, policymakers, and law enforcement agencies with a comprehensive understanding of crime in the city, assisting them in their efforts to address and mitigate criminal activity. This dataset’s insights can be used to inform strategies for crime prevention, resource allocation, and the development of proactive measures to ensure the safety and well-being of the Boston community.
Bibliography
[1] https://data.boston.gov/dataset/crime-incident-reports-august-2015-to-date-source-new-system
[2] Posit team (2022). RStudio: Integrated Development Environment for R. Posit Software, PBC, Boston, MA. URL http://www.posit.co/.
[3] R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.