Data Reading and Wrangling
Below I print the first 6 rows of the purchase data. It comes from the FBI’s National Instant Criminal Background Check System (NICS). The purpose of NICS is to help Federal Firearms Licensees (FFLs) determine whether an individual who wants to buy a firearm is eligible to do so. This data is important because it helps uncover trends in firearm purchases and can inform legislation. The dataset summarizes this information by month and state:
month state permit permit_recheck handgun long_gun other
1 2022-02 Alabama 25401 499 21822 14541 1351
2 2022-02 Alaska 301 0 2644 2178 348
3 2022-02 Arizona 2560 473 20150 9935 1690
4 2022-02 Arkansas 1842 309 7780 5756 429
5 2022-02 California 15815 10550 36362 23017 4941
6 2022-02 Colorado 7055 19 19591 12402 1590
multiple admin prepawn_handgun prepawn_long_gun prepawn_other
1 1260 0 13 5 0
2 202 0 0 1 1
3 1153 0 11 1 0
4 515 4 15 9 1
5 1 0 1 2 0
6 1683 0 0 0 0
redemption_handgun redemption_long_gun redemption_other
1 2989 1109 12
2 127 74 2
3 1461 487 5
4 1323 1001 1
5 633 313 52
6 0 0 0
returned_handgun returned_long_gun returned_other rentals_handgun
1 36 0 0 0
2 23 9 0 0
3 180 12 1 0
4 0 0 0 0
5 1942 1090 183 0
6 296 53 5 0
rentals_long_gun private_sale_handgun private_sale_long_gun
1 0 28 29
2 0 2 4
3 0 15 13
4 0 5 11
5 0 7638 3090
6 0 0 0
private_sale_other return_to_seller_handgun
1 2 1
2 0 0
3 0 0
4 1 0
5 626 19
6 0 0
return_to_seller_long_gun return_to_seller_other totals
1 0 0 69098
2 0 0 5916
3 2 0 38149
4 0 0 19002
5 20 0 106295
6 0 0 42694
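The post does not show how `purchase` was loaded. As a minimal sketch, assuming the data is available as a CSV file with the column names shown above, the read step might look like the following. A three-row inline excerpt (a subset of columns copied from the printed output) stands in for the real file, and `csv_text` and `purchase_excerpt` are names introduced only for illustration:

```r
# Inline stand-in for the real CSV file (assumption: the actual data would be
# read from a file on disk instead of from this text excerpt).
csv_text <- "month,state,permit,handgun,long_gun,totals
2022-02,Alabama,25401,21822,14541,69098
2022-02,Alaska,301,2644,2178,5916
2022-02,Arizona,2560,20150,9935,38149"

purchase_excerpt <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Inspect the first rows and the column types.
print(head(purchase_excerpt))
str(purchase_excerpt)
```

With this excerpt, `month` and `state` are read as character and the count columns as integer, which matches the variable types described in this post.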
The 27 variables are as follows:
-Month (character) - describes the month and year, in the format YYYY-MM
-State (character) - state in the United States
-Permit (integer) - the total number of background checks made on individuals who want to possess a permit
-Permit Recheck (integer) - total number of times FFLs conducted a second background check on individuals who want to possess a permit
-Handgun (integer) - total number of buyer background checks made for purchase of handguns (one type of firearm)
-Long gun (integer) - total number of background checks made for purchase of long guns (another type of firearm)
-Other (integer) - total number of background checks made for purchase of firearms which are classified as neither handguns nor long guns
-Multiple (integer) - total number of background checks made for multiple guns at once
-Admin (integer) - total number of administrative checks
Definition of Prepawn Variables: total number of background checks requested by individuals who want to pledge or pawn a firearm as security for payment, before actually pledging/pawning
-Prepawn Handgun (integer) - total number of background checks requested by people who want to prepawn handguns
-Prepawn Long Gun (integer) - total number of background checks requested by people who want to prepawn long guns
-Prepawn Other (integer) - total number of background checks requested by people who want to prepawn firearms other than handguns or long guns
Definition of Redemption Variables: total number of background checks requested on individuals who want to regain possession of a firearm after pledging or pawning as security
-Redemption Handgun (integer) - total number of checks for redemption of handguns
-Redemption Long Gun (integer) - total number of checks for redemption of long guns
-Redemption Other (integer) - total number of checks for redemption of firearms other than handguns or long guns
Definition of Return Variables: total number of background checks requested by criminal justice agencies before returning a firearm to an individual who used to own it
-Returned Handgun (integer) - total number of checks for returns of handguns
-Returned Long Gun (integer) - total number of checks for returns of long guns
-Returned Other (integer) - total number of return checks for firearms other than handguns or long guns
Definition of Rental Variables: total number of background checks requested by an FFL on individuals who want to possess a firearm which has been loaned or rented for use off the premises of the business
-Rentals Handgun (integer) - total number of handgun rental checks
-Rentals Long Gun (integer) - total number of long gun rental checks
Definition of Private Sale Variables: total number of background checks requested by an FFL on an individual who wants to acquire a firearm from a private party seller
-Private Sale Handgun (integer) - total number of private sale checks of handguns
-Private Sale Long Gun (integer) - total number of private sale checks of long guns
-Private Sale Other (integer) - total number of private sale checks for firearms other than handguns and long guns
Definition of Return to Seller Variables: total number of background checks requested by an FFL on individuals who want to return a firearm to a private party seller
-Return to Seller Handgun (integer) - total number of return to seller checks for handguns
-Return to Seller Long Gun (integer) - total number of return to seller checks for long guns
-Return to Seller Other (integer) - total number of return to seller checks for firearms other than handguns and long guns
-Totals (integer) - total number of background checks for a particular state and month
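One way to make the meaning of `totals` concrete is to compare it with the sum of the other count columns. The sketch below does this for the first two rows of the printed output only (Alabama and Alaska, 2022-02), so it shows that the relationship holds for those rows, not that it holds for the whole dataset. The names `purchase_head`, `count_cols`, and `row_sums` are introduced here for illustration:

```r
# First two rows of the printed output (Alabama and Alaska, 2022-02),
# typed in by hand from the listing at the top of this post.
purchase_head <- data.frame(
  month = c("2022-02", "2022-02"),
  state = c("Alabama", "Alaska"),
  permit = c(25401L, 301L),
  permit_recheck = c(499L, 0L),
  handgun = c(21822L, 2644L),
  long_gun = c(14541L, 2178L),
  other = c(1351L, 348L),
  multiple = c(1260L, 202L),
  admin = c(0L, 0L),
  prepawn_handgun = c(13L, 0L),
  prepawn_long_gun = c(5L, 1L),
  prepawn_other = c(0L, 1L),
  redemption_handgun = c(2989L, 127L),
  redemption_long_gun = c(1109L, 74L),
  redemption_other = c(12L, 2L),
  returned_handgun = c(36L, 23L),
  returned_long_gun = c(0L, 9L),
  returned_other = c(0L, 0L),
  rentals_handgun = c(0L, 0L),
  rentals_long_gun = c(0L, 0L),
  private_sale_handgun = c(28L, 2L),
  private_sale_long_gun = c(29L, 4L),
  private_sale_other = c(2L, 0L),
  return_to_seller_handgun = c(1L, 0L),
  return_to_seller_long_gun = c(0L, 0L),
  return_to_seller_other = c(0L, 0L),
  totals = c(69098L, 5916L)
)

# Every column except the identifiers and the total itself.
count_cols <- setdiff(names(purchase_head), c("month", "state", "totals"))

# Sum the count columns row by row and compare with the reported totals.
row_sums <- rowSums(purchase_head[, count_cols])
print(data.frame(state = purchase_head$state,
                 row_sum = row_sums,
                 totals = purchase_head$totals))
```

For these two rows the row sums equal the reported totals (69098 and 5916).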
Now I will perform some basic data-wrangling operations.
#Data set which contains only the information on Alabama:
purchase_Alabama <- purchase %>% filter(state == "Alabama")
print(head(purchase_Alabama))
month state permit permit_recheck handgun long_gun other
1 2022-02 Alabama 25401 499 21822 14541 1351
2 2022-01 Alabama 26820 499 17571 12669 1524
3 2021-12 Alabama 27674 281 30428 26932 1875
4 2021-11 Alabama 24489 229 22126 21230 1319
5 2021-10 Alabama 25822 258 19188 15531 1309
6 2021-09 Alabama 26657 423 18034 15390 1365
multiple admin prepawn_handgun prepawn_long_gun prepawn_other
1 1260 0 13 5 0
2 880 0 14 4 0
3 1498 0 13 10 2
4 1265 0 15 11 2
5 956 0 12 13 3
6 912 0 16 7 0
redemption_handgun redemption_long_gun redemption_other
1 2989 1109 12
2 1822 859 12
3 2346 999 19
4 2162 1202 14
5 2354 967 12
6 2094 887 16
returned_handgun returned_long_gun returned_other rentals_handgun
1 36 0 0 0
2 30 0 0 0
3 37 0 0 0
4 39 0 0 0
5 18 0 0 0
6 35 0 0 0
rentals_long_gun private_sale_handgun private_sale_long_gun
1 0 28 29
2 0 21 24
3 0 37 29
4 0 23 26
5 0 26 23
6 0 27 30
private_sale_other return_to_seller_handgun
1 2 1
2 13 1
3 8 2
4 4 2
5 6 0
6 7 0
return_to_seller_long_gun return_to_seller_other totals
1 0 0 69098
2 1 0 62764
3 1 0 92191
4 0 0 74158
5 1 0 66499
6 0 0 65900
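A quick way to confirm that a filter like the one above did what was intended is to check which states remain afterwards. The sketch below uses only the `state` and `totals` values from the six rows printed at the top of this post. The names `state_totals` and `alabama_only` are illustrative and are not part of the original analysis:

```r
library(dplyr)

# State and totals for the six rows printed at the top of this post.
state_totals <- data.frame(
  state = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado"),
  totals = c(69098L, 5916L, 38149L, 19002L, 106295L, 42694L)
)

# Same filter condition as in the post, applied to the small example.
alabama_only <- state_totals %>% filter(state == "Alabama")

# After filtering, "Alabama" should be the only state left.
print(unique(alabama_only$state))
print(alabama_only)
```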
#Add a column for year to the data set (previously, we only had the month, in the format YYYY-MM):
purchase_Alabama2 <- purchase_Alabama %>% mutate(year = as.numeric(substr(month, 1, 4)))
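To illustrate the year extraction on its own: because `month` is stored as text in the format YYYY-MM, the first four characters are the year. A small sketch on a few month strings taken from the output in this post (`months` and `years` are illustrative names):

```r
# Month strings in the YYYY-MM format used by the data.
months <- c("2022-02", "2021-12", "2020-06")

# Characters 1 to 4 hold the year; convert the text to a number.
years <- as.numeric(substr(months, 1, 4))
print(years)
# [1] 2022 2021 2020
```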
#Create a new data set with one row per year: the month in that year with the greatest number of background checks ("totals"). Then sort the rows in descending order of "totals", so that the first row is the year whose peak month is higher than that of any other year.
purchase_Alabama3 <- purchase_Alabama2 %>%
  group_by(year) %>%
  arrange(desc(totals)) %>%
  slice(1) %>%
  ungroup() %>%
  arrange(desc(totals))
print(head(purchase_Alabama3))
# A tibble: 6 × 28
month state permit permit_recheck handgun long_gun other multiple
<chr> <chr> <int> <int> <int> <int> <int> <int>
1 2020-06 Alabama 64643 1171 47159 20120 2604 1732
2 2015-12 Alabama 31359 NA 47605 33710 1698 1752
3 2021-03 Alabama 42992 690 34675 20702 2048 1599
4 2019-12 Alabama 33683 485 33020 25882 1560 1382
5 2012-12 Alabama 777 NA 30614 42433 777 1673
6 2016-02 Alabama 24746 0 29311 15054 950 1162
# … with 20 more variables: admin <int>, prepawn_handgun <int>,
# prepawn_long_gun <int>, prepawn_other <int>,
# redemption_handgun <int>, redemption_long_gun <int>,
# redemption_other <int>, returned_handgun <int>,
# returned_long_gun <int>, returned_other <int>,
# rentals_handgun <int>, rentals_long_gun <int>,
# private_sale_handgun <int>, private_sale_long_gun <int>, …
As the output above shows, June 2020 was the month with the highest number of total background checks in Alabama across all years in the data.
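The same "peak month per year" step can also be written with dplyr's `slice_max()`, which picks the row with the largest value within each group directly. The sketch below is an alternative formulation, not the code used for the output above, and it runs on only the six Alabama months printed earlier in this post. With so few months, the 2021 peak here is December, whereas the full data gives March 2021. The names `alabama_recent` and `peak_by_year` are illustrative:

```r
library(dplyr)

# Month and totals for the six Alabama rows printed earlier in this post.
alabama_recent <- data.frame(
  month = c("2022-02", "2022-01", "2021-12", "2021-11", "2021-10", "2021-09"),
  totals = c(69098L, 62764L, 92191L, 74158L, 66499L, 65900L)
)

peak_by_year <- alabama_recent %>%
  mutate(year = as.numeric(substr(month, 1, 4))) %>%
  group_by(year) %>%
  # Keep the single row with the highest totals in each year.
  slice_max(totals, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  # Put the year with the highest peak first.
  arrange(desc(totals))

print(peak_by_year)
```

Setting `with_ties = FALSE` keeps exactly one row per year even if two months share the same maximum, which matches the behaviour of `slice(1)` in the original pipeline.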
Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Laidler (2022, April 3). Data Analytics and Computational Social Science: Homework 2. Retrieved from https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomericalaidler884748/
BibTeX citation
@misc{laidler2022homework,
  author = {Laidler, Erica},
  title = {Data Analytics and Computational Social Science: Homework 2},
  url = {https://github.com/DACSS/dacss_course_website/posts/httpsrpubscomericalaidler884748/},
  year = {2022}
}