# Property Segment {#property}
```{r, echo=FALSE}
knitr::opts_chunk$set(
echo = FALSE,
warning = FALSE,
error = FALSE,
message = FALSE
)
```
```{r, results='hide'}
property <- readRDS("data/nibrs_property_segment_2023.rds")
window_property <- readRDS("data/nibrs_window_recovered_property_segment_2021.rds")
nibrs_property_summary <- readRDS("data/nibrs_summary_stats/nibrs_property_summary_stats.rds")
drug_first_year <- readRDS("data/nibrs_summary_stats/drug_first_year.rds")
property_first_year <- readRDS("data/nibrs_summary_stats/property_first_year.rds")
gc()
```
The Property Segment provides a bit more information than would be expected from the name. For each item involved in the crime it tells you what category that item falls into, with 68 total categories (including "other") ranging from explosives and pets to money and alcohol. It also tells you the estimated value of that item. This data covers more than just items stolen during a crime. For each item it tells you what happened to that item, such as whether it was stolen, damaged, seized by police (such as illegal items like drugs), recovered (from being stolen) by police, or burned during an arson.
For drug offenses it includes the drugs seized by police. For these offenses, the data tells us the type of drug, with 16 different drug categories ranging from specific ones like marijuana or heroin to broader categories such as "other narcotics". There can be up to three different drugs included in this data. If the person has more than three types of drugs seized then the third drug category will simply indicate that there are more than three drugs, so in these cases we learn what the first two drugs are but not what the third or later drugs are. For each drug we also know how much was seized, with one variable giving the amount the police found and another giving the units we should be reading that amount in (e.g. pills, grams, plants).
The Window Property Segment has the same variables as the normal Property Segment but also has 10 variables on each of the offenses committed (up to 10 offenses) during the incident. This is really to try to provide a bit of information that you’d otherwise get from the other segments but don’t since this is a window segment. For the rest of this chapter I’ll be using examples from the Property Segment and not the Window Property.
## Type of property loss {#propertyLoss}
This segment contains information on all property involved in the incident, other than the weapon used by the offender. Property can be involved in multiple ways - such as being stolen, destroyed, or recovered by the police after being stolen - so NIBRS breaks this information into seven different categories (eight when including "unknown" type). Figure \@ref(fig:propertyTypeLoss) shows each of these categories and how often they occur. The most common category, at 60.7% of the rows in this segment, is when the item is taken from the victim by the offender. This includes when the offender stole the item, took it by force in a robbery, tricked the victim (e.g. offender committed fraud), and any other way to illegally get the item (e.g. extortion, ransom, bribery). Though it includes all these ways of illegally taking the item, it is often just called "stolen" property and I will refer to it as such in this chapter for simplicity of writing.
The next most common group, at 13.3%, is when the item was seized by the police. This excludes items that were stolen and that the police recovered. It includes all types of property that the offender illegally had but is primarily for drug crimes where the drug or drug equipment was seized. This segment also includes property damaged, destroyed, or vandalized by the offender and this type makes up 11.2% of the data. Following this, about 9% of rows are for recovered property, which is when one of the previously stolen items is recovered by the police.
Next is "none" which only means that no property was stolen or damaged but that it could have been. For example, if someone tries to grab a person's purse but fails, that would be considered "none" since the purse was not actually taken. The remaining types are when the item was a counterfeit/forgery, at 1.2%, was burned (such as during an arson) at 0.2%, or when the type of loss is unknown at 0.2%.
```{r propertyTypeLoss, fig.cap = "The type of loss or if the item is recovered, 2023."}
property$type_of_property_loss_temp <- capitalize_words(property$type_of_property_loss)
property$type_of_property_loss_temp <- gsub(" \\(.*", "", property$type_of_property_loss_temp, ignore.case = TRUE)
property$type_of_property_loss_temp[property$type_of_property_loss_temp == "Stolen/Etc."] <- "Stolen/Robbed/Etc."
make_barplots(property, "type_of_property_loss_temp", count = FALSE, ylab = "% of Property")
```
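As a rough illustration of how shares like the ones above can be computed, the sketch below tabulates a small made-up data frame. The `toy_property` object, its rows, and the shortened category labels are invented for this example; only the `type_of_property_loss` column name comes from the actual data.

```r
# Toy data standing in for the Property Segment (rows and labels are invented)
toy_property <- data.frame(
  type_of_property_loss = c("stolen", "stolen", "stolen", "seized",
                            "recovered", "destroyed/damaged/vandalized",
                            "none", "stolen", "seized", "stolen")
)

# Count rows per loss type, convert to percentages, and sort from most to least common
loss_counts  <- table(toy_property$type_of_property_loss)
loss_percent <- sort(round(100 * prop.table(loss_counts), 1), decreasing = TRUE)
loss_percent
```

Note that the unit here is a row in the segment (one property category in one incident), not an incident, so an incident with several categories of property contributes several rows.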
## Description of property
There are 68 different categories of properties (including the catch-all "other" category for anything not explicitly in a different category) that NIBRS covers. This also includes "identity - intangible" which means that someone stole the victim's identity. Identity is not property since it is not a physical thing but is nonetheless included here (items related to one's identity such as social security cards are included in a different category called "identity documents"). Each of these categories can appear in any of the seven different types of property loss discussed above. For simplicity of writing, and because Table \@ref(tab:propertyStolenDescription) covers items stolen, I will be talking below about items lost by being stolen, though everything does apply to other types of loss too.
The property included in NIBRS can move from very broad categories like "merchandise" to more specific items such as livestock, aircraft, and pets. The property categories are mutually exclusive so a single item cannot be counted in different categories. If, for example, a laptop is stolen that could potentially go in "computer hardware/software" or "office-type equipment" but it would not be in both; it would only be recorded in one of the two. In cases where multiple items of the same type are stolen - such as someone stealing multiple laptops in a single crime - we do not actually know how many items were stolen, just that at least one item of that category was stolen in the incident. We then know the total estimated value of the items stolen in that incident, which we can use to estimate the number of items stolen (as long as we know the average value per item in that category), though this would be a very imprecise measure. The exception to this is for stolen vehicles where we know exactly how many were stolen and how many were recovered.
Table \@ref(tab:propertyStolenDescription) shows each of the 68 different types of property in NIBRS and shows the number of incidents where they were stolen for all incidents in the 2023 NIBRS data. Multiple different categories of property can be stolen in a single incident. The most common type of property stolen, at 14.3% of all property, is "other" which is not really that helpful to us. We know it is not one of the other 67 categories but not exactly what was stolen. Next are money at 11.8% and then purses/handbags/wallets at 5.6%. This makes a lot of sense. People steal things for financial gain and the easiest way to get that gain is stealing money directly (or wallets and purses which often have money inside). Stealing property that you then have to sell or trade to get what you want (money or other property) is a lot harder, which is likely why it is less common.
There are about 4.8 million rows in the Property Segment where the item was stolen (the other 3.1 million rows are about property that was seized by police, recovered, damaged, or one of the other types of property loss detailed in Section \@ref(propertyLoss)). As such, even items stolen very rarely can occur thousands or tens of thousands of times. For example, pets were the property stolen about 0.14% of the time and that tiny percent still accounts for 6,821 incidents with a pet stolen. Given the huge number of rows in this data - and as more agencies report to NIBRS this will grow quickly - it can be possible to study specific types of property, such as pets stolen, that would not be possible with a narrower dataset (both in terms of how specific the items they include are, and the size of the data).
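To make the "estimate the number of items from the total value" idea concrete, here is a minimal sketch of the arithmetic. Both numbers are hypothetical and not taken from NIBRS, and the average value per item is exactly the piece of outside information you would have to supply yourself.

```r
# Hypothetical numbers, not taken from NIBRS
total_value_stolen <- 3600  # total reported value for one property category in one incident
avg_value_per_item <- 900   # assumed average value of a single item in that category

# Rough estimate of the number of items; a row always implies at least one item
estimated_items <- max(1, round(total_value_stolen / avg_value_per_item))
estimated_items
```

Because the real value of any one item can be far from the assumed average, an estimate like this is best treated as a ballpark figure rather than a count.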
```{r }
temp_seized <- make_frequency_table(property[property$type_of_property_loss %in% "seized", ],
"property_description",
c("Property",
"# of Property Seized",
"% of Property Seized"))
temp_recovered <- make_frequency_table(property[property$type_of_property_loss %in% "recovered", ],
"property_description",
c("Property",
"# of Property Recovered",
"% of Property Recovered"))
temp <- make_frequency_table(property[property$type_of_property_loss %in% "stolen (includes bribed, defrauded, embezzled, extorted, ransomed, robbed, etc.)", ],
"property_description",
c("Property",
"# of Property Stolen",
"% of Property Stolen")) %>%
left_join(property_first_year %>%
mutate(property_description = capitalize_words(property_description)) %>%
rename(`Property` = property_description)) %>%
select(`Property`,
`First Year` = year,
everything()) %>%
left_join(temp_seized) %>%
left_join(temp_recovered)
temp$`First Year`[is.na(temp$`First Year`)] <- "-"
temp$`# of Property Stolen` <- paste0(temp$`# of Property Stolen`, " (",
temp$`% of Property Stolen`, ")")
temp$`% of Property Stolen` <- NULL
temp$`# of Property Seized` <- paste0(temp$`# of Property Seized`, " (",
temp$`% of Property Seized`, ")")
temp$`% of Property Seized` <- NULL
temp$`# of Property Recovered` <- paste0(temp$`# of Property Recovered`, " (",
temp$`% of Property Recovered`, ")")
temp$`% of Property Recovered` <- NULL
temp <-
temp %>%
rename(`# (%) of Property Stolen` = `# of Property Stolen`,
`# (%) of Property Seized` = `# of Property Seized`,
`# (%) of Property Recovered` = `# of Property Recovered`)
kableExtra::kbl(temp,
#format = "html",
digits = 2,
align = c("l", "l", "r", "r", "r"),
#booktabs = TRUE,
longtable = TRUE,
label = "propertyStolenDescription",
escape = TRUE,
caption = "The number and percent of property stolen (including forcibly taken such as during a robbery) and property seized by police (excludes recovering property that was stolen), and recovered by police, 2023. Each incident can have multiple items stolen.") %>%
kable_styling(bootstrap_options = "striped", full_width = FALSE, latex_options = c("hold_position", "repeat_header"))
```
As seen in Table \@ref(tab:propertyStolenDescription), all of the 68 different types of property available in NIBRS can be seized by police and, in 2023, each was seized during an incident at least once. This includes atypical property like building material, musical instruments, and pets (and leads me to think that at least some of this is incorrectly labeled as property seized by the police). The vast majority of property seized by police, however, is drugs. 63.2% of the property seized were drugs themselves while 27.3% were equipment related to the drugs. The remaining items were mostly "other" (i.e. anything not explicitly categorized here) at 2.6%, money at 2%, firearms at 0.9%, and then a bunch of very rarely seized items.
## Value of stolen property
For all types of property loss other than the property being seized by the police (and when the type is "unknown") there is data on the estimated value of that property. This is estimated by the police but is supposed to be the current value of the item (e.g. a stolen car is what it currently sells for, not what the buyer bought it for) and is the cost it will take for the victim to replace the item. To be a bit more specific, this variable is the sum of the values of the items stolen in this category. For example, if someone burglarizes a house and steals three rings from the victim, this would not count as three thefts of a ring. It would be recorded as loss of jewelry and the value would be the total value of all three rings.
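The three rings example can be sketched as a simple aggregation. Everything in `items` is invented, including the dollar values and the shortened category names; the point is only that the segment ends up with one row per property category carrying the summed value.

```r
# Invented example: one burglary where three rings and a laptop are stolen
items <- data.frame(
  item     = c("ring A", "ring B", "ring C", "laptop"),
  category = c("jewelry", "jewelry", "jewelry", "computer hardware/software"),
  value    = c(500, 1200, 300, 800)
)

# NIBRS-style record: one row per category with the total value for that category
segment_rows <- aggregate(value ~ category, data = items, FUN = sum)
segment_rows
```

The number of rings is lost in this step, which is why the data cannot tell us how many items of a category were taken.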
The police can take the victim's estimation into consideration but are not supposed to immediately accept it as victims may exaggerate values (especially for insurance purposes). When items are recovered, the police put in the value at the time of recovery which may be substantially different than at the time of the loss if the item is damaged or destroyed.
We can use this variable to look at the value of items lost by the victim. Figure \@ref(fig:propertyValuePropertyLevel) looks at the value per item stolen in a crime (if incidents have multiple items stolen, this counts them all separately). This includes items lost by theft, robbery, and burglary so is part of the "Stolen/Robbed/Etc." category of types of property loss. It excludes items damaged or destroyed or burned (there is information about the value of property in these incidents but these are not shown in the figure). To make this graph a bit simpler I have rounded all values to the nearest \$100 so items valued at \$0 mean that they are worth between \$1 and \$50.
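The rounding and capping steps used for this figure can be sketched on a few invented dollar values. I use base R here instead of `plyr::round_any()`; for a rounding unit of 100 the two should give the same result, since `round_any()` divides by the unit, rounds, and multiplies back.

```r
# Invented dollar values
value <- c(20, 49, 51, 149, 151, 2500000)

# Cap extreme values at $1,000,000, as the figure does
value_capped <- pmin(value, 1000000)

# Round to the nearest $100 (base R stand-in for plyr::round_any(value_capped, 100))
value_100 <- round(value_capped / 100) * 100
value_100
```

In this sketch the \$20 and \$49 items both land in the \$0 bin, which is how low-value items end up shown as \$0 in the figure.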
```{r propertyValuePropertyLevel, fig.cap = "The distribution of the value of property stolen. Each value is rounded to the nearest 100. The x-axis is set on the log scale as this distribution is hugely skewed. Values over 1,000,000 are set to 1,000,000."}
temp <-
property %>%
filter(type_of_property_loss %in% "stolen (includes bribed, defrauded, embezzled, extorted, ransomed, robbed, etc.)") %>%
mutate(value_of_property = parse_number(value_of_property)) %>%
filter(!is.na(value_of_property))
temp$value_of_property[temp$value_of_property > 1000000] <- 1000000
temp <-
temp %>%
mutate(value_of_property_100 = plyr::round_any(value_of_property, 100)) %>%
group_by(value_of_property_100) %>%
tally()
temp$proportion <- temp$n / sum(temp$n)
ggplot(temp, aes(x = value_of_property_100, y = proportion)) +
geom_line(linewidth = 1.02) +
scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, NA)) +
scale_x_log10(labels = scales::comma) +
xlab("Property Value ($)") +
ylab("% of Property") +
theme_crim()
```
```{r}
nibrs_property_summary %>%
select(year,
mean_value_stolen,
median_value_stolen,
max_value_stolen,
mean_value_recovered,
median_value_recovered,
max_value_recovered) %>%
mutate_if(is.numeric, round, 0) %>%
mutate_if(is.numeric, prettyNum, big.mark = ",") %>%
mutate(year = gsub(",", "", year)) %>%
rename(Year = year,
`Mean Value Stolen` = mean_value_stolen,
`Median Value Stolen` = median_value_stolen,
`Max Value Stolen` = max_value_stolen,
`Mean Value Recovered` = mean_value_recovered,
`Median Value Recovered` = median_value_recovered,
`Max Value Recovered` = max_value_recovered,
) %>%
kableExtra::kbl(# format = "html",
digits = 2,
align = c("l", "r", "r"),
#booktabs = TRUE,
longtable = TRUE,
escape = TRUE,
label = "nibrsPropertyStolenValue",
caption = "Annual mean, median, and maximum values (in dollars) of property stolen and recovered, 1991-2023.") %>%
kable_styling(bootstrap_options = "striped", full_width = FALSE, latex_options = c("hold_position", "repeat_header"))
```
```{r nibrsPropertyMaxValuePercent, fig.cap="Annual percent of the value of all property stolen that is made up of the value that is the maximum dollar amount reported in that year, 1991-2023."}
nibrs_property_summary %>%
ggplot(aes(x = year, y = percent_stolen_max_value)) +
geom_line(linewidth = 1.05) +
xlab("Year") +
ylab("% of Incidents") +
theme_crim() +
scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, NA)) +
labs(color = "") +
expand_limits(y = 0) +
scale_x_continuous(breaks = time_series_x_axis_year_breaks)
```
## Date property was recovered
This segment tells us both when the incident happened and, for stolen property, when the item was recovered. We can use this to look at how long it generally takes for property to be recovered (though most property stolen is never recovered). Figure \@ref(fig:propertyDaysUntilRecovered) shows the number of days it takes for property to be recovered. Though this data gives us the exact date, allowing for the precise number of days from property loss to recovery, this graph groups days greater than nine days to simplify the graph. The maximum number of days in the 2022 NIBRS data is 450 days so showing all days would be a rather unhelpful graph.
The majority - 60.8% - of property lost is recovered on the same day, which is shown as zero days. We saw in Figure \@ref(fig:arrestsDaysUntilArrest) that the vast majority of arrests happen on the same day as the incident so it makes sense that most property would be recovered then too.^[I would expect most property to be recovered on the arrestee's body.] A smaller and smaller share of property is recovered as the number of days from the incident increases, a trend also found in the time to arrest graph. The lesson here seems to be that if you are a victim of a crime and had something taken, unless it is recovered very quickly it is unlikely to be recovered at all.
```{r propertyDaysUntilRecovered, fig.cap = "The distribution of the number of days from the incident to the property recovered date. Zero days means that the property was recovered on the same day as the incident."}
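The days-until-recovery measure and its grouping can be sketched with a handful of invented dates. The column names match the ones used for the figure, but the rows are made up and I use `cut()` here as one possible way to do the binning.

```r
# Invented incident and recovery dates
toy <- data.frame(
  incident_date  = as.Date(c("2023-01-01", "2023-01-01", "2023-03-10", "2023-05-01")),
  date_recovered = as.Date(c("2023-01-01", "2023-01-04", "2023-04-15", "2023-09-30"))
)

# Days from the incident to the recovery; 0 means a same-day recovery
toy$days_until_recovery <- as.numeric(toy$date_recovered - toy$incident_date)

# Keep 0-9 days as their own groups and bin the longer waits;
# negative differences (recovery dated before the incident) become NA
toy$days_group <- cut(toy$days_until_recovery,
                      breaks = c(-1, 0:9, 30, 90, Inf),
                      labels = c(0:9, "10-30", "31-90", "91+"))
toy[, c("days_until_recovery", "days_group")]
```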
property$days_until_recovery <- as.numeric(ymd(property$date_recovered) - ymd(property$incident_date))
max_days <- max(property$days_until_recovery, na.rm = TRUE)
temp <-
property %>%
filter(!is.na(date_recovered)) %>%
mutate(days_until_recovery = as.numeric(days_until_recovery)) %>%
filter(days_until_recovery >= 0)
temp$days_until_recovery[temp$days_until_recovery %in% 10:30] <- "10-30"
temp$days_until_recovery[temp$days_until_recovery %in% 31:90] <- "31-90"
temp$days_until_recovery[temp$days_until_recovery %in% 91:max_days] <- "91+"
temp <-
temp %>%
group_by(days_until_recovery) %>%
count()
temp$days_until_recovery <- factor(temp$days_until_recovery,
rev(stringr::str_sort(temp$days_until_recovery, numeric = TRUE)))
temp$proportion <- temp$n / sum(temp$n)
ggplot(temp, aes(x = days_until_recovery, y = proportion)) +
geom_col() +
coord_flip() +
ggplot2::scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, NA)) +
xlab("Days Until Recovery") +
ylab("% of Recoveries") +
theme_crim()
```
## Drugs
This segment also provides information about drugs seized by the police. This also includes cases where the police would have seized the drugs if the offender had not destroyed them somehow, such as flushing them down the toilet. For each drug seized there is information on what type of drug it was and how much of the drug was seized. There can be up to three drugs seized in an incident and data is available for each type of drug. The exception, however, is when there are more than three drugs seized in an incident. In that case, information on the third drug just says that there were more than three drugs involved. So you would have information on the first two drugs but none on the third through however many (and it does not say how many) drugs. For the below table and figure I only look at the first drug seized, so the numbers shown are undercounts.
The ordering of drugs when there are multiple drugs is based on the amount of drugs recovered and the seriousness of the drugs. For example, heroin is probably considered more serious than marijuana, but overall ranking of drugs is a subjective assessment depending on the department. Is heroin more serious than meth? That decision likely varies by the agency and their situation in regard to what drugs they often seize.
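The "more than three drugs" rule can be sketched as follows. This is only an illustration under stated assumptions: the rows are invented, the columns `suspected_drug_type_2` and `suspected_drug_type_3` are assumed by analogy with `suspected_drug_type_1`, and the label `"over 3 drug types"` is a placeholder for whatever text the data actually uses.

```r
# Invented rows; the label "over 3 drug types" is an assumed placeholder
toy_drugs <- data.frame(
  suspected_drug_type_1 = c("marijuana", "heroin", "cocaine"),
  suspected_drug_type_2 = c(NA, "marijuana", "heroin"),
  suspected_drug_type_3 = c(NA, NA, "over 3 drug types"),
  stringsAsFactors = FALSE
)
drug_cols <- c("suspected_drug_type_1", "suspected_drug_type_2", "suspected_drug_type_3")

# Flag incidents where the third slot only signals "more than three drugs"
toy_drugs$more_than_three <- toy_drugs$suspected_drug_type_3 %in% "over 3 drug types"

# Number of drug types actually identified (the placeholder does not count)
identified <- !is.na(toy_drugs[, drug_cols]) &
  toy_drugs[, drug_cols] != "over 3 drug types"
toy_drugs$n_identified <- rowSums(identified)
toy_drugs
```

For the flagged incident we can name only two drugs, even though at least four were involved.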
### Suspected drug type
The drugs in NIBRS are the "suspected drug types" which means that they are what the police believe the drugs to be, even if there is not definitive proof (such as a crime lab testing for what type of drug it is) that the drug is what they say it is. There are 16 possible drug types in NIBRS (17 when including "unknown type drug") and each is shown in Table \@ref(tab:propertyDrugs) along with how often they occur in the data. Some of these drug types are specific enough to only include a single drug while others are groups of drug types, such as hallucinogens or stimulants (and they include all drugs of that type other than the ones that have their own category).
+ Amphetamines/Methamphetamines
+ Barbiturates
+ Cocaine (All Forms Except Crack)
+ Crack Cocaine
+ Hashish
+ Heroin
+ LSD
+ Marijuana
+ Morphine
+ Opium
+ Other Depressants: Glutethimide Or Doriden, Methaqualone Or Quaalude, Pentazocine Or Talwin, Etc.
+ Other Drugs: Antidepressants (Elavil, Triavil, Tofranil, Etc.), Aromatic Hydrocarbons, Propoxyphene Or Darvon, Tranquilizers (Chlordiazepoxide Or Librium, Diazepam Or Valium, Etc.), Etc.
+ Other Hallucinogens: BMDA (White Acid), DMT, MDA, MDMA, Mescaline Or Peyote, Psilocybin, STP, Etc.
+ Other Narcotics: Codeine, Demerol, Dihydromorphinone Or Dilaudid, Hydrocodone Or Percodan, Methadone, Etc.
+ Other Stimulants: Adipex, Fastine And Ionamin (Derivatives of Phentermine), Benzedrine, Didrex, Methylphenidate Or Ritalin, Phenmetrazine Or Preludin, Tenuate, Etc.
+ PCP
+ Unknown Type Drug
Not too surprisingly, marijuana is the most common drug seized at 47% of the data, or 455k incidents with it seized. This is followed by amphetamines/methamphetamines (including what we'd normally just call meth) at 20.7% and then heroin at 8.5%. Interestingly, cocaine and crack cocaine (which are always separate categories) both occur in 5.09% of drugs seized. Given the large disparity in sentences for these types of drugs, and that "crack wars" were a major purported cause of violent crime in the 1980s and 1990s, I expected crack cocaine to be much more common than normal cocaine. The remaining drug types are each less than 5% of drugs seized and include some groupings of drug types (e.g. stimulants) rather than specific drug types (though some of these categories, such as LSD, are specific drugs).
```{r }
temp <- make_frequency_table(property,
"suspected_drug_type_1",
c("Drug Type",
"\\# of Drugs",
"\\% of Drugs")) %>%
left_join(drug_first_year %>%
filter(!is.na(suspected_drug_type_1)) %>%
mutate(suspected_drug_type_1 = capitalize_words(suspected_drug_type_1)) %>%
rename(`Drug Type` = suspected_drug_type_1)) %>%
select(`Drug Type`,
`First Year` = year,
everything())
temp$`First Year`[is.na(temp$`First Year`)] <- "-"
temp$`Drug Type` <- gsub("lsd", "LSD", temp$`Drug Type`, ignore.case = TRUE)
temp$`Drug Type` <- gsub("bmda", "BMDA", temp$`Drug Type`, ignore.case = TRUE)
temp$`Drug Type` <- gsub("dmt", "DMT", temp$`Drug Type`, ignore.case = TRUE)
temp$`Drug Type` <- gsub("mda", "MDA", temp$`Drug Type`, ignore.case = TRUE)
temp$`Drug Type` <- gsub("mdma", "MDMA", temp$`Drug Type`, ignore.case = TRUE)
temp$`Drug Type` <- gsub("stp", "STP", temp$`Drug Type`, ignore.case = TRUE)
temp$`Drug Type` <- gsub("pcp", "PCP", temp$`Drug Type`, ignore.case = TRUE)
temp$`Drug Type` <- gsub(":.*", "", temp$`Drug Type`)
kableExtra::kbl(temp,
# format = "html",
digits = 2,
align = c("l", "r", "r"),
#booktabs = TRUE,
longtable = TRUE,
label = "propertyDrugs",
escape = TRUE,
caption = "The number and percent of drugs seized by police by type of drug, for all drugs seized in 2023.") %>%
kable_styling(bootstrap_options = "striped", full_width = FALSE, latex_options = c("hold_position", "repeat_header"))
```
```{r nibrsPropertyDrugType, fig.cap="Annual percent of drug seizures by drug type, for the 1st drug reported, 1991-2023."}
nibrs_property_summary %>%
ggplot(aes(x = year)) +
geom_line(aes(y = percent_marijuana_hashish, color = "Marijuana/Hashish"), linewidth = 1.05) +
geom_line(aes(y = percent_cocaine_crack, color = "Cocaine/Crack"), linewidth = 1.05) +
geom_line(aes(y = percent_meth , color = "Methamphetamine"), linewidth = 1.05) +
geom_line(aes(y = percent_heroin_opium , color = "Heroin/Opium"), linewidth = 1.05) +
geom_line(aes(y = percent_other_drug , color = "Other Drug"), linewidth = 1.05) +
geom_line(aes(y = percent_unknown_drug , color = "Unknown Drug"), linewidth = 1.05) +
xlab("Year") +
ylab("% of Incidents") +
theme_crim() +
scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, NA)) +
labs(color = "") +
expand_limits(y = 0) +
scale_color_manual(values = c("Marijuana/Hashish" = "black",
"Cocaine/Crack" = "#d7191c",
"Methamphetamine" = "#fdae61",
"Heroin/Opium" = "#7570b3",
"Other Drug" = "#abd9e9",
"Unknown Drug" = "#2c7bb6"))
```
### Amount of drugs
For each drug type we know exactly how much was seized (at least we do other than for the 6.7% where the amount is "not reported"). Since different drug types are measured in different ways, this data also tells us what units the amount seized is in. So you'll need to look at both the amount and the units to understand what quantity of drugs was actually seized. The possible units are listed below:
* Dosage Unit/Items (Pills, Etc.)
* Fluid Ounce
* Gallon
* Gram
* Kilogram
* Liter
* Milliliter
* Number of Plants
* Ounce
* Pound
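Since the same drug can be reported in several of these units, comparing amounts requires putting them on a common scale. The sketch below converts the weight-based units to grams. The rows are invented, and the lowercase unit labels other than `"gram"` are my assumption about how the values are written in the data.

```r
# Invented seizures recorded in different units
toy_seized <- data.frame(
  estimated_quantity_1  = c(14, 2, 1, 0.5),
  type_of_measurement_1 = c("gram", "ounce", "pound", "kilogram"),
  stringsAsFactors = FALSE
)

# Grams per unit for the weight-based units only
grams_per_unit <- c(gram = 1, ounce = 28.3495, pound = 453.592, kilogram = 1000)

# Units with no weight equivalent (e.g. plants, pills) are not in the lookup and would become NA
toy_seized$grams <- unname(
  toy_seized$estimated_quantity_1 * grams_per_unit[toy_seized$type_of_measurement_1]
)
toy_seized
```

Volume units and counts (liters, dosage units, plants) cannot be converted to grams without extra assumptions, so they are best analyzed separately.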
Once you know the units you can look at the amount of drugs seized. The amount is specific up to the thousandths place though the integer and the numbers after the decimal point are actually in different columns in the data. For example, if police seized 1.257 grams of heroin, the 1 gram and the 0.257 grams would be in separate columns. As an example, we will look at Figure \@ref(fig:propertyMarijuanaGramMeasures) which shows the number of grams seized for marijuana. This is only looking at the column with the amount in integers, so parts of a gram are excluded (but are available in the data). This also excludes any case where the marijuana seized was measured in a unit other than gram, such as number of plants, ounces, or pounds. Even though the data shows the number of grams as actual integers, not grouped at all, I do group the larger values together to make the graph simpler.
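Putting the two columns back together is simple arithmetic. In this sketch the column names `quantity_whole` and `quantity_thousandth` are hypothetical, and I am assuming the fractional part is stored as a count of thousandths (so 0.257 is stored as 257); check how your copy of the data stores it before relying on this.

```r
# Hypothetical column names and storage format:
# whole units in one column, thousandths of a unit (0-999) in another
toy_amount <- data.frame(
  quantity_whole      = c(1, 0, 12),
  quantity_thousandth = c(257, 500, 0)
)

# Combine into a single amount, e.g. 1 and 257 become 1.257
toy_amount$quantity_total <- toy_amount$quantity_whole +
  toy_amount$quantity_thousandth / 1000
toy_amount
```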
So with those caveats, we can see what amounts of marijuana, measured in grams, are most frequently seized. Generally, the amount of marijuana seized is small, with 22.5% being 1-2 grams (since we do not include the parts of a gram we can only say that it is 1 to 1.999 grams) and 18.6% being less than one gram. As the amount of drugs increases, the percent of seizures that involve that amount decreases. It's a bit curious that they include grams for values over 28 since 28.3495 grams is one ounce, so it would make sense to just start reporting in units of ounces instead of increasingly large numbers of grams.
```{r propertyMarijuanaGramMeasures, fig.cap = "For marijuana seizures that are measured in grams, this figure shows the distribution in the number of grams seized. Values of 10 grams or more are grouped together for easier interpretation of lower values of drugs seized."}
temp <-
property %>%
filter(!is.na(estimated_quantity_1),
type_of_measurement_1 %in% "gram",
suspected_drug_type_1 %in% "marijuana") %>%
mutate(estimated_quantity_1 = plyr::round_any(estimated_quantity_1, 1))
# Bin the larger amounts. The first assignment turns the column into character;
# %in% still works afterwards because it compares the values as strings.
temp$estimated_quantity_1[temp$estimated_quantity_1 > 100] <- "101+"
temp$estimated_quantity_1[temp$estimated_quantity_1 %in% 0] <- "<1"
temp$estimated_quantity_1[temp$estimated_quantity_1 %in% 10:25] <- "10-25"
temp$estimated_quantity_1[temp$estimated_quantity_1 %in% 26:50] <- "26-50"
temp$estimated_quantity_1[temp$estimated_quantity_1 %in% 51:100] <- "51-100"
temp <-
temp %>%
group_by(estimated_quantity_1) %>%
count()
temp$estimated_quantity_1 <- factor(temp$estimated_quantity_1, rev(c("<1", 1:9, "10-25", "26-50", "51-100", "101+")))
temp$proportion <- temp$n / sum(temp$n)
ggplot(temp, aes(x = estimated_quantity_1, y = proportion)) +
geom_col() +
coord_flip() +
ggplot2::scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, NA)) +
ylab("% of Drugs") +
xlab("# of Grams") +
theme_crim()
```