-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathapply-models.Rmd
305 lines (239 loc) · 12.6 KB
/
apply-models.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
---
title: "Apply Models"
subtitle: "Generate spatial predictions across a landscape"
author: "Hong Jhun Sim"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Apply Models}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
This article outlines the steps to predict the number of species (species richness) for a particular animal group, across a target landscape. The pixel-based predictions are then visualised in the form of a spatial heatmap.
<br>
```{r out.width = "70%", fig.align='center', dpi = 300, echo = FALSE}
knitr::include_graphics("framework_apply-models.png", dpi = 300)
```
<center><b> Figure: Broad overview of the data workflow for a chosen animal group </b></center>
<br>
Begin by loading the required packages to run the analysis:
```{r load required libraries, eval = FALSE, warning = FALSE, message = FALSE}
library("biodivercity")
library("dplyr") # to process/wrangle data
library("tmap") # for visualisation
```
```{r load libraries while in dev, include = FALSE}
devtools::load_all()
library("dplyr")
library("tmap")
```
The `biodivercity` package contains predictive models (`lme4::glmer()`) and data pre-processing workflow 'recipes' (`recipes::recipe()`) that were built for each of four animal groups surveyed in Singapore. Refer to `vignette("build-models")` for more details on these models, which were built using the example data in this package. In this article, the model/recipe used to predict the number of bird species will be used as an example. Alternatively, the user may provide their own model/recipe for the analysis.
Load the model/recipe objects built exclusively from manually-generated landscape data:
```{r load model objects}
filepath <- system.file("extdata", "models-manually-mapped.Rdata", package = "biodivercity")
load(filepath)
```
Landscape data within the target area of interest will be used to make spatial predictions. They will be summarised according to the predictor variables present in the models. For example, predictor variables in the bird model can be extracted as follows. Note that the prefix `r<value>m` denotes the buffer radius that the particular landscape data was summarised within, and `_man_` denotes that the variables were generated from manually-generated data.
```{r}
predictors_birds <- models_birds %>% # a list object
lapply(function(x) names(x@frame)) %>% # extract variable names in model
unlist() %>%
unique() %>%
stringr::str_subset("(?<=^r)\\d+.*") # predictor variables start with the "r<value>m_"
predictors_birds
```
The following list describes the manually-generated landscape components used to build models provided in this package, their respective vector format, as well as all the possible predictor variables that may be summarised:
- Natural vegetation (polygons)
- Percentage of landscape area (`natveg_pland`)
- Trees (points) with species name (per point)
- Species richness (`tree_sprich`)
- Shrubs (polygons) with species name (per polygon)
- Percentage of landscape area (`shrub_pland`)
- Species richness (`shrub_sprich`)
- Turf (polygons)
- Percentage of landscape area (`turf_pland`)
- Water (polygons)
- Percentage of landscape area (`water_pland`)
- Buildings (polygons) each with the number of levels
- Floor area ratio (`buildingFA_ratio`)
- Average number of levels (`buildingAvgLvl`)
- Roads (lines) each with the number of lanes
- Lane density (`laneDensity`)
<br>
To demonstrate how spatial predictions may be generated, load the example landscape data from within the Punggol (PG) area in Singapore (Chong et al., 2014, 2019), visualised in the interactive map below. Note that there are no water bodies mapped in the target area.
```{r load landscape data}
filepath <- system.file("extdata", "pg_layers.Rdata", package = "biodivercity")
load(filepath)
```
```{r plot interative map for landscape data, echo = FALSE, warning = FALSE, message = FALSE, fig.width=2.6, fig.height = 2.0, dpi = 300, out.width="100%"}
tmap_mode("view")
tmap_options(check.and.fix = TRUE)
tm_basemap(c("CartoDB.Positron")) +
tm_shape(bound) +
tm_borders() +
tm_shape(natveg) +
tm_polygons(title = "Natural vegetation",
group = "natveg",
col = "darkgreen",
alpha = 0.6,
border.col = "transparent") +
tm_shape(trees %>% relocate(species)) +
tm_dots(title = "Trees",
group = "trees",
col = "brown",
size = 0.001,
border.col = "transparent") +
tm_shape(shrubs %>% relocate(species) %>% st_make_valid()) +
tm_polygons(title = "Shrubs",
group = "shrubs",
col = "#ffd92f",
alpha = 0.6,
border.col = "transparent") +
tm_shape(turf) +
tm_polygons(title = "Turf",
group = "turf",
col = "#a6d854",
alpha = 0.6,
border.col = "transparent") +
# tm_shape(water) +
# tm_polygons(title = "Water",
# group = "water",
# col = "blue",
# alpha = 0.6,
# border.col = "transparent") +
tm_shape(buildings) +
tm_polygons(title = "Building levels",
group = "buildings",
col = "levels",
palette = viridis::magma(5),
style = "jenks",
border.col = "transparent") +
tm_shape(roads) +
tm_lines(title.col = "Road lanes",
group = "roads",
col = "lanes",
palette = viridis::magma(5),
style = "fixed",
breaks = c(2, 4, 6, 8))
```
<br>
## Spatial predictions
To make spatial predictions, the target area is broken up into many smaller points (pixels), where landscape data will be summarised and predictions will be made. First, the function `generate_grid()` will be used to generate a grid over the target area (see Figure 1). The pixel resolution of the grid can be specified with the argument `pixelsize_m`.
```{r out.width = "70%", fig.align='center', dpi = 300, echo = FALSE, fig.cap = "Figure 1: Visualisation of the function `generate_grid()`."}
knitr::include_graphics("generate_grid.jpg", dpi = 300)
```
An optional argument `innerbuffer_m` is also provided. This limits output predictions to a smaller area within the target landscape (Figure 1). This is useful, for example, when the model requires broad-scale landscape data to make predictions, but landscape data that is available do not extend beyond the target area. Hence, to avoid inaccurate predictions that result from areas without landscape data, the distance value for this argument should correspond to the largest buffer radius present in the model variables. In this example, the largest radius for variables within the bird models is 126 metres:
```{r}
# get the max radius among all predictor variables
max_radius <- predictors_birds %>%
stringr::str_extract("(?<=^r)\\d+") %>% # extract values for buffer radii
as.numeric() %>%
max(na.rm = TRUE)
max_radius
```
With the `max_radius` defined, run the function `generate_grid()`. The output is a dataframe of points (see Figure 1), where landscape variables will be summarised within their respective distance buffers. In this example, we will generate a grid with a pixel resolution of 50 metres.
```{r create grid for each UGS town}
grid_points <- generate_grid(target_areas = bound, # target area
innerbuffer_m = max_radius, # exclude areas < 126 m from boundaries
pixelsize_m = 50) %>% # pixel resolution
rownames_to_column("point_id") # add unique identifier
grid_points # geometry column has been added
```
Next, use the function `calc_manual()` to summarise each landscape component around each of the point locations in `grid_points`, and at specified buffer radii. For example, to use the bird models, we summarise the landscape data as follows. The predictor variables can be appended to the `grid_points` dataframe as new columns, corresponding to those present in `predictors_birds`.
```{r grid landscape, warning = FALSE, message = FALSE}
predictors_buildings <-
calc_manual(vector = buildings, name = "buildings",
points = grid_points, buffer_sizes = 50,
building_levels = "levels") %>%
lapply(st_drop_geometry) %>%
bind_rows(.id = "radius_m") %>%
pivot_wider(names_from = "radius_m",
values_from = starts_with("man"),
names_glue = "r{radius_m}m_{.value}")
predictors_roads <-
calc_manual(vector = roads, name = "roads",
points = grid_points, buffer_sizes = 50,
road_lanes = "lanes") %>%
lapply(st_drop_geometry) %>%
bind_rows(.id = "radius_m") %>%
pivot_wider(names_from = "radius_m",
values_from = starts_with("man"),
names_glue = "r{radius_m}m_{.value}")
predictors_trees <-
calc_manual(vector = trees, name = "trees",
points = grid_points, buffer_sizes = 50,
plant_species = "species") %>%
lapply(st_drop_geometry) %>%
bind_rows(.id = "radius_m") %>%
pivot_wider(names_from = "radius_m",
values_from = starts_with("man"),
names_glue = "r{radius_m}m_{.value}")
predictors_shrubs <-
calc_manual(vector = shrubs, name = "shrubs",
points = grid_points, buffer_sizes = 50,
plant_species = "species") %>%
lapply(st_drop_geometry) %>%
bind_rows(.id = "radius_m") %>%
pivot_wider(names_from = "radius_m",
values_from = starts_with("man"),
names_glue = "r{radius_m}m_{.value}")
predictors_turf <-
calc_manual(vector = turf, name = "turf",
points = grid_points, buffer_sizes = 50) %>%
lapply(st_drop_geometry) %>%
bind_rows(.id = "radius_m") %>%
pivot_wider(names_from = "radius_m",
values_from = starts_with("man"),
names_glue = "r{radius_m}m_{.value}")
predictors_natveg <-
calc_manual(vector = natveg, name = "natveg",
points = grid_points, buffer_sizes = c(50, 126)) %>%
lapply(st_drop_geometry) %>%
bind_rows(.id = "radius_m") %>%
pivot_wider(names_from = "radius_m",
values_from = starts_with("man"),
names_glue = "r{radius_m}m_{.value}")
predictors_water <-
calc_manual(vector = water, name = "water",
points = grid_points, buffer_sizes = 50) %>%
lapply(st_drop_geometry) %>%
bind_rows(.id = "radius_m") %>%
pivot_wider(names_from = "radius_m",
values_from = starts_with("man"),
names_glue = "r{radius_m}m_{.value}")
# combine all landscape predictors
grid_landscape <- grid_points %>%
inner_join(predictors_buildings) %>%
inner_join(predictors_roads) %>%
inner_join(predictors_trees) %>%
inner_join(predictors_shrubs) %>%
inner_join(predictors_turf) %>%
inner_join(predictors_natveg) %>%
inner_join(predictors_water)
grid_landscape
```
Finally, use the function `predict_heatmap()` to predict the number of bird species (species richness) across the generated grid of spatial points, based on the predictor variables summarised in `grid_landscape`. The model object `models_birds` (suite of 'best' models) and `recipe_birds` (data pre-processing workflow) will be used to make the predictions at each point. Ensure that the argument `pixelsize_m` is similar to the value used in `generate_grid()`.
```{r heatmap_raster}
bird_heatmap <- predict_heatmap(models = models_birds,
recipe_data = recipe_birds,
points_topredict = grid_landscape,
pixelsize_m = 50)
```
<br>
## Visualisation
The continuous raster may be visualised as a heatmap:
```{r plot heatmap, message = FALSE, warning = FALSE, fig.width=2.6, fig.height = 2.0, dpi = 300, out.width="100%"}
tmap_mode("view")
tmap_options(max.raster = c(view = 1e8)) # increase max resolution to be visualised
tm_basemap(c("CartoDB.Positron", "OpenStreetMap")) +
tm_shape(bird_heatmap, raster.downsample = FALSE) +
tm_raster(title = "Number of bird species",
style = "pretty",
palette = "YlOrRd",
alpha = 0.6)
```
<br>
---
## References
Chong KY, Teo S, Kurukulasuriya B, Chung YF, Giam X & Tan HTW (2019) The effects of landscape scale on greenery and traffic relationships with urban birds and butterflies. _Urban Ecosystems_, _22_(5): 917–926.
Chong KY, Teo S, Kurukulasuriya B, Chung YF, Rajathurai S & Tan HTW (2014) Not all green is as good: Different effects of the natural and cultivated components of urban vegetation on bird and butterfly diversity. _Biological Conservation_, _171_: 299–309.