vignettes/apply-models.Rmd

---
title: "Apply Models"
subtitle: "Generate spatial predictions across a landscape"
author: "Hong Jhun Sim"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Apply Models}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

This article outlines the steps to predict the number of species (species richness) for a particular animal group across a target landscape. The pixel-based predictions are then visualised as a spatial heatmap.

<br>

```{r out.width = "70%", fig.align='center', dpi = 300, echo = FALSE}
knitr::include_graphics("framework_apply-models.png", dpi = 300)
```
<center><b> Figure: Broad overview of the data workflow for a chosen animal group </b></center>

<br>

Begin by loading the required packages to run the analysis:

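```{r load required libraries, eval = FALSE, warning = FALSE, message = FALSE}
library("biodivercity")
library("dplyr") # to process/wrangle data
library("tidyr") # for pivot_wider()
library("tibble") # for rownames_to_column()
library("sf") # to handle spatial vector data
library("tmap") # for visualisation
```

```{r load libraries while in dev, include = FALSE}
devtools::load_all() 
library("dplyr") 
library("tidyr")
library("tibble")
library("sf")
library("tmap")
```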

The `biodivercity` package contains predictive models (`lme4::glmer()`) and data pre-processing workflow 'recipes' (`recipes::recipe()`) that were built for each of four animal groups surveyed in Singapore. Refer to `vignette("build-models")` for more details on these models, which were built using the example data in this package. This article uses the model/recipe for predicting the number of bird species as an example. Alternatively, users may provide their own model/recipe for the analysis.

Load the model/recipe objects built exclusively from manually-generated landscape data:

```{r load model objects}
filepath <- system.file("extdata", "models-manually-mapped.Rdata", package = "biodivercity")
load(filepath)
```

Landscape data within the target area of interest will be used to make spatial predictions. They will be summarised according to the predictor variables present in the models. For example, predictor variables in the bird models can be extracted as follows. Note that the prefix `r<value>m` denotes the buffer radius (in metres) within which the landscape data were summarised, and `_man_` denotes that the variables were derived from manually-generated data.

```{r}
predictors_birds <- models_birds %>% # a list object
  lapply(function(x) names(x@frame)) %>% # extract variable names in model
  unlist() %>%
  unique() %>% 
  stringr::str_subset("(?<=^r)\\d+.*") # predictor variables start with "r<value>m_"

predictors_birds
```
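
To make the naming convention concrete, the following base-R sketch splits a few predictor names into their buffer radius and variable name. The names in `example_predictors` are illustrative assumptions that follow the `r<value>m_man_<variable>` pattern described above. They are not necessarily the predictors present in `models_birds`.

```r
# Illustrative names only (assumed to follow the r<value>m_man_<variable> pattern)
example_predictors <- c("r50m_man_buildingAvgLvl",
                        "r50m_man_laneDensity",
                        "r126m_man_natveg_pland")

# capture the radius (digits after "r") and the variable name (after "_man_")
parts <- regmatches(example_predictors,
                    regexec("^r([0-9]+)m_man_(.+)$", example_predictors))

parsed <- data.frame(
  predictor = example_predictors,
  radius_m  = as.numeric(sapply(parts, `[`, 2)), # first capture group
  variable  = sapply(parts, `[`, 3),             # second capture group
  stringsAsFactors = FALSE
)

parsed
```

Grouping `parsed` by `variable` shows which buffer radii need to be summarised for each landscape component.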

The following list describes the manually-generated landscape components used to build the models provided in this package, their respective vector formats, and all the possible predictor variables that may be summarised:

- Natural vegetation (polygons)
    - Percentage of landscape area (`natveg_pland`)
    
- Trees (points) with species name (per point)
    - Species richness (`tree_sprich`)
    
- Shrubs (polygons) with species name (per polygon)
    - Percentage of landscape area (`shrub_pland`)
    - Species richness (`shrub_sprich`)
    
- Turf (polygons)
    - Percentage of landscape area (`turf_pland`)
    
- Water (polygons)
    - Percentage of landscape area (`water_pland`)
    
- Buildings (polygons) each with the number of levels
    - Floor area ratio (`buildingFA_ratio`)    
    - Average number of levels (`buildingAvgLvl`)

- Roads (lines) each with the number of lanes 
    - Lane density (`laneDensity`)
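
As a worked example of a "percentage of landscape area" metric, the sketch below computes a turf percentage within a single 50 m buffer. It assumes that the landscape area is the full circular buffer around a point. The turf area is a made-up value, and the package's own calculation may differ in detail.

```r
# Hypothetical values for a single sampling point
radius_m       <- 50               # buffer radius
buffer_area_m2 <- pi * radius_m^2  # area of the circular buffer (assumed landscape area)
turf_area_m2   <- 1500             # made-up area of turf polygons inside the buffer

# percentage of the buffer covered by turf
turf_pland <- turf_area_m2 / buffer_area_m2 * 100
round(turf_pland, 1)
```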

<br>

To demonstrate how spatial predictions may be generated, load the example landscape data from within the Punggol (PG) area in Singapore (Chong et al., 2014, 2019), visualised in the interactive map below. Note that there are no water bodies mapped in the target area.

```{r load landscape data}
filepath <- system.file("extdata", "pg_layers.Rdata", package = "biodivercity")
load(filepath)
```

```{r plot interactive map for landscape data, echo = FALSE, warning = FALSE, message = FALSE, fig.width=2.6, fig.height = 2.0, dpi = 300, out.width="100%"}

tmap_mode("view")
tmap_options(check.and.fix = TRUE)

tm_basemap(c("CartoDB.Positron")) +
  tm_shape(bound) +
    tm_borders() +
  tm_shape(natveg) +
    tm_polygons(title = "Natural vegetation",
                group = "natveg",
                col = "darkgreen",
                alpha = 0.6,
                border.col = "transparent") +
  tm_shape(trees %>% relocate(species)) +
    tm_dots(title = "Trees",
            group = "trees",
            col = "brown",
            size = 0.001,
            border.col = "transparent") +
  tm_shape(shrubs %>% relocate(species) %>% st_make_valid()) +
    tm_polygons(title = "Shrubs",
                group = "shrubs",
                col = "#ffd92f",
                alpha = 0.6,
                border.col = "transparent") +
  tm_shape(turf) +
    tm_polygons(title = "Turf",
                group = "turf",
                col = "#a6d854",
                alpha = 0.6,
                border.col = "transparent") +
  # tm_shape(water) +
  #   tm_polygons(title = "Water",
  #               group = "water",
  #               col = "blue",
  #               alpha = 0.6,
  #               border.col = "transparent") +
  tm_shape(buildings) +
    tm_polygons(title = "Building levels",
                group = "buildings",
                col = "levels",
                palette = viridis::magma(5),
                style = "jenks",
                border.col = "transparent") + 
  tm_shape(roads) +
    tm_lines(title.col = "Road lanes",
             group = "roads",
             col = "lanes",
             palette = viridis::magma(5),
             style = "fixed",
             breaks = c(2, 4, 6, 8))

```

<br>

## Spatial predictions

To make spatial predictions, the target area is broken up into many smaller points (pixels), where landscape data will be summarised and predictions will be made. First, the function `generate_grid()` will be used to generate a grid over the target area (see Figure 1). The pixel resolution of the grid can be specified with the argument `pixelsize_m`. 

```{r out.width = "70%", fig.align='center', dpi = 300, echo = FALSE, fig.cap = "Figure 1: Visualisation of the function `generate_grid()`."}
knitr::include_graphics("generate_grid.jpg", dpi = 300)
```

An optional argument `innerbuffer_m` is also provided. This limits output predictions to a smaller area within the target landscape (Figure 1). This is useful, for example, when the model requires broad-scale landscape data to make predictions, but the available landscape data do not extend beyond the target area. To avoid inaccurate predictions near the boundary, where buffers would include areas without landscape data, the distance value for this argument should correspond to the largest buffer radius present in the model variables. In this example, the largest radius for variables within the bird models is 126 metres:

```{r}

# get the max radius among all predictor variables
max_radius <- predictors_birds %>% 
  stringr::str_extract("(?<=^r)\\d+") %>%  # extract values for buffer radii
  as.numeric() %>% 
  max(na.rm = TRUE)
  
max_radius
```

With `max_radius` defined, run the function `generate_grid()`. The output is a dataframe of points (see Figure 1), around which landscape variables will be summarised within their respective distance buffers. In this example, we will generate a grid with a pixel resolution of 50 metres.

```{r create grid for each UGS town}
grid_points <- generate_grid(target_areas = bound, # target area
                             innerbuffer_m = max_radius, # exclude areas < 126 m from boundaries
                             pixelsize_m = 50) %>%  # pixel resolution
  rownames_to_column("point_id") # add unique identifier

grid_points # geometry column has been added
```
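
To build intuition for how the pixel size and inner buffer shape the grid, the base-R sketch below lays out pixel centres over a hypothetical rectangular area of 1000 m by 800 m. The function `grid_sketch()` and the rectangle are assumptions made for illustration. This is not how `generate_grid()` is implemented, and real target areas are arbitrary polygons.

```r
# Illustrative only: a rectangle stands in for the target area polygon
grid_sketch <- function(xmin, xmax, ymin, ymax, pixelsize_m, innerbuffer_m = 0) {
  # distance from the outer edge to the centre of the first pixel that fits
  offset <- innerbuffer_m + pixelsize_m / 2
  stopifnot(xmax - xmin >= 2 * offset, ymax - ymin >= 2 * offset)

  # pixel centres along each axis, after shrinking the area by the inner buffer
  xs <- seq(xmin + offset, xmax - offset, by = pixelsize_m)
  ys <- seq(ymin + offset, ymax - offset, by = pixelsize_m)

  pts <- expand.grid(x = xs, y = ys)                   # one row per pixel centre
  pts$point_id <- as.character(seq_len(nrow(pts)))     # unique identifier
  pts
}

toy_grid <- grid_sketch(0, 1000, 0, 800, pixelsize_m = 50, innerbuffer_m = 126)

nrow(toy_grid)     # number of pixel centres
range(toy_grid$x)  # centres stay at least 126 m away from the left/right edges
```

A larger `innerbuffer_m` or `pixelsize_m` reduces the number of points, and with it the time needed to summarise landscape data around each one.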

Next, use the function `calc_manual()` to summarise each landscape component around each of the point locations in `grid_points`, at the specified buffer radii. For example, to use the bird models, we summarise the landscape data as follows. The resulting predictor variables are then appended to the `grid_points` dataframe as new columns, corresponding to those present in `predictors_birds`.

```{r grid landscape, warning = FALSE, message = FALSE}

predictors_buildings <- 
  calc_manual(vector = buildings, name = "buildings",
              points = grid_points, buffer_sizes = 50,
              building_levels = "levels") %>% 
  lapply(st_drop_geometry) %>% 
  bind_rows(.id = "radius_m") %>% 
  pivot_wider(names_from = "radius_m", 
              values_from = starts_with("man"),
              names_glue = "r{radius_m}m_{.value}")

predictors_roads <- 
  calc_manual(vector = roads, name = "roads",
              points = grid_points, buffer_sizes = 50,
              road_lanes = "lanes") %>% 
  lapply(st_drop_geometry) %>% 
  bind_rows(.id = "radius_m") %>% 
  pivot_wider(names_from = "radius_m", 
              values_from = starts_with("man"),
              names_glue = "r{radius_m}m_{.value}")

predictors_trees <- 
  calc_manual(vector = trees, name = "trees",
              points = grid_points, buffer_sizes = 50,
              plant_species = "species") %>% 
  lapply(st_drop_geometry) %>% 
  bind_rows(.id = "radius_m") %>% 
  pivot_wider(names_from = "radius_m", 
              values_from = starts_with("man"),
              names_glue = "r{radius_m}m_{.value}")

predictors_shrubs <- 
  calc_manual(vector = shrubs, name = "shrubs",
              points = grid_points, buffer_sizes = 50,
              plant_species = "species") %>% 
  lapply(st_drop_geometry) %>% 
  bind_rows(.id = "radius_m") %>% 
  pivot_wider(names_from = "radius_m", 
              values_from = starts_with("man"),
              names_glue = "r{radius_m}m_{.value}")

predictors_turf <- 
  calc_manual(vector = turf, name = "turf", 
              points = grid_points, buffer_sizes = 50) %>% 
  lapply(st_drop_geometry) %>% 
  bind_rows(.id = "radius_m") %>% 
  pivot_wider(names_from = "radius_m", 
              values_from = starts_with("man"),
              names_glue = "r{radius_m}m_{.value}")

predictors_natveg <- 
  calc_manual(vector = natveg, name = "natveg",
              points = grid_points, buffer_sizes = c(50, 126)) %>% 
  lapply(st_drop_geometry) %>% 
  bind_rows(.id = "radius_m") %>% 
  pivot_wider(names_from = "radius_m", 
              values_from = starts_with("man"),
              names_glue = "r{radius_m}m_{.value}")

predictors_water <- 
  calc_manual(vector = water, name = "water",
              points = grid_points, buffer_sizes = 50) %>% 
  lapply(st_drop_geometry) %>% 
  bind_rows(.id = "radius_m") %>% 
  pivot_wider(names_from = "radius_m", 
              values_from = starts_with("man"),
              names_glue = "r{radius_m}m_{.value}")


# combine all landscape predictors
grid_landscape <- grid_points %>% 
  inner_join(predictors_buildings) %>% 
  inner_join(predictors_roads) %>% 
  inner_join(predictors_trees) %>% 
  inner_join(predictors_shrubs) %>% 
  inner_join(predictors_turf) %>% 
  inner_join(predictors_natveg) %>% 
  inner_join(predictors_water)

grid_landscape
```
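
The reshaping step repeated in each block above can be seen in isolation with toy data. The sketch below assumes that the summaries arrive as a list of data frames named by buffer radius, each with a `point_id` column and metric columns prefixed with `man_`. This structure is inferred from the code above, and the values are made up.

```r
library("dplyr")
library("tidyr")

# Toy stand-in for per-radius summaries (structure assumed, values made up)
toy_summaries <- list(
  `50`  = data.frame(point_id = c("1", "2"), man_natveg_pland = c(10, 0)),
  `126` = data.frame(point_id = c("1", "2"), man_natveg_pland = c(25, 5))
)

toy_wide <- toy_summaries %>%
  bind_rows(.id = "radius_m") %>%                  # stack, keeping the radius as a column
  pivot_wider(names_from = "radius_m",             # one column per radius...
              values_from = starts_with("man"),    # ...for each metric column
              names_glue = "r{radius_m}m_{.value}")

names(toy_wide)
```

Each point ends up in a single row, with one column per combination of buffer radius and metric, which is the layout that the joins on `grid_points` rely on.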


Finally, use the function `predict_heatmap()` to predict the number of bird species (species richness) across the generated grid of spatial points, based on the predictor variables summarised in `grid_landscape`. The objects `models_birds` (suite of 'best' models) and `recipe_birds` (data pre-processing workflow) will be used to make the predictions at each point. Ensure that the argument `pixelsize_m` matches the value used in `generate_grid()`.

```{r heatmap_raster}
bird_heatmap <- predict_heatmap(models = models_birds, 
                                recipe_data = recipe_birds,
                                points_topredict = grid_landscape, 
                                pixelsize_m = 50)
```
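
Since the predictions rely on the columns in `grid_landscape` corresponding to the model predictors, a quick sanity check is to compare the two sets of names before running the prediction. The sketch below uses toy character vectors in place of `predictors_birds` and `names(grid_landscape)`, and the names themselves are illustrative.

```r
# Toy stand-ins: replace with predictors_birds and names(grid_landscape)
required_predictors <- c("r50m_man_buildingAvgLvl", "r126m_man_natveg_pland")
available_columns   <- c("point_id", "r50m_man_buildingAvgLvl", "r50m_man_natveg_pland")

# predictors that the models need but the summarised data lack
missing_predictors <- setdiff(required_predictors, available_columns)
missing_predictors
```

An empty result means every required predictor has been summarised. Otherwise, the listed names indicate which landscape components or buffer radii still need to be passed to `calc_manual()`.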

<br>

## Visualisation

The continuous raster may be visualised as a heatmap:

```{r plot heatmap, message = FALSE, warning = FALSE, fig.width=2.6, fig.height = 2.0, dpi = 300, out.width="100%"}

tmap_mode("view")
tmap_options(max.raster = c(view = 1e8)) # increase max resolution to be visualised

tm_basemap(c("CartoDB.Positron", "OpenStreetMap")) +
  tm_shape(bird_heatmap, raster.downsample = FALSE) +
  tm_raster(title = "Number of bird species",
            style = "pretty",
            palette = "YlOrRd",
            alpha = 0.6) 
  
```

<br>

---

## References

Chong KY, Teo S, Kurukulasuriya B, Chung YF, Giam X & Tan HTW (2019) The effects of landscape scale on greenery and traffic relationships with urban birds and butterflies. _Urban Ecosystems_, _22_(5): 917–926.

Chong KY, Teo S, Kurukulasuriya B, Chung YF, Rajathurai S & Tan HTW (2014) Not all green is as good: Different effects of the natural and cultivated components of urban vegetation on bird and butterfly diversity. _Biological Conservation_, _171_: 299–309.