diff --git a/previews/PR65/index.html b/previews/PR65/index.html index a276d1c..e924020 100644 --- a/previews/PR65/index.html +++ b/previews/PR65/index.html @@ -533,19 +533,19 @@

High-level types

BONSampler
 

A union of the abstract types BONSeeder and BONRefiner. Both types return a tuple with the coordinates as a vector of CartesianIndex, and the weight matrix as a Matrix of AbstractFloat, in that order.

source

# BiodiversityObservationNetworks.BONSeeder — Type.

abstract type BONSeeder end
 

A BONSeeder is an algorithm for proposing sampling locations using a raster of weights, represented as a matrix, in each cell.

source

# BiodiversityObservationNetworks.BONRefiner — Type.

abstract type BONRefiner end
 

A BONRefiner is an algorithm for proposing sampling locations by refining a set of candidate points to a smaller set of 'best' points.

source

Seeder and refiner functions

@@ -554,11 +554,11 @@

Seeder and refiner functions

seed(sampler::ST, uncertainty::Matrix{T})
 

Produces a set of candidate sampling locations in a vector coords of length numpoints from a raster uncertainty using sampler, where sampler is a BONSeeder.

source
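A usage sketch, mirroring the overview vignette below (the NeutralLandscapes raster stands in for a real uncertainty layer and is purely illustrative):

```julia
using BiodiversityObservationNetworks
using NeutralLandscapes

# An illustrative uncertainty raster
U = rand(MidpointDisplacement(0.5), (100, 100))

# Propose `numpoints` candidate locations, balanced in space;
# the sampler returns (coordinates, weights) in that order
candidates, uncertainty = seed(BalancedAcceptance(; numpoints = 200), U)
```

Here `candidates` is a `Vector{CartesianIndex}` of length `numpoints`, and `uncertainty` is the matrix given as input.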

seed!(coords::Vector{CartesianIndex}, sampler::ST)
 

The curried version of seed!, which returns a function that acts on the input uncertainty layer passed to the curried function (u below).

source

# BiodiversityObservationNetworks.seed! — Function.

seed!(coords::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})
@@ -567,35 +567,35 @@ 

Seeder and refiner functions

  • Seeders work on rasters, refiners work on sets of coordinates.
source

seed!(coords::Vector{CartesianIndex}, sampler::ST)
 

The curried version of seed!, which returns a function that acts on the input uncertainty layer passed to the curried function (u below).

source

# BiodiversityObservationNetworks.refine — Function.

refine(pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})
 

Refines a set of candidate sampling locations and returns a vector coords of length numpoints from a vector of coordinates pool using sampler, where sampler is a BONRefiner.

source
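Continuing from a seeding step, refinement looks like this (a sketch mirroring the overview vignette; the raster is illustrative):

```julia
using BiodiversityObservationNetworks
using NeutralLandscapes

U = rand(MidpointDisplacement(0.5), (100, 100))
candidates, uncertainty = seed(BalancedAcceptance(; numpoints = 200), U)

# Keep the 50 'best' of the 200 candidate locations
locations, _ = refine(candidates, AdaptiveSpatial(; numpoints = 50), uncertainty)
```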

refine(sampler::BONRefiner)
 

Returns a curried function of refine with two methods: both are using the output of seed, one in its packed form, the other in its splatted form.

source
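The curried methods make the whole workflow pipeable; this is the chained form used in the overview vignette:

```julia
using BiodiversityObservationNetworks
using NeutralLandscapes

U = rand(MidpointDisplacement(0.5), (100, 100))

# seed and refine both accept the packed (coords, matrix) tuple,
# so the steps compose with |>; `first` extracts the coordinates
locations =
    U |>
    seed(BalancedAcceptance(; numpoints = 200)) |>
    refine(AdaptiveSpatial(; numpoints = 50)) |>
    first
```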

refine(pack, sampler::BONRefiner)
 

Calls refine on the appropriately splatted version of pack.

source

# BiodiversityObservationNetworks.refine! — Function.

refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})
 

Refines a set of candidate sampling locations in the preallocated vector coords from a vector of coordinates pool using sampler, where sampler is a BONRefiner.

source

refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})
 

The curried version of refine!, which returns a function that acts on the input coordinate pool passed to the curried function (p below).

source

Seeder algorithms

@@ -604,7 +604,7 @@

Seeder algorithms

BalancedAcceptance
 

A BONSeeder that uses Balanced-Acceptance Sampling (van Dam-Bates et al. 2018 https://doi.org/10.1111/2041-210X.13003)

source

Refiner algorithms

@@ -615,13 +615,13 @@

Refiner algorithms

...

numpoints, an Integer (def. 50), specifying the number of points to use.

α, an AbstractFloat (def. 1.0), specifying ...

source

# BiodiversityObservationNetworks.Uniqueness — Type.

Uniqueness
 

A BONRefiner

source

Helper functions

@@ -641,20 +641,20 @@

Helper functions

For each location, the value of the condensed layer tᵢ, corresponding to target i, at coordinate (i,j) is given by the dot product of v⃗ᵢⱼ and the i-th column of W.

(2): Apply a weighted average across each target layer. To produce the final output layer, we apply a weighted average to each target layer, where the weights are provided in the vector α⃗ of length m.

The final value of the squished layer at (i,j) is given by s⃗ᵢⱼ = ∑ₓ αₓ*tᵢⱼ(x), where tᵢⱼ(x) is the value of the x-th target layer at (i,j).

source
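The two steps above can be written out directly; this plain-Julia sketch illustrates the math (it is not the package's implementation, and the layer array layout is an assumption made for the example):

```julia
# layers: r×c×n array of n raster layers (layout assumed for illustration)
# W: n×m weights matrix, each column summing to 1
# α: length-m vector of target weights
function squish_sketch(layers, W, α)
    r, c, n = size(layers)
    m = size(W, 2)
    s = zeros(r, c)
    for i in 1:r, j in 1:c
        v = layers[i, j, :]                   # v⃗ᵢⱼ: values across all n layers
        t = [sum(v .* W[:, x]) for x in 1:m]  # step (1): condense to m targets
        s[i, j] = sum(α .* t)                 # step (2): weighted average of targets
    end
    return s
end

layers = ones(4, 4, 3)       # three constant layers
W = fill(1/3, 3, 2)          # each column sums to 1
α = [0.5, 0.5]
squish_sketch(layers, W, α)  # 4×4 matrix of ones
```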

# BiodiversityObservationNetworks.entropize! — Function.

entropize!(U::Matrix{AbstractFloat}, A::Matrix{Number})
 

This function turns a matrix A (storing measurement values) into pixel-wise entropy values, stored in a matrix U (that is previously allocated).

Pixel-wise entropy is determined by measuring the empirical probability of randomly picking a value in the matrix that is either lower or higher than the pixel value. The entropies of both these probabilities are calculated using the -p×log(2,p) formula. The entropy of the pixel is the sum of the two entropies, so that it is close to 1 for values close to the median, and close to 0 for values close to the extremes of the distribution.

source
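The description above can be sketched in plain Julia (an illustration of the stated procedure, not the package's implementation):

```julia
# Pixel-wise empirical entropy: for each pixel, take the probability that a
# randomly picked matrix value is lower (resp. higher) than the pixel value,
# and sum the -p*log2(p) entropies of those two probabilities.
function entropize_sketch(A::AbstractMatrix{<:Real})
    N = length(A)
    h(p) = p > 0 ? -p * log2(p) : 0.0
    U = similar(A, Float64)
    for idx in eachindex(A)
        plo = count(<(A[idx]), A) / N   # empirical P(value lower than pixel)
        phi = count(>(A[idx]), A) / N   # empirical P(value higher than pixel)
        U[idx] = h(plo) + h(phi)
    end
    return U
end
```

Values near the median get plo ≈ phi ≈ 0.5, so the two entropies sum to roughly 1; at the extremes one probability vanishes and the entropy drops toward 0, matching the description.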

# BiodiversityObservationNetworks.entropize — Function.

entropize(A::Matrix{Number})
 

Allocation version of entropize!.

source

diff --git a/previews/PR65/search/search_index.json b/previews/PR65/search/search_index.json index df6fce4..263b606 100644 --- a/previews/PR65/search/search_index.json +++ b/previews/PR65/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":""},{"location":"#biodiversityobservationnetworksjl","title":"BiodiversityObservationNetworks.jl","text":"

The purpose of this package is to provide a high-level, extensible, modular interface to the selection of sampling points for biodiversity processes in space. It is based around a collection of types representing point selection algorithms, used to select the most informative sampling points based on raster data. Specifically, many algorithms work from a layer indicating the entropy of a model-based prediction at each location.

This package is in development

The BiodiversityObservationNetworks.jl package is currently under development. The API is not expected to change a lot, but it may change in order to facilitate the integration of new features.

"},{"location":"#high-level-types","title":"High-level types","text":"

# BiodiversityObservationNetworks.BONSampler \u2014 Type.

BONSampler\n

A union of the abstract types BONSeeder and BONRefiner. Both types return a tuple with the coordinates as a vector of CartesianIndex, and the weight matrix as a Matrix of AbstractFloat, in that order.

source

# BiodiversityObservationNetworks.BONSeeder \u2014 Type.

abstract type BONSeeder end\n

A BONSeeder is an algorithm for proposing sampling locations using a raster of weights, represented as a matrix, in each cell.

source

# BiodiversityObservationNetworks.BONRefiner \u2014 Type.

abstract type BONRefiner end\n

A BONRefiner is an algorithm for proposing sampling locations by refining a set of candidate points to a smaller set of 'best' points.

source

"},{"location":"#seeder-and-refiner-functions","title":"Seeder and refiner functions","text":"

# BiodiversityObservationNetworks.seed \u2014 Function.

seed(sampler::ST, uncertainty::Matrix{T})\n

Produces a set of candidate sampling locations in a vector coords of length numpoints from a raster uncertainty using sampler, where sampler is a BONSeeder.

source

seed!(coords::Vector{CartesianIndex}, sampler::ST)\n

The curried version of seed!, which returns a function that acts on the input uncertainty layer passed to the curried function (u below).

source

# BiodiversityObservationNetworks.seed! \u2014 Function.

seed!(coords::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n

Puts a set of candidate sampling locations in the preallocated vector coords from a raster uncertainty using sampler, where sampler is a BONSeeder.

  • Seeders work on rasters, refiners work on sets of coordinates.

source

seed!(coords::Vector{CartesianIndex}, sampler::ST)\n

The curried version of seed!, which returns a function that acts on the input uncertainty layer passed to the curried function (u below).

source

# BiodiversityObservationNetworks.refine \u2014 Function.

refine(pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n

Refines a set of candidate sampling locations and returns a vector coords of length numpoints from a vector of coordinates pool using sampler, where sampler is a BONRefiner.

source

refine(sampler::BONRefiner)\n

Returns a curried function of refine with two methods: both are using the output of seed, one in its packed form, the other in its splatted form.

source

refine(pack, sampler::BONRefiner)\n

Calls refine on the appropriately splatted version of pack.

source

# BiodiversityObservationNetworks.refine! \u2014 Function.

refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n

Refines a set of candidate sampling locations in the preallocated vector coords from a vector of coordinates pool using sampler, where sampler is a BONRefiner.

source

refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n

The curried version of refine!, which returns a function that acts on the input coordinate pool passed to the curried function (p below).

source

"},{"location":"#seeder-algorithms","title":"Seeder algorithms","text":"

# BiodiversityObservationNetworks.BalancedAcceptance \u2014 Type.

BalancedAcceptance\n

A BONSeeder that uses Balanced-Acceptance Sampling (van Dam-Bates et al. 2018 https://doi.org/10.1111/2041-210X.13003)

source

"},{"location":"#refiner-algorithms","title":"Refiner algorithms","text":"

# BiodiversityObservationNetworks.AdaptiveSpatial \u2014 Type.

AdaptiveSpatial\n

...

numpoints, an Integer (def. 50), specifying the number of points to use.

\u03b1, an AbstractFloat (def. 1.0), specifying ...

source

# BiodiversityObservationNetworks.Uniqueness \u2014 Type.

Uniqueness\n

A BONRefiner

source

"},{"location":"#helper-functions","title":"Helper functions","text":"

# BiodiversityObservationNetworks.squish \u2014 Function.

squish(layers, W, \u03b1)\n

Takes a set of n layers and squishes them down to a single layer.

    numcolumns = size(W,2)\n    for i in 1:numcolumns\n        W[:,i] ./= sum(W[:,i])\n    end\n

For a coordinate in the raster (i,j), denote the vector of values across all locations at that coordinate v\u20d7\u1d62\u2c7c. The value at that coordinate in squished layer, s\u20d7\u1d62\u2c7c, is computed in two steps.

(1): First we apply a weights matrix W, with n rows and m columns (m < n), to reduce the initial n layers down to a set of m layers, each of which corresponds to a particular target of optimization. For example, we may want to propose sampling locations that are optimized to best balance multiple criteria, like (a) the current distribution of a species and (b) whether that distribution is changing over time.

Each entry in the weights matrix W corresponds to the 'importance' of the layer in the corresponding row to the successful measurement of the target of the corresponding column. As such, each column of W must sum to 1.0.

For each location, the value of the condensed layer t\u1d62, corresponding to target i, at coordinate (i,j) is given by the dot product of v\u20d7\u1d62\u2c7c and the i-th column of W.

(2): Apply a weighted average across each target layer. To produce the final output layer, we apply a weighted average to each target layer, where the weights are provided in the vector \u03b1\u20d7 of length m.

The final value of the squished layer at (i,j) is given by s\u20d7\u1d62\u2c7c = \u2211\u2093 \u03b1\u2093*t\u1d62\u2c7c(x), where t\u1d62\u2c7c(x) is the value of the x-th target layer at (i,j).

source

# BiodiversityObservationNetworks.entropize! \u2014 Function.

entropize!(U::Matrix{AbstractFloat}, A::Matrix{Number})\n

This function turns a matrix A (storing measurement values) into pixel-wise entropy values, stored in a matrix U (that is previously allocated).

Pixel-wise entropy is determined by measuring the empirical probability of randomly picking a value in the matrix that is either lower or higher than the pixel value. The entropies of both these probabilities are calculated using the -p\u00d7log(2,p) formula. The entropy of the pixel is the sum of the two entropies, so that it is close to 1 for values close to the median, and close to 0 for values close to the extremes of the distribution.

source

# BiodiversityObservationNetworks.entropize \u2014 Function.

entropize(A::Matrix{Number})\n

Allocation version of entropize!.

source

"},{"location":"bibliography/","title":"Bibliography","text":""},{"location":"bibliography/#references","title":"References","text":""},{"location":"vignettes/entropize/","title":"Entropize","text":""},{"location":"vignettes/entropize/#getting-the-entropy-matrix","title":"Getting the entropy matrix","text":"

For some applications, we want to place points to capture the maximum amount of information, which is to say that we want to sample a balance of entropy values, as opposed to absolute values. In this vignette, we will walk through an example using the entropize function to convert raw data to entropy values.

using BiodiversityObservationNetworks\nusing NeutralLandscapes\nusing CairoMakie\n

Entropy is problem-specific

The solution presented in this vignette is a least-assumption solution based on the empirical values given in a matrix of measurements. In a lot of situations, this is not the entropy that you want. For example, if your pixels are storing probabilities of Bernoulli events, you can directly use the entropy of the events in the entropy matrix.

We start by generating a random matrix of measurements:

measurements = rand(MidpointDisplacement(), (200, 200)) .* 100\nheatmap(measurements)\n

Using the entropize function will convert these values into entropy at the pixel scale:

U = entropize(measurements)\nheatmap(U)\n

The values closest to the median of the distribution have the highest entropy, and the values closest to its extrema have an entropy of 0. The entropy matrix is guaranteed to have values on the unit interval.

We can use entropize as part of a pipeline, and overlay the points optimized based on entropy on the measurement map:

locations =\n    measurements |> entropize |> seed(BalancedAcceptance(; numpoints = 100)) |> first\nheatmap(U)\n

"},{"location":"vignettes/overview/","title":"Overview","text":""},{"location":"vignettes/overview/#an-introduction-to-biodiversityobservationnetworks","title":"An introduction to BiodiversityObservationNetworks","text":"

In this vignette, we will walk through the basic functionalities of the package, by generating a random uncertainty matrix, and then using a seeder and a refiner to decide which locations should be sampled in order to gain more insights about the process generating this entropy.

using BiodiversityObservationNetworks\nusing NeutralLandscapes\nusing CairoMakie\n

In order to simplify the process, we will use the NeutralLandscapes package to generate a 100\u00d7100 pixels landscape, where each cell represents the entropy (or information content) in a unit we can sample:

U = rand(MidpointDisplacement(0.5), (100, 100))\nheatmap(U)\n

In practice, this uncertainty matrix is likely to be derived from an application of the hyper-parameters optimization step, which is detailed in other vignettes.

The first step of defining a series of locations to sample is to use a BONSeeder, which will generate a number of relatively coarse proposals that cover the entire landscape, and have a balanced distribution in space. We do so using the BalancedAcceptance sampler, which can be tweaked to capture more (or less) uncertainty. To start with, we will extract 200 candidate points, i.e. 200 possible locations which will then be refined.

pack = seed(BalancedAcceptance(; numpoints = 200), U);\n
(CartesianIndex[CartesianIndex(81, 10), CartesianIndex(19, 44), CartesianIndex(69, 77), CartesianIndex(44, 21), CartesianIndex(94, 55), CartesianIndex(12, 88), CartesianIndex(62, 33), CartesianIndex(37, 66), CartesianIndex(87, 99), CartesianIndex(25, 4)  \u2026  CartesianIndex(55, 44), CartesianIndex(30, 77), CartesianIndex(80, 22), CartesianIndex(18, 55), CartesianIndex(68, 89), CartesianIndex(43, 33), CartesianIndex(93, 66), CartesianIndex(12, 100), CartesianIndex(62, 1), CartesianIndex(37, 35)], [0.6457023076171001 0.6530873362023615 \u2026 0.4802747697941122 0.495825668828999; 0.6662755323284345 0.6646109378460477 \u2026 0.4476078092501954 0.5244434382647728; \u2026 ; 0.20054927953972762 0.19293310548992137 \u2026 0.4423561795343221 0.45279935857264714; 0.2781285320931762 0.2056391355970853 \u2026 0.4695128732448035 0.41903900630023805])\n

The output of a BONSampler (whether at the seeding or refinement step) is always a tuple, storing in the first position a vector of CartesianIndex elements, and in the second position the matrix given as input. We can have a look at the first five points:

first(pack)[1:5]\n
5-element Vector{CartesianIndex}:\n CartesianIndex(81, 10)\n CartesianIndex(19, 44)\n CartesianIndex(69, 77)\n CartesianIndex(44, 21)\n CartesianIndex(94, 55)\n

Although returning the input matrix may seem redundant, it allows chaining samplers together to build pipelines that take a matrix as input and return a set of places to sample as output; an example is given below.

The positions of locations to sample are given as a vector of CartesianIndex, which are coordinates in the uncertainty matrix. Once we have generated a candidate proposal, we can further refine it using a BONRefiner \u2013 in this case, AdaptiveSpatial, which performs adaptive spatial sampling (maximizing the distribution of entropy while minimizing spatial auto-correlation).

candidates, uncertainty = pack\nlocations, _ = refine(candidates, AdaptiveSpatial(; numpoints = 50), uncertainty)\nlocations[1:5]\n
5-element Vector{CartesianIndex}:\n CartesianIndex(10, 1)\n CartesianIndex(3, 44)\n CartesianIndex(1, 49)\n CartesianIndex(4, 50)\n CartesianIndex(8, 54)\n

The reason we start from a candidate set of points is that some algorithms struggle with full landscapes, and work much better with a sub-sample of them. There is no hard rule (or no heuristic) to get a sense for how many points should be generated at the seeding step, and so experimentation is a must!

The previous code examples used a version of the seed and refine functions that is very useful if you want to change arguments between steps, or examine the content of the candidate pool of points. In addition to this syntax, both functions have a curried version that allows chaining them together using pipes (|>):

locations =\n    U |>\n    seed(BalancedAcceptance(; numpoints = 200)) |>\n    refine(AdaptiveSpatial(; numpoints = 50)) |>\n    first\n
50-element Vector{CartesianIndex}:\n CartesianIndex(1, 60)\n CartesianIndex(25, 50)\n CartesianIndex(4, 62)\n CartesianIndex(28, 47)\n CartesianIndex(27, 43)\n CartesianIndex(32, 49)\n CartesianIndex(34, 53)\n CartesianIndex(8, 65)\n CartesianIndex(39, 52)\n CartesianIndex(41, 54)\n \u22ee\n CartesianIndex(67, 9)\n CartesianIndex(42, 42)\n CartesianIndex(92, 76)\n CartesianIndex(11, 20)\n CartesianIndex(61, 53)\n CartesianIndex(36, 87)\n CartesianIndex(86, 31)\n CartesianIndex(23, 65)\n CartesianIndex(73, 98)\n

This works because seed and refine have curried versions that can be used directly in a pipeline. Proposed sampling locations can then be overlayed onto the original uncertainty matrix:

plt = heatmap(U)\n#scatter!(plt, [x[1] for x in locations], [x[2] for x in locations], ms=2.5, mc=:white, label=\"\")\n

"},{"location":"vignettes/uniqueness/","title":"Uniqueness.jl","text":""},{"location":"vignettes/uniqueness/#selecting-environmentally-unique-locations","title":"Selecting environmentally unique locations","text":"

For some applications, we want to sample a set of locations that cover a broad range of values in environment space. Another way to rephrase this problem is to say we want to find the set of points with the least covariance in their environmental values.

To do this, we use a BONRefiner called Uniqueness. We'll start by loading the required packages.

using BiodiversityObservationNetworks\nusing SpeciesDistributionToolkit\nusing StatsBase\nusing NeutralLandscapes\nusing CairoMakie\n
\u250c Error: Error during loading of extension SDMToolkitExt of BiodiversityObservationNetworks, use `Base.retry_load_extensions()` to retry.\n\u2502   exception =\n\u2502    1-element ExceptionStack:\n\u2502    ArgumentError: Package SDMToolkitExt [75f002dd-b8c9-5168-9854-88ac03dd3cdb] is required but does not seem to be installed:\n\u2502     - Run `Pkg.instantiate()` to install all recorded dependencies.\n\u2502\n\u2502    Stacktrace:\n\u2502      [1] _require(pkg::Base.PkgId, env::Nothing)\n\u2502        @ Base ./loading.jl:1774\n\u2502      [2] _require_prelocked(uuidkey::Base.PkgId, env::Nothing)\n\u2502        @ Base ./loading.jl:1660\n\u2502      [3] _require_prelocked(uuidkey::Base.PkgId)\n\u2502        @ Base ./loading.jl:1658\n\u2502      [4] run_extension_callbacks(extid::Base.ExtensionId)\n\u2502        @ Base ./loading.jl:1255\n\u2502      [5] run_extension_callbacks(pkgid::Base.PkgId)\n\u2502        @ Base ./loading.jl:1290\n\u2502      [6] run_package_callbacks(modkey::Base.PkgId)\n\u2502        @ Base ./loading.jl:1124\n\u2502      [7] _require_prelocked(uuidkey::Base.PkgId, env::String)\n\u2502        @ Base ./loading.jl:1667\n\u2502      [8] macro expansion\n\u2502        @ ./loading.jl:1648 [inlined]\n\u2502      [9] macro expansion\n\u2502        @ ./lock.jl:267 [inlined]\n\u2502     [10] require(into::Module, mod::Symbol)\n\u2502        @ Base ./loading.jl:1611\n\u2502     [11] eval\n\u2502        @ ./boot.jl:370 [inlined]\n\u2502     [12] #17\n\u2502        @ ~/.julia/packages/Documenter/bYYzK/src/Expanders.jl:629 [inlined]\n\u2502     [13] cd(f::Documenter.Expanders.var\"#17#19\"{Module, Expr}, dir::String)\n\u2502        @ Base.Filesystem ./file.jl:112\n\u2502     [14] (::Documenter.Expanders.var\"#16#18\"{Documenter.Documents.Page, Module, Expr})()\n\u2502        @ Documenter.Expanders ~/.julia/packages/Documenter/bYYzK/src/Expanders.jl:628\n\u2502     [15] (::IOCapture.var\"#4#7\"{DataType, 
Documenter.Expanders.var\"#16#18\"{Documenter.Documents.Page, Module, Expr}, IOContext{Base.PipeEndpoint}, IOContext{Base.PipeEndpoint}, Base.PipeEndpoint, Base.PipeEndpoint})()\n\u2502        @ IOCapture ~/.julia/packages/IOCapture/Rzdxd/src/IOCapture.jl:161\n\u2502     [16] with_logstate(f::Function, logstate::Any)\n\u2502        @ Base.CoreLogging ./logging.jl:514\n\u2502     [17] with_logger\n\u2502        @ ./logging.jl:626 [inlined]\n\u2502     [18] capture(f::Documenter.Expanders.var\"#16#18\"{Documenter.Documents.Page, Module, Expr}; rethrow::Type, color::Bool, passthrough::Bool, capture_buffer::IOBuffer)\n\u2502        @ IOCapture ~/.julia/packages/IOCapture/Rzdxd/src/IOCapture.jl:158\n\u2502     [19] runner(#unused#::Type{Documenter.Expanders.ExampleBlocks}, x::Markdown.Code, page::Documenter.Documents.Page, doc::Documenter.Documents.Document)\n\u2502        @ Documenter.Expanders ~/.julia/packages/Documenter/bYYzK/src/Expanders.jl:627\n\u2502     [20] dispatch(::Type{Documenter.Expanders.ExpanderPipeline}, ::Markdown.Code, ::Vararg{Any})\n\u2502        @ Documenter.Utilities.Selectors ~/.julia/packages/Documenter/bYYzK/src/Utilities/Selectors.jl:170\n\u2502     [21] expand(doc::Documenter.Documents.Document)\n\u2502        @ Documenter.Expanders ~/.julia/packages/Documenter/bYYzK/src/Expanders.jl:42\n\u2502     [22] runner(#unused#::Type{Documenter.Builder.ExpandTemplates}, doc::Documenter.Documents.Document)\n\u2502        @ Documenter.Builder ~/.julia/packages/Documenter/bYYzK/src/Builder.jl:226\n\u2502     [23] dispatch(#unused#::Type{Documenter.Builder.DocumentPipeline}, x::Documenter.Documents.Document)\n\u2502        @ Documenter.Utilities.Selectors ~/.julia/packages/Documenter/bYYzK/src/Utilities/Selectors.jl:170\n\u2502     [24] #2\n\u2502        @ ~/.julia/packages/Documenter/bYYzK/src/Documenter.jl:273 [inlined]\n\u2502     [25] cd(f::Documenter.var\"#2#3\"{Documenter.Documents.Document}, dir::String)\n\u2502        @ Base.Filesystem 
./file.jl:112\n\u2502     [26] #makedocs#1\n\u2502        @ ~/.julia/packages/Documenter/bYYzK/src/Documenter.jl:272 [inlined]\n\u2502     [27] top-level scope\n\u2502        @ ~/work/BiodiversityObservationNetworks.jl/BiodiversityObservationNetworks.jl/docs/make.jl:10\n\u2502     [28] include(mod::Module, _path::String)\n\u2502        @ Base ./Base.jl:457\n\u2502     [29] exec_options(opts::Base.JLOptions)\n\u2502        @ Base ./client.jl:307\n\u2502     [30] _start()\n\u2502        @ Base ./client.jl:522\n\u2514 @ Base loading.jl:1261\n[ Info: Loading NeutralLandscapes support for SimpleSDMLayers.jl...\n

Consider setting your SDMLAYERS_PATH

When accessing data using SimpleSDMDatasets.jl, it is best to set the SDMLAYERS_PATH environment variable to tell SimpleSDMDatasets.jl where to download data. This can be done by setting ENV[\"SDMLAYERS_PATH\"] = \"/home/user/Data/\" or similar in the ~/.julia/etc/julia/startup.jl file. (Note this will be different depending on where Julia is installed.)

bbox = (left=-83.0, bottom=46.4, right=-55.2, top=63.7);\ntemp, precip, elevation =\n    convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, AverageTemperature); bbox...)),\n    convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, Precipitation); bbox...)),\n    convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, Elevation); bbox...));\n
(SDM response \u2192 105\u00d7167 grid with 11478 Float32-valued cells, SDM response \u2192 105\u00d7167 grid with 11478 Float32-valued cells, SDM response \u2192 105\u00d7167 grid with 11478 Float32-valued cells)\n

Now we'll use the stack function to combine our four environmental layers into a single, 3-dimensional array, which we'll pass to our Uniqueness refiner.

layers = BiodiversityObservationNetworks.stack([temp,precip,elevation]);\n

(Note: this requires NeutralLandscapes v0.1.2 or later.)

uncert = rand(MidpointDisplacement(0.8), size(temp), mask=temp);\nheatmap(uncert)\n

Now we'll get a set of candidate points from a BalancedAcceptance seeder that has no bias toward higher uncertainty values.

candpts, uncert = uncert |> seed(BalancedAcceptance(numpoints=100, \u03b1=0.0));\n

Now we'll refine our 100 candidate points down to the 30 most environmentally unique.

finalpts, uncert = refine(candpts, Uniqueness(;numpoints=30, layers=layers), uncert)\n#=\nheatmap(uncert)\nscatter!([p[2] for p in candpts], [p[1] for p in candpts], fa=0.0, msc=:white, label=\"Candidate Points\")\nscatter!([p[2] for p in finalpts], [p[1] for p in finalpts], c=:dodgerblue, msc=:white, label=\"Selected Points\")=#\n

"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":""},{"location":"#biodiversityobservationnetworksjl","title":"BiodiversityObservationNetworks.jl","text":"

The purpose of this package is to provide a high-level, extensible, modular interface to the selection of sampling points for biodiversity processes in space. It is based around a collection of types representing point selection algorithms, used to select the most informative sampling points based on raster data. Specifically, many algorithms work from a layer indicating the entropy of a model-based prediction at each location.

This package is in development

The BiodiversityObservationNetworks.jl package is currently under development. The API is not expected to change a lot, but it may change in order to facilitate the integration of new features.

"},{"location":"#high-level-types","title":"High-level types","text":"

# BiodiversityObservationNetworks.BONSampler \u2014 Type.

BONSampler\n

A union of the abstract types BONSeeder and BONRefiner. Both types return a tuple with the coordinates as a vector of CartesianIndex, and the weight matrix as a Matrix of AbstractFloat, in that order.

source

# BiodiversityObservationNetworks.BONSeeder \u2014 Type.

abstract type BONSeeder end\n

A BONSeeder is an algorithm for proposing sampling locations using a raster of weights, represented as a matrix, in each cell.

source

# BiodiversityObservationNetworks.BONRefiner \u2014 Type.

abstract type BONRefiner end\n

A BONRefiner is an algorithm for proposing sampling locations by refining a set of candidate points to a smaller set of 'best' points.

source

"},{"location":"#seeder-and-refiner-functions","title":"Seeder and refiner functions","text":"

# BiodiversityObservationNetworks.seed \u2014 Function.

seed(sampler::ST, uncertainty::Matrix{T})\n

Produces a set of candidate sampling locations in a vector coords of length numpoints from a raster uncertainty using sampler, where sampler is a BONSeeder.

source

seed!(coords::Vector{CartesianIndex}, sampler::ST)\n

The curried version of seed!, which returns a function that acts on the input uncertainty layer passed to the curried function (u below).

source

# BiodiversityObservationNetworks.seed! \u2014 Function.

seed!(coords::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n

Puts a set of candidate sampling locations in the preallocated vector coords from a raster uncertainty using sampler, where sampler is a BONSeeder.

  • Seeders work on rasters, refiners work on sets of coordinates.

source

seed!(coords::Vector{CartesianIndex}, sampler::ST)\n

The curried version of seed!, which returns a function that acts on the input uncertainty layer passed to the curried function (u below).

source

# BiodiversityObservationNetworks.refine \u2014 Function.

refine(pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n

Refines a set of candidate sampling locations and returns a vector coords of length numpoints from a vector of coordinates pool using sampler, where sampler is a BONRefiner.

source

refine(sampler::BONRefiner)\n

Returns a curried function of refine with two methods: both are using the output of seed, one in its packed form, the other in its splatted form.

source

refine(pack, sampler::BONRefiner)\n

Calls refine on the appropriately splatted version of pack.

source

# BiodiversityObservationNetworks.refine! \u2014 Function.

refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n

Refines a set of candidate sampling locations in the preallocated vector coords from a vector of coordinates pool using sampler, where sampler is a BONRefiner.

source

refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n

The curried version of refine!, which returns a function that acts on the input coordinate pool passed to the curried function (p below).

source

"},{"location":"#seeder-algorithms","title":"Seeder algorithms","text":"

# BiodiversityObservationNetworks.BalancedAcceptance \u2014 Type.

BalancedAcceptance\n

A BONSeeder that uses Balanced-Acceptance Sampling (van Dam-Bates et al. 2018 https://doi.org/10.1111/2041-210X.13003)

source

"},{"location":"#refiner-algorithms","title":"Refiner algorithms","text":"

# BiodiversityObservationNetworks.AdaptiveSpatial \u2014 Type.

AdaptiveSpatial\n

...

numpoints, an Integer (def. 50), specifying the number of points to use.

\u03b1, an AbstractFloat (def. 1.0), specifying ...

source

# BiodiversityObservationNetworks.Uniqueness \u2014 Type.

Uniqueness\n

A BONRefiner

source

"},{"location":"#helper-functions","title":"Helper functions","text":"

# BiodiversityObservationNetworks.squish \u2014 Function.

squish(layers, W, α)

Takes a set of n layers and squishes them down to a single layer.

    numcolumns = size(W,2)
    for i in 1:numcolumns
        W[:,i] ./= sum(W[:,i])
    end

For a coordinate (i,j) in the raster, denote the vector of values across all layers at that coordinate v⃗ᵢⱼ. The value at that coordinate in the squished layer, sᵢⱼ, is computed in two steps.

(1): First, we apply a weights matrix W, with n rows and m columns (m < n), to reduce the initial n layers down to a set of m layers, each of which corresponds to a particular target of optimization. For example, we may want to propose sampling locations that are optimized to sample a balance of multiple criteria, like (a) the current distribution of a species and (b) whether that distribution is changing over time.

Each entry in the weights matrix W corresponds to the 'importance' of the layer in the corresponding row to the successful measurement of the target of the corresponding column. As such, each column of W must sum to 1.0.

For each location, the value of the condensed layer corresponding to target x, at coordinate (i,j), is given by the dot product of v⃗ᵢⱼ and the x-th column of W.

(2): Apply a weighted average across the target layers. To produce the final output layer, we apply a weighted average to each target layer, where the weights are provided in the vector α⃗ of length m.

The final value of the squished layer at (i,j) is given by sᵢⱼ = ∑ₓ αₓ·tᵢⱼ(x), where tᵢⱼ(x) is the value of the x-th target layer at (i,j).

source
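The two steps above can be sketched in plain Julia (an illustrative reimplementation under the stated assumptions, not the package's own squish; the layer stack is represented here as a rows × cols × n array):

```julia
# X: rows × cols × n array of n input layers
# W: n × m weights matrix, each column summing to 1
# α: length-m vector of target weights
function squish_sketch(X, W, α)
    rows, cols, n = size(X)
    m = size(W, 2)
    # Step 1: condense the n layers into m target layers via W
    T = zeros(rows, cols, m)
    for x in 1:m, j in 1:cols, i in 1:rows
        T[i, j, x] = sum(X[i, j, k] * W[k, x] for k in 1:n)
    end
    # Step 2: weighted average of the m target layers by α
    S = zeros(rows, cols)
    for j in 1:cols, i in 1:rows
        S[i, j] = sum(α[x] * T[i, j, x] for x in 1:m)
    end
    return S
end

# Example: two layers condensed into a single target layer
X = rand(10, 10, 2)
W = reshape([0.5, 0.5], 2, 1)   # one target, equal importance
S = squish_sketch(X, W, [1.0])
```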

# BiodiversityObservationNetworks.entropize! — Function.

entropize!(U::Matrix{AbstractFloat}, A::Matrix{Number})

This function turns a matrix A (storing measurement values) into pixel-wise entropy values, stored in a matrix U (that is previously allocated).

Pixel-wise entropy is determined by measuring the empirical probability of randomly picking a value in the matrix that is either lower or higher than the pixel value. The entropies of these two probabilities are calculated using the -p×log(2,p) formula. The entropy of the pixel is the sum of the two entropies, so that it is close to 1 for values close to the median, and close to 0 for values close to the extremes of the distribution.

source
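A minimal sketch of this empirical rule in plain Julia (an illustrative reimplementation, not the package's own entropize!; ties are assumed to be negligible):

```julia
# Entropy contribution of one probability: -p * log2(p), with the
# convention that the contribution is 0 at p = 0 and p = 1.
h(p) = (p <= 0.0 || p >= 1.0) ? 0.0 : -p * log2(p)

function entropize_sketch(A::Matrix{<:Real})
    N = length(A)
    U = similar(A, Float64)
    for i in eachindex(A)
        p = count(<(A[i]), A) / N   # share of values lower than this pixel
        U[i] = h(p) + h(1.0 - p)    # sum of the two entropies
    end
    return U
end
```

For a pixel at the median, p ≈ 0.5 and the entropy approaches 1; for a pixel at an extremum, one of the two probabilities vanishes and the entropy approaches 0, matching the behaviour described above.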

# BiodiversityObservationNetworks.entropize — Function.

entropize(A::Matrix{Number})

Allocating version of entropize!.

source

"},{"location":"bibliography/","title":"Bibliography","text":""},{"location":"bibliography/#references","title":"References","text":""},{"location":"vignettes/entropize/","title":"Entropize","text":""},{"location":"vignettes/entropize/#getting-the-entropy-matrix","title":"Getting the entropy matrix","text":"

For some applications, we want to place points to capture the maximum amount of information, which is to say that we want to sample a balance of entropy values, as opposed to absolute values. In this vignette, we will walk through an example using the entropize function to convert raw data to entropy values.

using BiodiversityObservationNetworks
using NeutralLandscapes
using CairoMakie

Entropy is problem-specific

The solution presented in this vignette is a least-assumption solution based on the empirical values given in a matrix of measurements. In a lot of situations, this is not the entropy that you want. For example, if your pixels are storing probabilities of Bernoulli events, you can directly use the entropy of the events in the entropy matrix.
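For instance, if a matrix P stores per-pixel Bernoulli probabilities, the entropy of each event can be computed directly (a hedged sketch; the exact layer type will depend on your data):

```julia
P = rand(200, 200)  # stand-in per-pixel probabilities of a Bernoulli event

# Shannon entropy of a Bernoulli(p) variable, in bits
bernoulli_entropy(p) = (p <= 0.0 || p >= 1.0) ? 0.0 : -p * log2(p) - (1 - p) * log2(1 - p)

U = bernoulli_entropy.(P)  # use this in place of the empirical entropize
```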

We start by generating a random matrix of measurements:

measurements = rand(MidpointDisplacement(), (200, 200)) .* 100
heatmap(measurements)

Using the entropize function will convert these values into entropy at the pixel scale:

U = entropize(measurements)
heatmap(U)

The values closest to the median of the distribution have the highest entropy, and the values closest to its extrema have an entropy of 0. The entropy matrix is guaranteed to have values on the unit interval.

We can use entropize as part of a pipeline, and overlay the points optimized based on entropy on the measurement map:

locations =
    measurements |> entropize |> seed(BalancedAcceptance(; numpoints = 100)) |> first
heatmap(U)

"},{"location":"vignettes/overview/","title":"Overview","text":""},{"location":"vignettes/overview/#an-introduction-to-biodiversityobservationnetworks","title":"An introduction to BiodiversityObservationNetworks","text":"

In this vignette, we will walk through the basic functionalities of the package, by generating a random uncertainty matrix, and then using a seeder and a refiner to decide which locations should be sampled in order to gain more insights about the process generating this entropy.

using BiodiversityObservationNetworks
using NeutralLandscapes
using CairoMakie

In order to simplify the process, we will use the NeutralLandscapes package to generate a 100×100 pixel landscape, where each cell represents the entropy (or information content) in a unit we can sample:

U = rand(MidpointDisplacement(0.5), (100, 100))
heatmap(U)

In practice, this uncertainty matrix is likely to be derived from a hyperparameter optimization step, which is detailed in other vignettes.

The first step of defining a series of locations to sample is to use a BONSeeder, which will generate a number of relatively coarse proposals that cover the entire landscape, and have a balanced distribution in space. We do so using the BalancedAcceptance sampler, which can be tweaked to capture more (or less) uncertainty. To start with, we will extract 200 candidate points, i.e. 200 possible locations which will then be refined.

pack = seed(BalancedAcceptance(; numpoints = 200), U);
(CartesianIndex[CartesianIndex(8, 96), CartesianIndex(58, 11), CartesianIndex(33, 44), CartesianIndex(83, 77), CartesianIndex(20, 22), CartesianIndex(70, 55), CartesianIndex(45, 88), CartesianIndex(95, 33), CartesianIndex(4, 66), CartesianIndex(54, 100)  \u2026  CartesianIndex(38, 8), CartesianIndex(88, 41), CartesianIndex(7, 75), CartesianIndex(57, 19), CartesianIndex(32, 52), CartesianIndex(82, 86), CartesianIndex(20, 30), CartesianIndex(70, 64), CartesianIndex(45, 97), CartesianIndex(95, 2)], [0.4863215941562187 0.5474279971396894 \u2026 0.36580931883696505 0.27899156740470715; 0.4513559228215274 0.5185207622794403 \u2026 0.34948599040746564 0.2581362551882851; \u2026 ; 0.09279607503606678 0.07077776915686923 \u2026 0.2161624071116984 0.3094713783786546; 0.08152059396290053 0.0275472787091889 \u2026 0.22023075038227477 0.32201840163660145])\n

The output of a BONSampler (whether at the seeding or refinement step) is always a tuple, storing in the first position a vector of CartesianIndex elements, and in the second position the matrix given as input. We can have a look at the first five points:

first(pack)[1:5]
5-element Vector{CartesianIndex}:
 CartesianIndex(8, 96)
 CartesianIndex(58, 11)
 CartesianIndex(33, 44)
 CartesianIndex(83, 77)
 CartesianIndex(20, 22)

Although returning the input matrix may seem redundant, it allows chaining samplers together to build pipelines that take a matrix as input and return a set of places to sample as output; an example is given below.

The positions of locations to sample are given as a vector of CartesianIndex, which are coordinates in the uncertainty matrix. Once we have generated a candidate proposal, we can further refine it using a BONRefiner – in this case, AdaptiveSpatial, which performs adaptive spatial sampling (maximizing the distribution of entropy while minimizing spatial auto-correlation).

candidates, uncertainty = pack
locations, _ = refine(candidates, AdaptiveSpatial(; numpoints = 50), uncertainty)
locations[1:5]
5-element Vector{CartesianIndex}:
 CartesianIndex(94, 39)
 CartesianIndex(95, 33)
 CartesianIndex(92, 30)
 CartesianIndex(99, 36)
 CartesianIndex(87, 31)

The reason we start from a candidate set of points is that some algorithms struggle with full landscapes, and work much better on a sub-sample. There is no hard rule (or even heuristic) for how many points should be generated at the seeding step, so experimentation is a must!

The previous code examples used a version of the seed and refine functions that is very useful if you want to change arguments between steps, or examine the content of the candidate pool of points. In addition to this syntax, both functions have a curried version that allows chaining them together using pipes (|>):

locations =
    U |>
    seed(BalancedAcceptance(; numpoints = 200)) |>
    refine(AdaptiveSpatial(; numpoints = 50)) |>
    first
50-element Vector{CartesianIndex}:
 CartesianIndex(99, 41)
 CartesianIndex(90, 36)
 CartesianIndex(97, 43)
 CartesianIndex(87, 36)
 CartesianIndex(94, 46)
 CartesianIndex(83, 38)
 CartesianIndex(93, 31)
 CartesianIndex(89, 51)
 CartesianIndex(92, 53)
 CartesianIndex(94, 56)
 ⋮
 CartesianIndex(62, 98)
 CartesianIndex(37, 3)
 CartesianIndex(25, 69)
 CartesianIndex(75, 14)
 CartesianIndex(50, 47)
 CartesianIndex(100, 80)
 CartesianIndex(2, 25)
 CartesianIndex(52, 58)
 CartesianIndex(27, 91)

This works because seed and refine have curried versions that can be used directly in a pipeline. Proposed sampling locations can then be overlayed onto the original uncertainty matrix:

plt = heatmap(U)
#scatter!(plt, [x[1] for x in locations], [x[2] for x in locations], ms=2.5, mc=:white, label="")

"},{"location":"vignettes/uniqueness/","title":"Uniqueness.jl","text":""},{"location":"vignettes/uniqueness/#selecting-environmentally-unique-locations","title":"Selecting environmentally unique locations","text":"

For some applications, we want to sample a set of locations that cover a broad range of values in environment space. Another way to rephrase this problem is to say we want to find the set of points with the least covariance in their environmental values.

To do this, we use a BONRefiner called Uniqueness. We'll start by loading the required packages.

using BiodiversityObservationNetworks
using SpeciesDistributionToolkit
using StatsBase
using NeutralLandscapes
using CairoMakie

Consider setting your SDMLAYERS_PATH

When accessing data using SimpleSDMDatasets.jl, it is best to set the SDMLAYERS_PATH environment variable to tell SimpleSDMDatasets.jl where to download data. This can be done by setting ENV["SDMLAYERS_PATH"] = "/home/user/Data/" or similar in the ~/.julia/etc/julia/startup.jl file. (Note this will differ depending on where Julia is installed.)

bbox = (left=-83.0, bottom=46.4, right=-55.2, top=63.7);
temp, precip, elevation =
    convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, AverageTemperature); bbox...)),
    convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, Precipitation); bbox...)),
    convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, Elevation); bbox...));
(SDM response → 105×167 grid with 11478 Float32-valued cells, SDM response → 105×167 grid with 11478 Float32-valued cells, SDM response → 105×167 grid with 11478 Float32-valued cells)

Now we'll use the stack function to combine our three environmental layers into a single, 3-dimensional array, which we'll pass to our Uniqueness refiner.

layers = BiodiversityObservationNetworks.stack([temp, precip, elevation]);

This requires NeutralLandscapes v0.1.2:

uncert = rand(MidpointDisplacement(0.8), size(temp), mask=temp);
heatmap(uncert)

Now we'll get a set of candidate points from a BalancedAcceptance seeder that has no bias toward higher uncertainty values.

candpts, uncert = uncert |> seed(BalancedAcceptance(numpoints=100, α=0.0));

Now we'll refine our 100 candidate points down to the 30 most environmentally unique.

finalpts, uncert = refine(candpts, Uniqueness(;numpoints=30, layers=layers), uncert)
#=
heatmap(uncert)
scatter!([p[2] for p in candpts], [p[1] for p in candpts], fa=0.0, msc=:white, label="Candidate Points")
scatter!([p[2] for p in finalpts], [p[1] for p in finalpts], c=:dodgerblue, msc=:white, label="Selected Points")
=#

"}]} \ No newline at end of file diff --git a/previews/PR65/vignettes/ctvidzn.png b/previews/PR65/vignettes/ctvidzn.png new file mode 100644 index 0000000..407d98b Binary files /dev/null and b/previews/PR65/vignettes/ctvidzn.png differ diff --git a/previews/PR65/vignettes/dtzzbzu.png b/previews/PR65/vignettes/dtzzbzu.png new file mode 100644 index 0000000..f0f72ed Binary files /dev/null and b/previews/PR65/vignettes/dtzzbzu.png differ diff --git a/previews/PR65/vignettes/entropize/index.html b/previews/PR65/vignettes/entropize/index.html index 048514f..dbdce25 100644 --- a/previews/PR65/vignettes/entropize/index.html +++ b/previews/PR65/vignettes/entropize/index.html @@ -409,19 +409,19 @@

Getting the entropy matrix

measurements = rand(MidpointDisplacement(), (200, 200)) .* 100
 heatmap(measurements)
 
-

+

Using the entropize function will convert these values into entropy at the pixel scale:

U = entropize(measurements)
 heatmap(U)
 
-

+

The values closest to the median of the distribution have the highest entropy, and the values closest to its extrema have an entropy of 0. The entropy matrix is guaranteed to have values on the unit interval.

We can use entropize as part of a pipeline, and overlay the points optimized based on entropy on the measurement map:

locations =
     measurements |> entropize |> seed(BalancedAcceptance(; numpoints = 100)) |> first
 heatmap(U)
 
-

+

diff --git a/previews/PR65/vignettes/gbfwvxs.png b/previews/PR65/vignettes/gbfwvxs.png deleted file mode 100644 index 79f334a..0000000 Binary files a/previews/PR65/vignettes/gbfwvxs.png and /dev/null differ diff --git a/previews/PR65/vignettes/kdskdad.png b/previews/PR65/vignettes/kdskdad.png deleted file mode 100644 index 9a21033..0000000 Binary files a/previews/PR65/vignettes/kdskdad.png and /dev/null differ diff --git a/previews/PR65/vignettes/overview/index.html b/previews/PR65/vignettes/overview/index.html index 573dd3f..5ecbe80 100644 --- a/previews/PR65/vignettes/overview/index.html +++ b/previews/PR65/vignettes/overview/index.html @@ -405,22 +405,22 @@

An introduction to B
U = rand(MidpointDisplacement(0.5), (100, 100))
 heatmap(U)
 
-

+

In practice, this uncertainty matrix is likely to be derived from an application of the hyper-parameters optimization step, which is detailed in other vignettes.

The first step of defining a series of locations to sample is to use a BONSeeder, which will generate a number of relatively coarse proposals that cover the entire landscape, and have a balanced distribution in space. We do so using the BalancedAcceptance sampler, which can be tweaked to capture more (or less) uncertainty. To start with, we will extract 200 candidate points, i.e. 200 possible locations which will then be refined.

pack = seed(BalancedAcceptance(; numpoints = 200), U);
 
-
(CartesianIndex[CartesianIndex(81, 10), CartesianIndex(19, 44), CartesianIndex(69, 77), CartesianIndex(44, 21), CartesianIndex(94, 55), CartesianIndex(12, 88), CartesianIndex(62, 33), CartesianIndex(37, 66), CartesianIndex(87, 99), CartesianIndex(25, 4)  …  CartesianIndex(55, 44), CartesianIndex(30, 77), CartesianIndex(80, 22), CartesianIndex(18, 55), CartesianIndex(68, 89), CartesianIndex(43, 33), CartesianIndex(93, 66), CartesianIndex(12, 100), CartesianIndex(62, 1), CartesianIndex(37, 35)], [0.6457023076171001 0.6530873362023615 … 0.4802747697941122 0.495825668828999; 0.6662755323284345 0.6646109378460477 … 0.4476078092501954 0.5244434382647728; … ; 0.20054927953972762 0.19293310548992137 … 0.4423561795343221 0.45279935857264714; 0.2781285320931762 0.2056391355970853 … 0.4695128732448035 0.41903900630023805])
+
(CartesianIndex[CartesianIndex(8, 96), CartesianIndex(58, 11), CartesianIndex(33, 44), CartesianIndex(83, 77), CartesianIndex(20, 22), CartesianIndex(70, 55), CartesianIndex(45, 88), CartesianIndex(95, 33), CartesianIndex(4, 66), CartesianIndex(54, 100)  …  CartesianIndex(38, 8), CartesianIndex(88, 41), CartesianIndex(7, 75), CartesianIndex(57, 19), CartesianIndex(32, 52), CartesianIndex(82, 86), CartesianIndex(20, 30), CartesianIndex(70, 64), CartesianIndex(45, 97), CartesianIndex(95, 2)], [0.4863215941562187 0.5474279971396894 … 0.36580931883696505 0.27899156740470715; 0.4513559228215274 0.5185207622794403 … 0.34948599040746564 0.2581362551882851; … ; 0.09279607503606678 0.07077776915686923 … 0.2161624071116984 0.3094713783786546; 0.08152059396290053 0.0275472787091889 … 0.22023075038227477 0.32201840163660145])
 

The output of a BONSampler (whether at the seeding or refinement step) is always a tuple, storing in the first position a vector of CartesianIndex elements, and in the second position the matrix given as input. We can have a look at the first five points:

first(pack)[1:5]
 
5-element Vector{CartesianIndex}:
- CartesianIndex(81, 10)
- CartesianIndex(19, 44)
- CartesianIndex(69, 77)
- CartesianIndex(44, 21)
- CartesianIndex(94, 55)
+ CartesianIndex(8, 96)
+ CartesianIndex(58, 11)
+ CartesianIndex(33, 44)
+ CartesianIndex(83, 77)
+ CartesianIndex(20, 22)
 

Although returning the input matrix may seem redundant, it actually allows to chain samplers together to build pipelines that take a matrix as input, and return a set of places to sample as outputs; an example is given below.

The positions of locations to sample are given as a vector of CartesianIndex, which are coordinates in the uncertainty matrix. Once we have generated a candidate proposal, we can further refine it using a BONRefiner – in this case, AdaptiveSpatial, which performs adaptive spatial sampling (maximizing the distribution of entropy while minimizing spatial auto-correlation).

@@ -429,11 +429,11 @@

An introduction to B locations[1:5]

5-element Vector{CartesianIndex}:
- CartesianIndex(10, 1)
- CartesianIndex(3, 44)
- CartesianIndex(1, 49)
- CartesianIndex(4, 50)
- CartesianIndex(8, 54)
+ CartesianIndex(94, 39)
+ CartesianIndex(95, 33)
+ CartesianIndex(92, 30)
+ CartesianIndex(99, 36)
+ CartesianIndex(87, 31)
 

The reason we start from a candidate set of points is that some algorithms struggle with full landscapes, and work much better with a sub-sample of them. There is no hard rule (or no heuristic) to get a sense for how many points should be generated at the seeding step, and so experimentation is a must!

The previous code examples used a version of the seed and refine functions that is very useful if you want to change arguments between steps, or examine the content of the candidate pool of points. In addition to this syntax, both functions have a curried version that allows chaining them together using pipes (|>):

@@ -444,32 +444,32 @@

An introduction to B first

50-element Vector{CartesianIndex}:
- CartesianIndex(1, 60)
- CartesianIndex(25, 50)
- CartesianIndex(4, 62)
- CartesianIndex(28, 47)
- CartesianIndex(27, 43)
- CartesianIndex(32, 49)
- CartesianIndex(34, 53)
- CartesianIndex(8, 65)
- CartesianIndex(39, 52)
- CartesianIndex(41, 54)
+ CartesianIndex(99, 41)
+ CartesianIndex(90, 36)
+ CartesianIndex(97, 43)
+ CartesianIndex(87, 36)
+ CartesianIndex(94, 46)
+ CartesianIndex(83, 38)
+ CartesianIndex(93, 31)
+ CartesianIndex(89, 51)
+ CartesianIndex(92, 53)
+ CartesianIndex(94, 56)
  ⋮
- CartesianIndex(67, 9)
- CartesianIndex(42, 42)
- CartesianIndex(92, 76)
- CartesianIndex(11, 20)
- CartesianIndex(61, 53)
- CartesianIndex(36, 87)
- CartesianIndex(86, 31)
- CartesianIndex(23, 65)
- CartesianIndex(73, 98)
+ CartesianIndex(62, 98)
+ CartesianIndex(37, 3)
+ CartesianIndex(25, 69)
+ CartesianIndex(75, 14)
+ CartesianIndex(50, 47)
+ CartesianIndex(100, 80)
+ CartesianIndex(2, 25)
+ CartesianIndex(52, 58)
+ CartesianIndex(27, 91)
 

This works because seed and refine have curried versions that can be used directly in a pipeline. Proposed sampling locations can then be overlayed onto the original uncertainty matrix:

plt = heatmap(U)
 #scatter!(plt, [x[1] for x in locations], [x[2] for x in locations], ms=2.5, mc=:white, label="")
 
-

+

diff --git a/previews/PR65/vignettes/pnqdobu.png b/previews/PR65/vignettes/pnqdobu.png new file mode 100644 index 0000000..47ff863 Binary files /dev/null and b/previews/PR65/vignettes/pnqdobu.png differ diff --git a/previews/PR65/vignettes/roejlsl.png b/previews/PR65/vignettes/roejlsl.png deleted file mode 100644 index c1f2323..0000000 Binary files a/previews/PR65/vignettes/roejlsl.png and /dev/null differ diff --git a/previews/PR65/vignettes/sbnxsfm.png b/previews/PR65/vignettes/sbnxsfm.png new file mode 100644 index 0000000..6e99c83 Binary files /dev/null and b/previews/PR65/vignettes/sbnxsfm.png differ diff --git a/previews/PR65/vignettes/uniqueness/index.html b/previews/PR65/vignettes/uniqueness/index.html index da3b6e7..3c9a576 100644 --- a/previews/PR65/vignettes/uniqueness/index.html +++ b/previews/PR65/vignettes/uniqueness/index.html @@ -494,7 +494,7 @@

Selecting environmentally un uncert = rand(MidpointDisplacement(0.8), size(temp), mask=temp); heatmap(uncert)

-

+

Now we'll get a set of candidate points from a BalancedAcceptance seeder that has no bias toward higher uncertainty values.

```@example 1 candpts, uncert = uncert |> seed(BalancedAcceptance(numpoints=100, α=0.0)); diff --git a/previews/PR65/vignettes/wlpwwqe.png b/previews/PR65/vignettes/wlpwwqe.png new file mode 100644 index 0000000..6e99c83 Binary files /dev/null and b/previews/PR65/vignettes/wlpwwqe.png differ diff --git a/previews/PR65/vignettes/yvyrlqc.png b/previews/PR65/vignettes/yvyrlqc.png deleted file mode 100644 index 0c9c80e..0000000 Binary files a/previews/PR65/vignettes/yvyrlqc.png and /dev/null differ diff --git a/previews/PR65/vignettes/zhcpfbl.png b/previews/PR65/vignettes/zhcpfbl.png deleted file mode 100644 index 79f334a..0000000 Binary files a/previews/PR65/vignettes/zhcpfbl.png and /dev/null differ diff --git a/previews/PR65/vignettes/zjzmvqq.png b/previews/PR65/vignettes/zjzmvqq.png new file mode 100644 index 0000000..f0f72ed Binary files /dev/null and b/previews/PR65/vignettes/zjzmvqq.png differ diff --git a/previews/PR65/vignettes/zpglfmw.png b/previews/PR65/vignettes/zpglfmw.png deleted file mode 100644 index c1f2323..0000000 Binary files a/previews/PR65/vignettes/zpglfmw.png and /dev/null differ