diff --git a/previews/PR65/index.html b/previews/PR65/index.html index a276d1c..e924020 100644 --- a/previews/PR65/index.html +++ b/previews/PR65/index.html @@ -533,19 +533,19 @@
BONSampler
A union of the abstract types BONSeeder
and BONRefiner
. Both types return a tuple with the coordinates as a vector of CartesianIndex
, and the weight matrix as a Matrix
of AbstractFloat
, in that order.
#
BiodiversityObservationNetworks.BONSeeder
— Type.
abstract type BONSeeder end
A BONSeeder
is an algorithm for proposing sampling locations using a raster of weights, represented as a matrix, in each cell.
#
BiodiversityObservationNetworks.BONRefiner
— Type.
abstract type BONRefiner end
A BONRefiner
is an algorithm for proposing sampling locations by refining a set of candidate points to a smaller set of 'best' points.
seed(sampler::ST, uncertainty::Matrix{T})
Produces a set of candidate sampling locations in a vector coords
of length numpoints from a raster uncertainty
using sampler
, where sampler
is a BONSeeder
.
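As an illustration, a minimal usage sketch (hypothetical sizes; assumes the package and its exports are available):

```julia
using BiodiversityObservationNetworks

# A stand-in uncertainty raster; in practice this comes from a model prediction.
U = rand(100, 100)

# `seed` returns a (coordinates, matrix) tuple, per the BONSampler contract above.
coords, weights = seed(BalancedAcceptance(; numpoints = 50), U)
```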
seed!(coords::Vector{CartesianIndex}, sampler::ST)
The curried version of seed!
, which returns a function that acts on the input uncertainty layer passed to the curried function (u
below).
#
BiodiversityObservationNetworks.seed!
— Function.
seed!(coords::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})
@@ -567,35 +567,35 @@ Seeder and refiner functions
- Seeders work on rasters; refiners work on sets of coordinates.
-
+
seed!(coords::Vector{CartesianIndex}, sampler::ST)
The curried version of seed!
, which returns a function that acts on the input uncertainty layer passed to the curried function (u
below).
-
+
#
BiodiversityObservationNetworks.refine
— Function.
refine(pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})
Refines a set of candidate sampling locations and returns a vector coords
of length numpoints from a vector of coordinates pool
using sampler
, where sampler
is a BONRefiner
.
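A sketch of the non-curried call, mirroring the overview vignette (hypothetical sizes; assumes the package is available):

```julia
using BiodiversityObservationNetworks

U = rand(100, 100)  # stand-in uncertainty raster

# First seed a coarse pool of candidates, then refine it to the 50 'best' points.
pool, _ = seed(BalancedAcceptance(; numpoints = 200), U)
coords, _ = refine(pool, AdaptiveSpatial(; numpoints = 50), U)
```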
-
+
refine(sampler::BONRefiner)
Returns a curried function of refine
with two methods: both are using the output of seed
, one in its packed form, the other in its splatted form.
-
+
refine(pack, sampler::BONRefiner)
Calls refine
on the appropriately splatted version of pack
.
-
+
#
BiodiversityObservationNetworks.refine!
— Function.
refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})
Refines a set of candidate sampling locations in the preallocated vector coords
from a vector of coordinates pool
using sampler
, where sampler
is a BONRefiner
.
-
+
refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})
The curried version of refine!
, which returns a function that acts on the input coordinate pool passed to the curried function (p
below).
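Taken together, the curried forms let the seeding and refinement steps be chained with pipes, as in the overview vignette:

```julia
using BiodiversityObservationNetworks

U = rand(100, 100)  # stand-in uncertainty raster

# Each curried sampler consumes the (coords, matrix) tuple of the previous step;
# `first` extracts the final vector of CartesianIndex locations.
locations =
    U |>
    seed(BalancedAcceptance(; numpoints = 200)) |>
    refine(AdaptiveSpatial(; numpoints = 50)) |>
    first
```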
-
+
Seeder algorithms
@@ -604,7 +604,7 @@ Seeder algorithms
BalancedAcceptance
A BONSeeder
that uses Balanced-Acceptance Sampling (van Dam-Bates et al. 2017 https://doi.org/10.1111/2041-210X.13003)
-
+
Refiner algorithms
@@ -615,13 +615,13 @@ Refiner algorithms
...
numpoints, an Integer (def. 50), specifying the number of points to use.
α, an AbstractFloat (def. 1.0), specifying ...
-
+
#
BiodiversityObservationNetworks.Uniqueness
— Type.
Uniqueness
A BONRefiner
-
+
Helper functions
@@ -641,20 +641,20 @@ Helper functions
For each location, the value of the condensed layer tᵢⱼ(x), corresponding to target x, at coordinate (i,j) is given by the dot product of v⃗ᵢⱼ and the x-th column of W.
(2): Apply a weighted average across each target layer. To produce the final output layer, we apply a weighted average to each target layer, where the weights are provided in the vector α⃗ of length m
.
The final value of the squished layer at (i,j) is given by s⃗ᵢⱼ = ∑ₓ αₓ*tᵢⱼ(x), where tᵢⱼ(x) is the value of the x-th target layer at (i,j).
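The two steps above can be sketched as follows (a minimal illustration of the math, not the package's implementation; `layers` is assumed to be an x×y×n array):

```julia
# Minimal sketch of the two-step squish described above (illustrative only).
# `layers` is x×y×n; W is n×m with columns summing to 1.0; α has length m.
function squish_sketch(layers, W, α)
    n, m = size(W)
    # (1) condense the n input layers into m target layers via the columns of W
    targets = [sum(W[k, x] .* layers[:, :, k] for k in 1:n) for x in 1:m]
    # (2) weighted average of the target layers, with weights α
    return sum(α[x] .* targets[x] for x in 1:m)
end
```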
-
+
#
BiodiversityObservationNetworks.entropize!
— Function.
entropize!(U::Matrix{AbstractFloat}, A::Matrix{Number})
This function turns a matrix A
(storing measurement values) into pixel-wise entropy values, stored in a matrix U
(that is previously allocated).
Pixel-wise entropy is determined by measuring the empirical probability of randomly picking a value in the matrix that is either lower or higher than the pixel value. The entropies of both these probabilities are calculated using the -p×log(2,p) formula. The entropy of the pixel is the sum of the two entropies, so that it is close to 1 for values close to the median, and close to 0 for values close to the extremes of the distribution.
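The procedure reads as follows in code (an illustrative re-implementation of the definition above, not the package's source; the final rescaling onto the unit interval is an assumption based on the entropize vignette):

```julia
# Pixel-wise entropy sketch: for each pixel, take the empirical probabilities of
# drawing a strictly lower / strictly higher value from the matrix, and sum the
# -p*log2(p) entropies of the two probabilities.
function entropize_sketch(A::AbstractMatrix{<:Real})
    U = zeros(Float64, size(A))
    n = length(A)
    h(p) = p > 0 ? -p * log2(p) : 0.0
    for i in eachindex(A)
        p_lo = count(<(A[i]), A) / n  # empirical P(value < pixel)
        p_hi = count(>(A[i]), A) / n  # empirical P(value > pixel)
        U[i] = h(p_lo) + h(p_hi)      # largest near the median, smallest at the extremes
    end
    return U ./ maximum(U)            # rescale onto the unit interval (assumed)
end
```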
-
+
#
BiodiversityObservationNetworks.entropize
— Function.
entropize(A::Matrix{Number})
Allocation version of entropize!
.
-
+
diff --git a/previews/PR65/search/search_index.json b/previews/PR65/search/search_index.json
index df6fce4..263b606 100644
--- a/previews/PR65/search/search_index.json
+++ b/previews/PR65/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":""},{"location":"#biodiversityobservationnetworksjl","title":"BiodiversityObservationNetworks.jl","text":"The purpose of this package is to provide a high-level, extensible, modular interface to the selection of sampling points for biodiversity processes in space. It is based around a collection of types representing point selection algorithms, used to select the most informative sampling points based on raster data. Specifically, many algorithms work from a layer indicating entropy of a model-based prediction at each location.
This package is in development
The BiodiversityObservationNetworks.jl
package is currently under development. The API is not expected to change a lot, but it may change in order to facilitate the integration of new features.
"},{"location":"#high-level-types","title":"High-level types","text":"# BiodiversityObservationNetworks.BONSampler
\u2014 Type.
BONSampler\n
A union of the abstract types BONSeeder
and BONRefiner
. Both types return a tuple with the coordinates as a vector of CartesianIndex
, and the weight matrix as a Matrix
of AbstractFloat
, in that order.
source
# BiodiversityObservationNetworks.BONSeeder
\u2014 Type.
abstract type BONSeeder end\n
A BONSeeder
is an algorithm for proposing sampling locations using a raster of weights, represented as a matrix, in each cell.
source
# BiodiversityObservationNetworks.BONRefiner
\u2014 Type.
abstract type BONRefiner end\n
A BONRefiner
is an algorithm for proposing sampling locations by refining a set of candidate points to a smaller set of 'best' points.
source
"},{"location":"#seeder-and-refiner-functions","title":"Seeder and refiner functions","text":"# BiodiversityObservationNetworks.seed
\u2014 Function.
seed(sampler::ST, uncertainty::Matrix{T})\n
Produces a set of candidate sampling locations in a vector coords
of length numpoints from a raster uncertainty
using sampler
, where sampler
is a BONSeeder
.
source
seed!(coords::Vector{CartesianIndex}, sampler::ST)\n
The curried version of seed!
, which returns a function that acts on the input uncertainty layer passed to the curried function (u
below).
source
# BiodiversityObservationNetworks.seed!
\u2014 Function.
seed!(coords::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n
Puts a set of candidate sampling locations in the preallocated vector coords
from a raster uncertainty
using sampler
, where sampler
is a BONSeeder
.
- Seeders work on rasters; refiners work on sets of coordinates.
source
seed!(coords::Vector{CartesianIndex}, sampler::ST)\n
The curried version of seed!
, which returns a function that acts on the input uncertainty layer passed to the curried function (u
below).
source
# BiodiversityObservationNetworks.refine
\u2014 Function.
refine(pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n
Refines a set of candidate sampling locations and returns a vector coords
of length numpoints from a vector of coordinates pool
using sampler
, where sampler
is a BONRefiner
.
source
refine(sampler::BONRefiner)\n
Returns a curried function of refine
with two methods: both are using the output of seed
, one in its packed form, the other in its splatted form.
source
refine(pack, sampler::BONRefiner)\n
Calls refine
on the appropriately splatted version of pack
.
source
# BiodiversityObservationNetworks.refine!
\u2014 Function.
refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n
Refines a set of candidate sampling locations in the preallocated vector coords
from a vector of coordinates pool
using sampler
, where sampler
is a BONRefiner
.
source
refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n
The curried version of refine!
, which returns a function that acts on the input coordinate pool passed to the curried function (p
below).
source
"},{"location":"#seeder-algorithms","title":"Seeder algorithms","text":"# BiodiversityObservationNetworks.BalancedAcceptance
\u2014 Type.
BalancedAcceptance\n
A BONSeeder
that uses Balanced-Acceptance Sampling (van Dam-Bates et al. 2017 https://doi.org/10.1111/2041-210X.13003)
source
"},{"location":"#refiner-algorithms","title":"Refiner algorithms","text":"# BiodiversityObservationNetworks.AdaptiveSpatial
\u2014 Type.
AdaptiveSpatial\n
...
numpoints, an Integer (def. 50), specifying the number of points to use.
\u03b1, an AbstractFloat (def. 1.0), specifying ...
source
# BiodiversityObservationNetworks.Uniqueness
\u2014 Type.
Uniqueness\n
A BONRefiner
source
"},{"location":"#helper-functions","title":"Helper functions","text":"# BiodiversityObservationNetworks.squish
\u2014 Function.
squish(layers, W, \u03b1)\n
Takes a set of n
layers and squishes them down to a single layer.
numcolumns = size(W,2)\n for i in 1:numcolumns\n W[:,i] ./= sum(W[:,i])\n end\n
For a coordinate in the raster (i,j), denote the vector of values across all locations at that coordinate v\u20d7\u1d62\u2c7c. The value at that coordinate in squished layer, s\u20d7\u1d62\u2c7c, is computed in two steps.
(1): First we apply a weights matrix, W, with n rows and m columns (m < n), to reduce the initial n layers down to a set of m layers, each of which corresponds to a particular target of optimization. For example, we may want to propose sampling locations that are optimized to best balance multiple criteria, like (a) the current distribution of a species and (b) whether that distribution is changing over time.
Each entry in the weights matrix W
corresponds to the 'importance' of the layer in the corresponding row to the successful measurement of the target of the corresponding column. As such, each column of W
must sum to 1.0.
For each location, the value of the condensed layer t\u1d62\u2c7c(x), corresponding to target x, at coordinate (i,j) is given by the dot product of v\u20d7\u1d62\u2c7c and the x-th column of W.
(2): Apply a weighted average across each target layer. To produce the final output layer, we apply a weighted average to each target layer, where the weights are provided in the vector \u03b1\u20d7 of length m
.
The final value of the squished layer at (i,j) is given by s\u20d7\u1d62\u2c7c = \u2211\u2093 \u03b1\u2093*t\u1d62\u2c7c(x), where t\u1d62\u2c7c(x) is the value of the x-th target layer at (i,j).
source
# BiodiversityObservationNetworks.entropize!
\u2014 Function.
entropize!(U::Matrix{AbstractFloat}, A::Matrix{Number})\n
This function turns a matrix A
(storing measurement values) into pixel-wise entropy values, stored in a matrix U
(that is previously allocated).
Pixel-wise entropy is determined by measuring the empirical probability of randomly picking a value in the matrix that is either lower or higher than the pixel value. The entropies of both these probabilities are calculated using the -p\u00d7log(2,p) formula. The entropy of the pixel is the sum of the two entropies, so that it is close to 1 for values close to the median, and close to 0 for values close to the extremes of the distribution.
source
# BiodiversityObservationNetworks.entropize
\u2014 Function.
entropize(A::Matrix{Number})\n
Allocation version of entropize!
.
source
"},{"location":"bibliography/","title":"Bibliography","text":""},{"location":"bibliography/#references","title":"References","text":""},{"location":"vignettes/entropize/","title":"Entropize","text":""},{"location":"vignettes/entropize/#getting-the-entropy-matrix","title":"Getting the entropy matrix","text":"For some applications, we want to place points to capture the maximum amount of information, which is to say that we want to sample a balance of entropy values, as opposed to absolute values. In this vignette, we will walk through an example using the entropize
function to convert raw data to entropy values.
using BiodiversityObservationNetworks\nusing NeutralLandscapes\nusing CairoMakie\n
Entropy is problem-specific
The solution presented in this vignette is a least-assumption solution based on the empirical values given in a matrix of measurements. In a lot of situations, this is not the entropy that you want. For example, if your pixels are storing probabilities of Bernoulli events, you can directly use the entropy of the events in the entropy matrix.
We start by generating a random matrix of measurements:
measurements = rand(MidpointDisplacement(), (200, 200)) .* 100\nheatmap(measurements)\n
Using the entropize
function will convert these values into entropy at the pixel scale:
U = entropize(measurements)\nheatmap(U)\n
The values closest to the median of the distribution have the highest entropy, and the values closest to its extrema have an entropy of 0. The entropy matrix is guaranteed to have values on the unit interval.
We can use entropize
as part of a pipeline, and overlay the points optimized based on entropy on the measurement map:
locations =\n measurements |> entropize |> seed(BalancedAcceptance(; numpoints = 100)) |> first\nheatmap(U)\n
"},{"location":"vignettes/overview/","title":"Overview","text":""},{"location":"vignettes/overview/#an-introduction-to-biodiversityobservationnetworks","title":"An introduction to BiodiversityObservationNetworks","text":"In this vignette, we will walk through the basic functionalities of the package, by generating a random uncertainty matrix, and then using a seeder and a refiner to decide which locations should be sampled in order to gain more insights about the process generating this entropy.
using BiodiversityObservationNetworks\nusing NeutralLandscapes\nusing CairoMakie\n
In order to simplify the process, we will use the NeutralLandscapes package to generate a 100\u00d7100 pixels landscape, where each cell represents the entropy (or information content) in a unit we can sample:
U = rand(MidpointDisplacement(0.5), (100, 100))\nheatmap(U)\n
In practice, this uncertainty matrix is likely to be derived from an application of the hyper-parameters optimization step, which is detailed in other vignettes.
The first step of defining a series of locations to sample is to use a BONSeeder
, which will generate a number of relatively coarse proposals that cover the entire landscape, and have a balanced distribution in space. We do so using the BalancedAcceptance
sampler, which can be tweaked to capture more (or less) uncertainty. To start with, we will extract 200 candidate points, i.e. 200 possible locations which will then be refined.
pack = seed(BalancedAcceptance(; numpoints = 200), U);\n
(CartesianIndex[CartesianIndex(81, 10), CartesianIndex(19, 44), CartesianIndex(69, 77), CartesianIndex(44, 21), CartesianIndex(94, 55), CartesianIndex(12, 88), CartesianIndex(62, 33), CartesianIndex(37, 66), CartesianIndex(87, 99), CartesianIndex(25, 4) \u2026 CartesianIndex(55, 44), CartesianIndex(30, 77), CartesianIndex(80, 22), CartesianIndex(18, 55), CartesianIndex(68, 89), CartesianIndex(43, 33), CartesianIndex(93, 66), CartesianIndex(12, 100), CartesianIndex(62, 1), CartesianIndex(37, 35)], [0.6457023076171001 0.6530873362023615 \u2026 0.4802747697941122 0.495825668828999; 0.6662755323284345 0.6646109378460477 \u2026 0.4476078092501954 0.5244434382647728; \u2026 ; 0.20054927953972762 0.19293310548992137 \u2026 0.4423561795343221 0.45279935857264714; 0.2781285320931762 0.2056391355970853 \u2026 0.4695128732448035 0.41903900630023805])\n
The output of a BONSampler
(whether at the seeding or refinement step) is always a tuple, storing in the first position a vector of CartesianIndex
elements, and in the second position the matrix given as input. We can have a look at the first five points:
first(pack)[1:5]\n
5-element Vector{CartesianIndex}:\n CartesianIndex(81, 10)\n CartesianIndex(19, 44)\n CartesianIndex(69, 77)\n CartesianIndex(44, 21)\n CartesianIndex(94, 55)\n
Although returning the input matrix may seem redundant, it allows chaining samplers together to build pipelines that take a matrix as input and return a set of places to sample as output; an example is given below.
The positions of locations to sample are given as a vector of CartesianIndex
, which are coordinates in the uncertainty matrix. Once we have generated a candidate proposal, we can further refine it using a BONRefiner
\u2013 in this case, AdaptiveSpatial
, which performs adaptive spatial sampling (maximizing the distribution of entropy while minimizing spatial auto-correlation).
candidates, uncertainty = pack\nlocations, _ = refine(candidates, AdaptiveSpatial(; numpoints = 50), uncertainty)\nlocations[1:5]\n
5-element Vector{CartesianIndex}:\n CartesianIndex(10, 1)\n CartesianIndex(3, 44)\n CartesianIndex(1, 49)\n CartesianIndex(4, 50)\n CartesianIndex(8, 54)\n
The reason we start from a candidate set of points is that some algorithms struggle with full landscapes, and work much better with a sub-sample of them. There is no hard rule (nor even a heuristic) for how many points should be generated at the seeding step, so experimentation is a must!
The previous code examples used a version of the seed
and refine
functions that is very useful if you want to change arguments between steps, or examine the content of the candidate pool of points. In addition to this syntax, both functions have a curried version that allows chaining them together using pipes (|>
):
locations =\n U |>\n seed(BalancedAcceptance(; numpoints = 200)) |>\n refine(AdaptiveSpatial(; numpoints = 50)) |>\n first\n
50-element Vector{CartesianIndex}:\n CartesianIndex(1, 60)\n CartesianIndex(25, 50)\n CartesianIndex(4, 62)\n CartesianIndex(28, 47)\n CartesianIndex(27, 43)\n CartesianIndex(32, 49)\n CartesianIndex(34, 53)\n CartesianIndex(8, 65)\n CartesianIndex(39, 52)\n CartesianIndex(41, 54)\n \u22ee\n CartesianIndex(67, 9)\n CartesianIndex(42, 42)\n CartesianIndex(92, 76)\n CartesianIndex(11, 20)\n CartesianIndex(61, 53)\n CartesianIndex(36, 87)\n CartesianIndex(86, 31)\n CartesianIndex(23, 65)\n CartesianIndex(73, 98)\n
This works because seed
and refine
have curried versions that can be used directly in a pipeline. Proposed sampling locations can then be overlayed onto the original uncertainty matrix:
plt = heatmap(U)\n#scatter!(plt, [x[1] for x in locations], [x[2] for x in locations], ms=2.5, mc=:white, label=\"\")\n
"},{"location":"vignettes/uniqueness/","title":"Uniqueness.jl","text":""},{"location":"vignettes/uniqueness/#selecting-environmentally-unique-locations","title":"Selecting environmentally unique locations","text":"For some applications, we want to sample a set of locations that cover a broad range of values in environment space. Another way to rephrase this problem is to say we want to find the set of points with the least covariance in their environmental values.
To do this, we use a BONRefiner
called Uniqueness
. We'll start by loading the required packages.
using BiodiversityObservationNetworks\nusing SpeciesDistributionToolkit\nusing StatsBase\nusing NeutralLandscapes\nusing CairoMakie\n
Consider setting your SDMLAYERS_PATH
When accessing data using SimpleSDMDatasets.jl
, it is best to set the SDMLAYERS_PATH
environment variable to tell SimpleSDMDatasets.jl
where to download data. This can be done by setting ENV[\"SDMLAYERS_PATH\"] = \"/home/user/Data/\"
or similar in the ~/.julia/etc/julia/startup.jl
file. (Note this will be different depending on where julia
is installed.)
bbox = (left=-83.0, bottom=46.4, right=-55.2, top=63.7);\ntemp, precip, elevation =\n convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, AverageTemperature); bbox...)),\n convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, Precipitation); bbox...)),\n convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, Elevation); bbox...));\n
(SDM response \u2192 105\u00d7167 grid with 11478 Float32-valued cells, SDM response \u2192 105\u00d7167 grid with 11478 Float32-valued cells, SDM response \u2192 105\u00d7167 grid with 11478 Float32-valued cells)\n
Now we'll use the stack
function to combine our four environmental layers into a single, 3-dimensional array, which we'll pass to our Uniqueness
refiner.
layers = BiodiversityObservationNetworks.stack([temp,precip,elevation]);\n
(This requires NeutralLandscapes v0.1.2.)\nuncert = rand(MidpointDisplacement(0.8), size(temp), mask=temp);\nheatmap(uncert)\n
Now we'll get a set of candidate points from a BalancedAcceptance seeder that has no bias toward higher uncertainty values.
candpts, uncert = uncert |> seed(BalancedAcceptance(numpoints=100, \u03b1=0.0));\n
Now we'll refine our 100 candidate points down to the 30 most environmentally unique.\nfinalpts, uncert = refine(candpts, Uniqueness(;numpoints=30, layers=layers), uncert)\n#=\nheatmap(uncert)\nscatter!([p[2] for p in candpts], [p[1] for p in candpts], fa=0.0, msc=:white, label=\"Candidate Points\")\nscatter!([p[2] for p in finalpts], [p[1] for p in finalpts], c=:dodgerblue, msc=:white, label=\"Selected Points\")=#\n
"}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Home","text":""},{"location":"#biodiversityobservationnetworksjl","title":"BiodiversityObservationNetworks.jl","text":"The purpose of this package is to provide a high-level, extensible, modular interface to the selection of sampling points for biodiversity processes in space. It is based around a collection of types representing point selection algorithms, used to select the most informative sampling points based on raster data. Specifically, many algorithms work from a layer indicating entropy of a model-based prediction at each location.
This package is in development
The BiodiversityObservationNetworks.jl
package is currently under development. The API is not expected to change a lot, but it may change in order to facilitate the integration of new features.
"},{"location":"#high-level-types","title":"High-level types","text":"# BiodiversityObservationNetworks.BONSampler
\u2014 Type.
BONSampler\n
A union of the abstract types BONSeeder
and BONRefiner
. Both types return a tuple with the coordinates as a vector of CartesianIndex
, and the weight matrix as a Matrix
of AbstractFloat
, in that order.
source
# BiodiversityObservationNetworks.BONSeeder
\u2014 Type.
abstract type BONSeeder end\n
A BONSeeder
is an algorithm for proposing sampling locations using a raster of weights, represented as a matrix, in each cell.
source
# BiodiversityObservationNetworks.BONRefiner
\u2014 Type.
abstract type BONRefiner end\n
A BONRefiner
is an algorithm for proposing sampling locations by refining a set of candidate points to a smaller set of 'best' points.
source
"},{"location":"#seeder-and-refiner-functions","title":"Seeder and refiner functions","text":"# BiodiversityObservationNetworks.seed
\u2014 Function.
seed(sampler::ST, uncertainty::Matrix{T})\n
Produces a set of candidate sampling locations in a vector coords
of length numpoints from a raster uncertainty
using sampler
, where sampler
is a BONSeeder
.
source
seed!(coords::Vector{CartesianIndex}, sampler::ST)\n
The curried version of seed!
, which returns a function that acts on the input uncertainty layer passed to the curried function (u
below).
source
# BiodiversityObservationNetworks.seed!
\u2014 Function.
seed!(coords::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n
Puts a set of candidate sampling locations in the preallocated vector coords
from a raster uncertainty
using sampler
, where sampler
is a BONSeeder
.
- Seeders work on rasters; refiners work on sets of coordinates.
source
seed!(coords::Vector{CartesianIndex}, sampler::ST)\n
The curried version of seed!
, which returns a function that acts on the input uncertainty layer passed to the curried function (u
below).
source
# BiodiversityObservationNetworks.refine
\u2014 Function.
refine(pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n
Refines a set of candidate sampling locations and returns a vector coords
of length numpoints from a vector of coordinates pool
using sampler
, where sampler
is a BONRefiner
.
source
refine(sampler::BONRefiner)\n
Returns a curried function of refine
with two methods: both are using the output of seed
, one in its packed form, the other in its splatted form.
source
refine(pack, sampler::BONRefiner)\n
Calls refine
on the appropriately splatted version of pack
.
source
# BiodiversityObservationNetworks.refine!
\u2014 Function.
refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n
Refines a set of candidate sampling locations in the preallocated vector coords
from a vector of coordinates pool
using sampler
, where sampler
is a BONRefiner
.
source
refine!(coords::Vector{CartesianIndex}, pool::Vector{CartesianIndex}, sampler::ST, uncertainty::Matrix{T})\n
The curried version of refine!
, which returns a function that acts on the input coordinate pool passed to the curried function (p
below).
source
"},{"location":"#seeder-algorithms","title":"Seeder algorithms","text":"# BiodiversityObservationNetworks.BalancedAcceptance
\u2014 Type.
BalancedAcceptance\n
A BONSeeder
that uses Balanced-Acceptance Sampling (van Dam-Bates et al. 2017 https://doi.org/10.1111/2041-210X.13003)
source
"},{"location":"#refiner-algorithms","title":"Refiner algorithms","text":"# BiodiversityObservationNetworks.AdaptiveSpatial
\u2014 Type.
AdaptiveSpatial\n
...
numpoints, an Integer (def. 50), specifying the number of points to use.
\u03b1, an AbstractFloat (def. 1.0), specifying ...
source
# BiodiversityObservationNetworks.Uniqueness
\u2014 Type.
Uniqueness\n
A BONRefiner
source
"},{"location":"#helper-functions","title":"Helper functions","text":"# BiodiversityObservationNetworks.squish
\u2014 Function.
squish(layers, W, \u03b1)\n
Takes a set of n
layers and squishes them down to a single layer.
numcolumns = size(W,2)\n for i in 1:numcolumns\n W[:,i] ./= sum(W[:,i])\n end\n
For a coordinate in the raster (i,j), denote the vector of values across all locations at that coordinate v\u20d7\u1d62\u2c7c. The value at that coordinate in squished layer, s\u20d7\u1d62\u2c7c, is computed in two steps.
(1): First we apply a weights matrix, W, with n rows and m columns (m < n), to reduce the initial n layers down to a set of m layers, each of which corresponds to a particular target of optimization. For example, we may want to propose sampling locations that are optimized to best balance multiple criteria, like (a) the current distribution of a species and (b) whether that distribution is changing over time.
Each entry in the weights matrix W
corresponds to the 'importance' of the layer in the corresponding row to the successful measurement of the target of the corresponding column. As such, each column of W
must sum to 1.0.
For each target x, the value of the condensed layer t\u1d62\u2c7c(x) at coordinate (i,j) is given by the dot product of v\u20d7\u1d62\u2c7c and the x
-th column of W
.
(2): Apply a weighted average across each target layer. To produce the final output layer, we apply a weighted average to each target layer, where the weights are provided in the vector \u03b1\u20d7 of length m
.
The final value of the squished layer at (i,j) is given by s\u20d7\u1d62\u2c7c = \u2211\u2093 \u03b1\u2093*t\u1d62\u2c7c(x), where t\u1d62\u2c7c(x) is the value of the x-th target layer at (i,j).
source
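The two steps above can be sketched directly; `squish_sketch` is a hypothetical name for illustration, assuming `layers` is a vector of n matrices, W is an n×m weights matrix whose columns sum to one, and α is a length-m weight vector.

```julia
# Sketch of the two-step squish: (1) project n layers onto m target
# layers via the columns of W, (2) take the α-weighted average of the
# m target layers.
function squish_sketch(layers::Vector{Matrix{Float64}}, W::Matrix{Float64}, α::Vector{Float64})
    n, m = size(W)
    @assert length(layers) == n && length(α) == m
    # Step 1: per cell, the dot product of the n layer values with column x of W.
    targets = [sum(W[k, x] .* layers[k] for k in 1:n) for x in 1:m]
    # Step 2: weighted average across the m target layers.
    return sum(α[x] .* targets[x] for x in 1:m)
end

layers = [rand(4, 4) for _ in 1:3]
W = [0.5 0.2; 0.3 0.3; 0.2 0.5]   # each column sums to 1.0
α = [0.6, 0.4]
S = squish_sketch(layers, W, α)
```

Because each column of W and the vector α are convex weights, the squished layer stays within the range of the input layers.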
# BiodiversityObservationNetworks.entropize!
\u2014 Function.
entropize!(U::Matrix{AbstractFloat}, A::Matrix{Number})\n
This function turns a matrix A
(storing measurement values) into pixel-wise entropy values, stored in a matrix U
(that is previously allocated).
Pixel-wise entropy is determined by measuring the empirical probability of randomly picking a value in the matrix that is either lower or higher than the pixel value. The entropy of each of these probabilities is calculated using the -p×log(2,p) formula. The entropy of the pixel is the sum of the two entropies, so that it is close to 1 for values close to the median, and close to 0 for values close to the extremes of the distribution.
source
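The empirical procedure described above can be sketched in a few lines; `entropize_sketch` is a hypothetical name for illustration, not the package's implementation.

```julia
# Entropy of a single probability, with 0 log 0 taken as 0.
h(p) = p > 0 ? -p * log2(p) : 0.0

# Sketch of pixel-wise empirical entropy: for each cell, take the
# fraction of values below and above it, sum the two entropies, then
# rescale so the result lies on the unit interval.
function entropize_sketch(A::AbstractMatrix{<:Real})
    N = length(A)
    E = [h(count(<(a), A) / N) + h(count(>(a), A) / N) for a in A]
    return reshape(E, size(A)) ./ maximum(E)
end

A = rand(20, 20)
U = entropize_sketch(A)
```

Values near the median of A get entropies near 1, while the extrema get entropies near 0, matching the behavior described above.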
# BiodiversityObservationNetworks.entropize
\u2014 Function.
entropize(A::Matrix{Number})\n
Allocation version of entropize!
.
source
"},{"location":"bibliography/","title":"Bibliography","text":""},{"location":"bibliography/#references","title":"References","text":""},{"location":"vignettes/entropize/","title":"Entropize","text":""},{"location":"vignettes/entropize/#getting-the-entropy-matrix","title":"Getting the entropy matrix","text":"For some applications, we want to place points to capture the maximum amount of information, which is to say that we want to sample a balance of entropy values, as opposed to absolute values. In this vignette, we will walk through an example using the entropize
function to convert raw data to entropy values.
using BiodiversityObservationNetworks\nusing NeutralLandscapes\nusing CairoMakie\n
Entropy is problem-specific
The solution presented in this vignette is a least-assumption solution based on the empirical values given in a matrix of measurements. In a lot of situations, this is not the entropy that you want. For example, if your pixels are storing probabilities of Bernoulli events, you can directly use the entropy of the events in the entropy matrix.
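For the Bernoulli case mentioned above, the entropy has a closed form, so no empirical estimate is needed; this is a minimal sketch, with `bernoulli_entropy` a hypothetical helper name.

```julia
# Entropy of a single probability, with 0 log 0 taken as 0.
h(q) = q > 0 ? -q * log2(q) : 0.0

# Closed-form entropy (in bits) of a Bernoulli event with probability p;
# maximal (1.0) at p = 0.5, zero at p = 0 and p = 1.
bernoulli_entropy(p) = h(p) + h(1 - p)

P = rand(50, 50)              # pixel-wise event probabilities
U = bernoulli_entropy.(P)     # entropy matrix, directly usable downstream
```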
We start by generating a random matrix of measurements:
measurements = rand(MidpointDisplacement(), (200, 200)) .* 100\nheatmap(measurements)\n
Using the entropize
function will convert these values into entropy at the pixel scale:
U = entropize(measurements)\nheatmap(U)\n
The values closest to the median of the distribution have the highest entropy, and the values closest to its extrema have an entropy of 0. The entropy matrix is guaranteed to have values on the unit interval.
We can use entropize
as part of a pipeline, and overlay the points optimized based on entropy on the measurement map:
locations =\n measurements |> entropize |> seed(BalancedAcceptance(; numpoints = 100)) |> first\nheatmap(U)\n
"},{"location":"vignettes/overview/","title":"Overview","text":""},{"location":"vignettes/overview/#an-introduction-to-biodiversityobservationnetworks","title":"An introduction to BiodiversityObservationNetworks","text":"In this vignette, we will walk through the basic functionalities of the package, by generating a random uncertainty matrix, and then using a seeder and a refiner to decide which locations should be sampled in order to gain more insights about the process generating this entropy.
using BiodiversityObservationNetworks\nusing NeutralLandscapes\nusing CairoMakie\n
In order to simplify the process, we will use the NeutralLandscapes package to generate a 100\u00d7100 pixels landscape, where each cell represents the entropy (or information content) in a unit we can sample:
U = rand(MidpointDisplacement(0.5), (100, 100))\nheatmap(U)\n
In practice, this uncertainty matrix is likely to be derived from an application of the hyper-parameters optimization step, which is detailed in other vignettes.
The first step of defining a series of locations to sample is to use a BONSeeder
, which will generate a number of relatively coarse proposals that cover the entire landscape, and have a balanced distribution in space. We do so using the BalancedAcceptance
sampler, which can be tweaked to capture more (or less) uncertainty. To start with, we will extract 200 candidate points, i.e. 200 possible locations which will then be refined.
pack = seed(BalancedAcceptance(; numpoints = 200), U);\n
(CartesianIndex[CartesianIndex(8, 96), CartesianIndex(58, 11), CartesianIndex(33, 44), CartesianIndex(83, 77), CartesianIndex(20, 22), CartesianIndex(70, 55), CartesianIndex(45, 88), CartesianIndex(95, 33), CartesianIndex(4, 66), CartesianIndex(54, 100) \u2026 CartesianIndex(38, 8), CartesianIndex(88, 41), CartesianIndex(7, 75), CartesianIndex(57, 19), CartesianIndex(32, 52), CartesianIndex(82, 86), CartesianIndex(20, 30), CartesianIndex(70, 64), CartesianIndex(45, 97), CartesianIndex(95, 2)], [0.4863215941562187 0.5474279971396894 \u2026 0.36580931883696505 0.27899156740470715; 0.4513559228215274 0.5185207622794403 \u2026 0.34948599040746564 0.2581362551882851; \u2026 ; 0.09279607503606678 0.07077776915686923 \u2026 0.2161624071116984 0.3094713783786546; 0.08152059396290053 0.0275472787091889 \u2026 0.22023075038227477 0.32201840163660145])\n
The output of a BONSampler
(whether at the seeding or refinement step) is always a tuple, storing in the first position a vector of CartesianIndex
elements, and in the second position the matrix given as input. We can have a look at the first five points:
first(pack)[1:5]\n
5-element Vector{CartesianIndex}:\n CartesianIndex(8, 96)\n CartesianIndex(58, 11)\n CartesianIndex(33, 44)\n CartesianIndex(83, 77)\n CartesianIndex(20, 22)\n
Although returning the input matrix may seem redundant, it allows samplers to be chained together into pipelines that take a matrix as input and return a set of places to sample as output; an example is given below.
The positions of locations to sample are given as a vector of CartesianIndex
, which are coordinates in the uncertainty matrix. Once we have generated a candidate proposal, we can further refine it using a BONRefiner
\u2013 in this case, AdaptiveSpatial
, which performs adaptive spatial sampling (maximizing the distribution of entropy while minimizing spatial auto-correlation).
candidates, uncertainty = pack\nlocations, _ = refine(candidates, AdaptiveSpatial(; numpoints = 50), uncertainty)\nlocations[1:5]\n
5-element Vector{CartesianIndex}:\n CartesianIndex(94, 39)\n CartesianIndex(95, 33)\n CartesianIndex(92, 30)\n CartesianIndex(99, 36)\n CartesianIndex(87, 31)\n
The reason we start from a candidate set of points is that some algorithms struggle with full landscapes, and work much better with a sub-sample of them. There is no hard rule (or even a heuristic) to get a sense of how many points should be generated at the seeding step, so experimentation is a must!
The previous code examples used a version of the seed
and refine
functions that is very useful if you want to change arguments between steps, or examine the content of the candidate pool of points. In addition to this syntax, both functions have a curried version that allows chaining them together using pipes (|>
):
locations =\n U |>\n seed(BalancedAcceptance(; numpoints = 200)) |>\n refine(AdaptiveSpatial(; numpoints = 50)) |>\n first\n
50-element Vector{CartesianIndex}:\n CartesianIndex(99, 41)\n CartesianIndex(90, 36)\n CartesianIndex(97, 43)\n CartesianIndex(87, 36)\n CartesianIndex(94, 46)\n CartesianIndex(83, 38)\n CartesianIndex(93, 31)\n CartesianIndex(89, 51)\n CartesianIndex(92, 53)\n CartesianIndex(94, 56)\n \u22ee\n CartesianIndex(62, 98)\n CartesianIndex(37, 3)\n CartesianIndex(25, 69)\n CartesianIndex(75, 14)\n CartesianIndex(50, 47)\n CartesianIndex(100, 80)\n CartesianIndex(2, 25)\n CartesianIndex(52, 58)\n CartesianIndex(27, 91)\n
This works because seed
and refine
have curried versions that can be used directly in a pipeline. Proposed sampling locations can then be overlayed onto the original uncertainty matrix:
plt = heatmap(U)\n#scatter!(plt, [x[1] for x in locations], [x[2] for x in locations], ms=2.5, mc=:white, label=\"\")\n
"},{"location":"vignettes/uniqueness/","title":"Uniqueness.jl","text":""},{"location":"vignettes/uniqueness/#selecting-environmentally-unique-locations","title":"Selecting environmentally unique locations","text":"For some applications, we want to sample a set of locations that cover a broad range of values in environment space. Another way to rephrase this problem is to say we want to find the set of points with the least covariance in their environmental values.
To do this, we use a BONRefiner
called Uniqueness
. We'll start by loading the required packages.
using BiodiversityObservationNetworks\nusing SpeciesDistributionToolkit\nusing StatsBase\nusing NeutralLandscapes\nusing CairoMakie\n
Consider setting your SDMLAYERS_PATH
When accessing data using SimpleSDMDatasets.jl
, it is best to set the SDMLAYERS_PATH
environment variable to tell SimpleSDMDatasets.jl
where to download data. This can be done by setting ENV[\"SDMLAYERS_PATH\"] = \"/home/user/Data/\"
or similar in the ~/.julia/etc/julia/startup.jl
file. (Note this will be different depending on where julia
is installed.)
bbox = (left=-83.0, bottom=46.4, right=-55.2, top=63.7);\ntemp, precip, elevation =\n convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, AverageTemperature); bbox...)),\n convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, Precipitation); bbox...)),\n convert(Float32, SimpleSDMPredictor(RasterData(WorldClim2, Elevation); bbox...));\n
(SDM response \u2192 105\u00d7167 grid with 11478 Float32-valued cells, SDM response \u2192 105\u00d7167 grid with 11478 Float32-valued cells, SDM response \u2192 105\u00d7167 grid with 11478 Float32-valued cells)\n
Now we'll use the stack
function to combine our four environmental layers into a single, 3-dimensional array, which we'll pass to our Uniqueness
refiner.
```@example 1\nlayers = BiodiversityObservationNetworks.stack([temp,precip,elevation]);\n```
This requires NeutralLandscapes v0.1.2.\n\n```julia\nuncert = rand(MidpointDisplacement(0.8), size(temp), mask=temp);\nheatmap(uncert)\n```
Now we'll get a set of candidate points from a BalancedAcceptance seeder that has no bias toward higher uncertainty values.
```@example 1\ncandpts, uncert = uncert |> seed(BalancedAcceptance(numpoints=100, \u03b1=0.0));\n```
Now we'll `refine` our 100 candidate points down to the 30 most environmentally unique.\n\n```@example 1\nfinalpts, uncert = refine(candpts, Uniqueness(;numpoints=30, layers=layers), uncert)\n#=\nheatmap(uncert)\nscatter!([p[2] for p in candpts], [p[1] for p in candpts], fa=0.0, msc=:white, label=\"Candidate Points\")\nscatter!([p[2] for p in finalpts], [p[1] for p in finalpts], c=:dodgerblue, msc=:white, label=\"Selected Points\")=#\n```
"}]}
\ No newline at end of file
diff --git a/previews/PR65/vignettes/ctvidzn.png b/previews/PR65/vignettes/ctvidzn.png
new file mode 100644
index 0000000..407d98b
Binary files /dev/null and b/previews/PR65/vignettes/ctvidzn.png differ
diff --git a/previews/PR65/vignettes/dtzzbzu.png b/previews/PR65/vignettes/dtzzbzu.png
new file mode 100644
index 0000000..f0f72ed
Binary files /dev/null and b/previews/PR65/vignettes/dtzzbzu.png differ
diff --git a/previews/PR65/vignettes/entropize/index.html b/previews/PR65/vignettes/entropize/index.html
index 048514f..dbdce25 100644
--- a/previews/PR65/vignettes/entropize/index.html
+++ b/previews/PR65/vignettes/entropize/index.html
@@ -409,19 +409,19 @@ Getting the entropy matrix
measurements = rand(MidpointDisplacement(), (200, 200)) .* 100
heatmap(measurements)
-
+
Using the entropize
function will convert these values into entropy at the pixel scale:
U = entropize(measurements)
heatmap(U)
-
+
The values closest to the median of the distribution have the highest entropy, and the values closest to its extrema have an entropy of 0. The entropy matrix is guaranteed to have values on the unit interval.
We can use entropize
as part of a pipeline, and overlay the points optimized based on entropy on the measurement map:
locations =
measurements |> entropize |> seed(BalancedAcceptance(; numpoints = 100)) |> first
heatmap(U)
-
+
diff --git a/previews/PR65/vignettes/gbfwvxs.png b/previews/PR65/vignettes/gbfwvxs.png
deleted file mode 100644
index 79f334a..0000000
Binary files a/previews/PR65/vignettes/gbfwvxs.png and /dev/null differ
diff --git a/previews/PR65/vignettes/kdskdad.png b/previews/PR65/vignettes/kdskdad.png
deleted file mode 100644
index 9a21033..0000000
Binary files a/previews/PR65/vignettes/kdskdad.png and /dev/null differ
diff --git a/previews/PR65/vignettes/overview/index.html b/previews/PR65/vignettes/overview/index.html
index 573dd3f..5ecbe80 100644
--- a/previews/PR65/vignettes/overview/index.html
+++ b/previews/PR65/vignettes/overview/index.html
@@ -405,22 +405,22 @@ An introduction to B
U = rand(MidpointDisplacement(0.5), (100, 100))
heatmap(U)
-
+
In practice, this uncertainty matrix is likely to be derived from an application of the hyper-parameters optimization step, which is detailed in other vignettes.
The first step of defining a series of locations to sample is to use a BONSeeder
, which will generate a number of relatively coarse proposals that cover the entire landscape, and have a balanced distribution in space. We do so using the BalancedAcceptance
sampler, which can be tweaked to capture more (or less) uncertainty. To start with, we will extract 200 candidate points, i.e. 200 possible locations which will then be refined.
pack = seed(BalancedAcceptance(; numpoints = 200), U);
-(CartesianIndex[CartesianIndex(81, 10), CartesianIndex(19, 44), CartesianIndex(69, 77), CartesianIndex(44, 21), CartesianIndex(94, 55), CartesianIndex(12, 88), CartesianIndex(62, 33), CartesianIndex(37, 66), CartesianIndex(87, 99), CartesianIndex(25, 4) … CartesianIndex(55, 44), CartesianIndex(30, 77), CartesianIndex(80, 22), CartesianIndex(18, 55), CartesianIndex(68, 89), CartesianIndex(43, 33), CartesianIndex(93, 66), CartesianIndex(12, 100), CartesianIndex(62, 1), CartesianIndex(37, 35)], [0.6457023076171001 0.6530873362023615 … 0.4802747697941122 0.495825668828999; 0.6662755323284345 0.6646109378460477 … 0.4476078092501954 0.5244434382647728; … ; 0.20054927953972762 0.19293310548992137 … 0.4423561795343221 0.45279935857264714; 0.2781285320931762 0.2056391355970853 … 0.4695128732448035 0.41903900630023805])
+(CartesianIndex[CartesianIndex(8, 96), CartesianIndex(58, 11), CartesianIndex(33, 44), CartesianIndex(83, 77), CartesianIndex(20, 22), CartesianIndex(70, 55), CartesianIndex(45, 88), CartesianIndex(95, 33), CartesianIndex(4, 66), CartesianIndex(54, 100) … CartesianIndex(38, 8), CartesianIndex(88, 41), CartesianIndex(7, 75), CartesianIndex(57, 19), CartesianIndex(32, 52), CartesianIndex(82, 86), CartesianIndex(20, 30), CartesianIndex(70, 64), CartesianIndex(45, 97), CartesianIndex(95, 2)], [0.4863215941562187 0.5474279971396894 … 0.36580931883696505 0.27899156740470715; 0.4513559228215274 0.5185207622794403 … 0.34948599040746564 0.2581362551882851; … ; 0.09279607503606678 0.07077776915686923 … 0.2161624071116984 0.3094713783786546; 0.08152059396290053 0.0275472787091889 … 0.22023075038227477 0.32201840163660145])
The output of a BONSampler
(whether at the seeding or refinement step) is always a tuple, storing in the first position a vector of CartesianIndex
elements, and in the second position the matrix given as input. We can have a look at the first five points:
first(pack)[1:5]
5-element Vector{CartesianIndex}:
- CartesianIndex(81, 10)
- CartesianIndex(19, 44)
- CartesianIndex(69, 77)
- CartesianIndex(44, 21)
- CartesianIndex(94, 55)
+ CartesianIndex(8, 96)
+ CartesianIndex(58, 11)
+ CartesianIndex(33, 44)
+ CartesianIndex(83, 77)
+ CartesianIndex(20, 22)
Although returning the input matrix may seem redundant, it actually allows to chain samplers together to build pipelines that take a matrix as input, and return a set of places to sample as outputs; an example is given below.
The positions of locations to sample are given as a vector of CartesianIndex
, which are coordinates in the uncertainty matrix. Once we have generated a candidate proposal, we can further refine it using a BONRefiner
– in this case, AdaptiveSpatial
, which performs adaptive spatial sampling (maximizing the distribution of entropy while minimizing spatial auto-correlation).
@@ -429,11 +429,11 @@ An introduction to B
locations[1:5]
5-element Vector{CartesianIndex}:
- CartesianIndex(10, 1)
- CartesianIndex(3, 44)
- CartesianIndex(1, 49)
- CartesianIndex(4, 50)
- CartesianIndex(8, 54)
+ CartesianIndex(94, 39)
+ CartesianIndex(95, 33)
+ CartesianIndex(92, 30)
+ CartesianIndex(99, 36)
+ CartesianIndex(87, 31)
The reason we start from a candidate set of points is that some algorithms struggle with full landscapes, and work much better with a sub-sample of them. There is no hard rule (or no heuristic) to get a sense for how many points should be generated at the seeding step, and so experimentation is a must!
The previous code examples used a version of the seed
and refine
functions that is very useful if you want to change arguments between steps, or examine the content of the candidate pool of points. In addition to this syntax, both functions have a curried version that allows chaining them together using pipes (|>
):
@@ -444,32 +444,32 @@ An introduction to B
first
50-element Vector{CartesianIndex}:
- CartesianIndex(1, 60)
- CartesianIndex(25, 50)
- CartesianIndex(4, 62)
- CartesianIndex(28, 47)
- CartesianIndex(27, 43)
- CartesianIndex(32, 49)
- CartesianIndex(34, 53)
- CartesianIndex(8, 65)
- CartesianIndex(39, 52)
- CartesianIndex(41, 54)
+ CartesianIndex(99, 41)
+ CartesianIndex(90, 36)
+ CartesianIndex(97, 43)
+ CartesianIndex(87, 36)
+ CartesianIndex(94, 46)
+ CartesianIndex(83, 38)
+ CartesianIndex(93, 31)
+ CartesianIndex(89, 51)
+ CartesianIndex(92, 53)
+ CartesianIndex(94, 56)
⋮
- CartesianIndex(67, 9)
- CartesianIndex(42, 42)
- CartesianIndex(92, 76)
- CartesianIndex(11, 20)
- CartesianIndex(61, 53)
- CartesianIndex(36, 87)
- CartesianIndex(86, 31)
- CartesianIndex(23, 65)
- CartesianIndex(73, 98)
+ CartesianIndex(62, 98)
+ CartesianIndex(37, 3)
+ CartesianIndex(25, 69)
+ CartesianIndex(75, 14)
+ CartesianIndex(50, 47)
+ CartesianIndex(100, 80)
+ CartesianIndex(2, 25)
+ CartesianIndex(52, 58)
+ CartesianIndex(27, 91)
This works because seed
and refine
have curried versions that can be used directly in a pipeline. Proposed sampling locations can then be overlayed onto the original uncertainty matrix:
plt = heatmap(U)
#scatter!(plt, [x[1] for x in locations], [x[2] for x in locations], ms=2.5, mc=:white, label="")
Now we'll get a set of candidate points from a BalancedAcceptance seeder that has no bias toward higher uncertainty values.
```@example 1 candpts, uncert = uncert |> seed(BalancedAcceptance(numpoints=100, α=0.0));
diff --git a/previews/PR65/vignettes/wlpwwqe.png b/previews/PR65/vignettes/wlpwwqe.png
new file mode 100644
index 0000000..6e99c83
Binary files /dev/null and b/previews/PR65/vignettes/wlpwwqe.png differ
diff --git a/previews/PR65/vignettes/yvyrlqc.png b/previews/PR65/vignettes/yvyrlqc.png
deleted file mode 100644
index 0c9c80e..0000000
Binary files a/previews/PR65/vignettes/yvyrlqc.png and /dev/null differ
diff --git a/previews/PR65/vignettes/zhcpfbl.png b/previews/PR65/vignettes/zhcpfbl.png
deleted file mode 100644
index 79f334a..0000000
Binary files a/previews/PR65/vignettes/zhcpfbl.png and /dev/null differ
diff --git a/previews/PR65/vignettes/zjzmvqq.png b/previews/PR65/vignettes/zjzmvqq.png
new file mode 100644
index 0000000..f0f72ed
Binary files /dev/null and b/previews/PR65/vignettes/zjzmvqq.png differ
diff --git a/previews/PR65/vignettes/zpglfmw.png b/previews/PR65/vignettes/zpglfmw.png
deleted file mode 100644
index c1f2323..0000000
Binary files a/previews/PR65/vignettes/zpglfmw.png and /dev/null differ