sampling bias correction in Hmsc (Is it possible to provide a weight to sampling units) #196

elgabbas · 2024-07-29T14:47:40Z

Hello,

I am using Hmsc with opportunistic data (mainly from GBIF; the model does not employ trait info). As expected, the data shows signs of sampling bias. I only used presence-(?)absence information per grid cell of 10 kilometres. This spatial thinning can, to some extent, reduce the effect of sampling bias, but this is clearly not sufficient.

To correct for sampling bias, I use a layer representing sampling efforts (sampling intensity of all GBIF data of higher taxonomic level than the species of interest) as a predictor of the model. A similar method is used in many cases in the literature (including Hmsc; e.g. section 7.9 of Ovaskainen & Abrego (2020)). However, I see some concern that the predictions will reflect the sampling bias in the data rather than the predicted biological information (i.e., intensively sampled areas will be likely to be predicted with high probability; as expected, almost all species show a positive effect and positive response curve to the sampling intensity predictor). Although it is possible to use Warton et al.'s approach to correct for it by assigning all sampling units the same value representing optimum (e.g. median) sampling efforts during prediction, my previous experience with this method showed overprediction.

I question whether it is possible to consider the sampling intensity differently in Hmsc? In glm, for example, it is possible to use sampling efforts as weights to the model. It seems that this is not directly allowed using a weights argument. Is it possible to use such info as informed priors or offset? And if so, please give me a snippet of code on how this should be identified.

Thanks in advance,
Ahmed

The text was updated successfully, but these errors were encountered:

elgabbas · 2024-07-29T14:49:50Z

Here is an example data. Source

require(Hmsc)
require(MASS)
set.seed(6)
n = 100
ns = 5
beta1 = c(-2,-1,0,1,2)
alpha = rep(0,ns)
beta = cbind(alpha,beta1)
x = cbind(rep(1,n),rnorm(n))
Lf = x%*%t(beta)
xycoords = matrix(runif(2*n),ncol = 2)
colnames(xycoords) = c("x-coordinate","y-coordinate")
rownames(xycoords) = 1:n
sigma.spatial = c(2)
alpha.spatial = c(0.35)
Sigma = sigma.spatial^2*exp(-as.matrix(dist(xycoords))/alpha.spatial)
eta1 = mvrnorm(mu = rep(0,n), Sigma = Sigma)
lambda1 = c(1,2,-2,-1,0)
Lr = eta1%*%t(lambda1)
L = Lf + Lr
y = as.matrix(L + matrix(rnorm(n*ns),ncol = ns))
yprob = 1*((L +matrix(rnorm(n*ns),ncol = ns))>0)
XData = data.frame(x1 = x[,2])
studyDesign = data.frame(sample = as.factor(1:n))


studyDesign = data.frame(sample = as.factor(1:n))
rL.spatial = HmscRandomLevel(sData = xycoords)
rL.spatial = setPriors(rL.spatial,nfMin=1,nfMax=1)
m.spatial = Hmsc(Y=yprob, XData=XData, XFormula=~x1,
                    studyDesign=studyDesign, 
                    ranLevels=list("sample"=rL.spatial),distr="probit")

thin = 10
samples = 100
transient = 10
verbose = 0

m.spatial = sampleMcmc(m.spatial, thin = thin, samples = samples, transient = transient,
nChains = nChains, nParallel = nChains, verbose = verbose)

How to include the following sampling efforts vector into the model as a weight or informed prior.

set.seed(8)
SampIntensity <- abs(rnorm(nrow(xycoords), mean = 200, sd = 10))

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sampling bias correction in Hmsc (Is it possible to provide a weight to sampling units) #196

sampling bias correction in Hmsc (Is it possible to provide a weight to sampling units) #196

elgabbas commented Jul 29, 2024

elgabbas commented Jul 29, 2024

sampling bias correction in Hmsc (Is it possible to provide a weight to sampling units) #196

sampling bias correction in Hmsc (Is it possible to provide a weight to sampling units) #196

Comments

elgabbas commented Jul 29, 2024

elgabbas commented Jul 29, 2024