Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Predictions by prepareGradient do not match spatial latent factor #204

Open
claraqin opened this issue Oct 23, 2024 · 2 comments
Open

Predictions by prepareGradient do not match spatial latent factor #204

claraqin opened this issue Oct 23, 2024 · 2 comments

Comments

@claraqin
Copy link

claraqin commented Oct 23, 2024

Hello,

I'm trying to use prepareGradient to make spatial predictions across a low-resolution raster of near-global extent.

This could be my own misunderstanding of the principles behind prepareGradient, but I noticed that the predictions made using prepareGradient do not match the spatial distribution of the posterior mean of the (one and only) spatial latent factor in my single-species Hmsc model fitted using the GPP algorithm.

pred_otu0000108_Lr

Figure 1. The posterior mean spatial latent factor, mean(Eta) * mean(Lambda).

pred_otu0000108_randomeffectsonly_thin1

Figure 2. Predictions made by running predict on a gradient made using prepareGradient with a raster of constant values, so that only the spatial latent factor should be driving variation across pixels. (Though not relevant to the issue, the cyan points are those where this species' presence has been recorded.)

Note that even though the posterior mean spatial latent factor displays low values in Canada (Fig 1), predictions made on pixels located in Canada are unusually high (Fig 2).

Is this expected behavior? Do you have ideas about what could be causing the discrepancy?

Thanks,
Clara

@gtikhonov
Copy link
Member

This looks very odd to me, but unfortunately no informed guess so far regarding where this discrepancy can originate from. I cannot rule this possibility out, but this does not look just like the effect of whether the nonlinear transformation from EtaLambda to Normal_CDF(Const+EtaLambda) is made before/after the averaging over the posterior.
Can you share the model structure that you used and the script that you used to plot these results, basically starting from the point once you have the fitted model? Perhaps, also it is worth to plot the second plot without marine mask, just to check whether there are some more clear unexpected patterns showing up. If you are using the spherical coordinate and geodesic distances, can this get somehow confounded with GPP approximation?

@claraqin
Copy link
Author

Hi @gtikhonov

Thanks for your response.

Is the information below helpful? I won't be able to share the model file itself since it's nearly 100 MB.

Model summary:

Hmsc object with 5567 sampling units, 1 species, 21 covariates, 1 traits and 1 random levels
Posterior MCMC sampling with 2 chains each with 1000 samples, thin 300 and transient 1e+05 

Code:

m.gpp.space.post <- readRDS(filename)

r_combined_nonant <- rast("data/r_combined_nonant.tif")
new.xycoords_nonant <- crds(r_combined_nonant)
new.XData_nonant <- values(r_combined_nonant, dataframe=TRUE, na.rm=TRUE)

gradient_nonant <- prepareGradient(m.gpp.space.post, new.XData_nonant, list("sample"=new.xycoords_nonant))

postsamples <- poolMcmcChains(m.gpp.space.post$postList, thin=20)
length(postsamples) # 100 samples, though in the original post that started this thread, used all 2,000 samples

preds_nonant <- predict(m.gpp.space.post, 
                        post = postsamples, 
                        Gradient = gradient_nonant,
                        expected = TRUE)

preds_nonant_mean <- apply(abind::abind(preds_nonant, along = 3), 1, mean)

r_pred_nonant <- r_combined_nonant[[1]]
names(r_pred_nonant) <- "prob"
values(r_pred_nonant) <- NA
values(r_pred_nonant)[complete.cases(values(r_combined_nonant))] <- preds_nonant_mean

png(paste0("plots/X.png"), width=1600, height=800)
xycoords_present <- as.data.frame(m.gpp.space.post$rL[[1]]$s)[m.gpp.space.post$Y==1,]
plot(r_pred_nonant)
points(xycoords_present, col="cyan", pch=16, add=TRUE)
points(xycoords_present, col="black", pch=21, add=TRUE)
dev.off()

Perhaps, also it is worth to plot the second plot without marine mask, just to check whether there are some more clear unexpected patterns showing up.

Here's what that looks like:
Screenshot 2024-10-28 at 2 53 31 PM

Do you have suggestions about how to tell if the geodesic coordinates are getting confused by the GPP?

Thanks,
Clara

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants