Laundry list: parallelization, random effects with smooths, prediction intervals, non-convergence... #352
-
Here are some responses in-line below:
Response: please check out the manual on this -- if you look at the sdmTMBcontrol() docs on p 41-42, there's example usage: https://cran.r-project.org/web/packages/sdmTMB/sdmTMB.pdf
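For concreteness, a minimal sketch of that usage pattern, assuming the parallel argument of sdmTMBcontrol() described in the linked docs (the built-in pcod data and 4 cores are arbitrary choices here):

library(sdmTMB)

mesh <- make_mesh(pcod, c("X", "Y"), cutoff = 10, type = "cutoff")

fit <- sdmTMB(
  density ~ 1,
  data = pcod,
  mesh = mesh,
  family = tweedie(link = "log"),
  control = sdmTMBcontrol(parallel = 4)  # number of threads TMB may use; see the manual linked above
)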
Response: it's better to use the GLM-style notation here (1|vessel)
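In other words, the same vessel random intercept written both ways (cpue, depth, and vessel are hypothetical column names here):

f_mgcv   <- cpue ~ s(depth) + s(vessel, bs = "re")  # mgcv-style random effect
f_sdmTMB <- cpue ~ s(depth) + (1 | vessel)          # GLMM-style notation for sdmTMB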
Response: the best approach here would be to start completely from scratch, with a spatial-only model and no main effects. Build the pieces up -- it seems like you've got a complicated model you're trying to work backwards from, and it's hard to identify what might be wrong.
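A minimal sketch of that staged approach, using the built-in pcod data (your own data, mesh cutoff, and eventual fixed effects will of course differ):

library(sdmTMB)

mesh <- make_mesh(pcod, c("X", "Y"), cutoff = 10, type = "cutoff")

# 1. Spatial-only model, no main effects
fit1 <- sdmTMB(
  density ~ 1,
  data = pcod, mesh = mesh,
  family = tweedie(link = "log"),
  spatial = "on", spatiotemporal = "off"
)
sanity(fit1)

# 2. Add spatiotemporal random fields
fit2 <- sdmTMB(
  density ~ 1,
  data = pcod, mesh = mesh,
  family = tweedie(link = "log"),
  time = "year", spatial = "on", spatiotemporal = "iid"
)
sanity(fit2)

# 3. Then add fixed effects, smoothers, (1 | vessel), etc. one at a time,
#    re-checking sanity() after each addition.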
Response: for prediction intervals, get_index() is returning confidence intervals. You could construct prediction intervals manually -- the difference would be adding on observation/sampling error, and that would have to be done at the level of individual grid cells in the prediction data frame before being aggregated into a total index.
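A rough, deliberately simplified sketch of that bookkeeping: start from simulation draws of the expected value in each grid cell (e.g., from predict(fit, newdata = grid, nsim = 500); check whether those draws come back on the link or response scale), add independent observation/sampling error cell by cell, and only then aggregate. The Gaussian error, cell count, and areas below are made-up stand-ins.

set.seed(1)
n_cells <- 100
n_draws <- 500

# stand-in for per-cell draws of the expected value (rows = grid cells, columns = draws)
mu <- matrix(rlnorm(n_cells * n_draws), n_cells, n_draws)
area <- rep(4, n_cells)  # grid-cell areas
obs_sd <- 0.2            # made-up observation/sampling error SD

y <- mu + rnorm(length(mu), sd = obs_sd)  # add sampling error per cell, per draw
index_draws <- colSums(y * area)          # aggregate each draw into a total index
quantile(index_draws, c(0.025, 0.975))    # a prediction-style interval for the index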
Response: this is one way to get rid of the global intercept (the other being to add -1 to the formula). It isn't always useful, but it is helpful, for example, in cases where you want to include fixed year effects and make them more interpretable.
-
Adding some additional thoughts:
Yes -- e.g., working with lower-resolution meshes, especially during testing, will make a large difference. Finer isn't always better, even for out-of-sample prediction. Also, what will make a massive difference for speed (also for mgcv) is setting up an optimized BLAS.
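Two quick checks along those lines: sessionInfo() reports which BLAS/LAPACK R is linked against, and the number of mesh vertices (which become the random effects) is a rough proxy for how expensive a mesh will be to fit. The cutoff values below are arbitrary examples with the built-in pcod data.

library(sdmTMB)

si <- sessionInfo()
si$BLAS    # path to the BLAS library in use
si$LAPACK  # path to the LAPACK library in use

mesh_coarse <- make_mesh(pcod, c("X", "Y"), cutoff = 20, type = "cutoff")
mesh_fine   <- make_mesh(pcod, c("X", "Y"), cutoff = 5,  type = "cutoff")
nrow(mesh_coarse$mesh$loc)  # number of vertices
nrow(mesh_fine$mesh$loc)    # many more vertices -> slower fits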
Yes, start with a basic model and build up. One thing to note, given you have several penalized smoothers, is that if an SD penalizing the random effect coefficients on the basis functions is estimated to be near zero (i.e., that smoother is effectively shrunk towards a linear effect), this can trigger convergence or sanity-check warnings.
Yes, you could basically take the same approach as in that blog post. I'm still trying to wrap my head around exactly why you'd want to do that here, though. The index isn't something you would observe itself (which is where I'd usually expect to make 'prediction intervals' to compare observations against); most assessment models you'd be entering this kind of index into assume any CV or standard error is on the mean; and because the observation errors are by definition independent, they aren't going to affect the trend -- they only inflate the uncertainty on the scale, and therefore on any catchability parameter.
Yeah, as Eric said, this is just a convenience here. As long as you have year factors in your formula, this just forces the coefficients on those parameters to represent the mean for a given year, as opposed to the intercept representing the first year and the other coefficients representing offsets from it. In the end, the model is identical. Without factor predictors, dropping the intercept will change the model.
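A tiny illustration of that point, using hypothetical density and year columns:

f1 <- density ~ as.factor(year)      # intercept = first year; other coefficients are offsets
f2 <- density ~ 0 + as.factor(year)  # each coefficient is that year's mean
# With factor year effects these two parameterizations give identical fits;
# without factor predictors, dropping the intercept changes the model.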
-
I have a few more questions. I have started looking into why my models die, and as soon as I add a spatio-temporal effect, I have problems. This led me to look at the meshes I am using, and I am confused by the results.

mymesh <- make_mesh(data |> as.data.frame(), c("lon", "lat"), cutoff = 10, type = "cutoff")

Note that I am explicitly using a low-resolution mesh to start with, and I am using lon/lat as spatial coordinates because the zone of the data is too large for UTM but relatively close to the equator, so distortion from true distance is small. This produced a mesh that did not cover all of the data points. I "fixed" this by identifying points on the convex hull of the dataset and moving them to the top of the dataset (i.e., the first few rows). After this change, the mesh looks better, though some of the points are still very close to the mesh edge. Is this the correct approach, or is there a better way to ensure that the mesh covers the data?
-
After several tests, it seems like the mesh not covering all the data was a large part of my problem. After fixing that, the model converges with a fairly complex set of predictors. Based on these working models, I have a few additional questions (sorry to ask so many, but I am in the thick of it...):
As always, thanks for the help!
-
Hi, I imagine you have moved on to other things, but the issue with meshes not covering the data seems important to me. I have tried various ways of reordering the data, and they do significantly change the resulting mesh, which seems strange to me. Is this a known issue documented somewhere? If not, then perhaps a reordering of datasets to place the convex hull points at the beginning of the dataset before running make_mesh() could be worth documenting as a workaround.
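For what it's worth, a hedged sketch of that reordering idea using base R's chull() (the lon/lat values are simulated and the cutoff is arbitrary):

library(sdmTMB)

set.seed(42)
dat <- data.frame(lon = runif(200, -50, -30), lat = runif(200, 0, 15))  # simulated coordinates

hull_idx <- chull(dat$lon, dat$lat)              # rows forming the convex hull
dat <- rbind(dat[hull_idx, ], dat[-hull_idx, ])  # move hull points to the top

mesh <- make_mesh(dat, c("lon", "lat"), cutoff = 1, type = "cutoff")
plot(mesh$mesh)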
-
Sorry I missed replying to this. Answers below:
This is all an issue with INLA (now fmesher), unfortunately, and out of my hands. The not-covering-the-data part is not something I've commonly seen. The default
In general, that shouldn't be a major issue. The vertices become the random effects. It's OK if they don't all have data associated with them.
See this example:

library(sdmTMB)

mesh <- make_mesh(pcod, c("X", "Y"), cutoff = 5, type = "cutoff")

# vertices:
head(mesh$mesh$loc[, 1:2])
#>          [,1]     [,2]
#> [1,] 446.4752 5793.426
#> [2,] 446.4594 5800.136
#> [3,] 436.9157 5802.305
#> [4,] 420.6101 5771.055
#> [5,] 408.2088 5771.287
#> [6,] 414.3656 5760.894

loc <- mesh$mesh$loc

# every data point gets a weighted average of the nearest 3 vertices
# based on this matrix:
A <- mesh$A_st

# so, for row of data 3, bilinear interpolation is the combination
# of these vertices:
vert <- which(A[3, ] != 0)
vert
#> [1]   2 535 563

And these are the weights:

A[3, vert]
#> [1] 0.6333086 0.2156787 0.1510126

plot(mesh$mesh)
points(loc[vert, ], col = "blue", pch = 20)
points(pcod[3, c("X", "Y")], col = "red", pch = 20, cex = 0.8)

Created on 2024-07-03 with reprex v2.1.0
Thanks, I just fixed the docs. I had to pull out that ggplot code when I removed INLA as an imported package because it relies on inlabru. There's now a note about how you can pass
There's no magic number. If that's the only issue and the gradient is still < 0.01, I probably wouldn't be too worried. The default when fitting is now:

sanity(fit, gradient_thresh = 0.01)
Yes...
You'll need some kind of time process if you want to estimate another parameter that changes as a factor variable between years. E.g.
If you're using OpenBLAS, I believe the default is to use all cores. You may be able to control this with:

RhpcBLASctl::blas_set_num_threads(1)
RhpcBLASctl::omp_set_num_threads(1)

and put that in your .Rprofile.
-
Thanks. I will have to chew on it a bit to understand the comments about the quota step function. Regarding the mesh, one potential reason that I am seeing this issue while others don't commonly see it is that I am using lon/lat instead of UTM coordinates, which means that my scale differs from typical scales by about a factor of 100 (I guess I could try multiplying lon/lat by 111 km to test, but I haven't done that). Perhaps this interacts with some of the defaults in the mesh-generation algorithm. At some point, I tried changing the defaults regarding mesh-data distance, but I didn't get it to behave the way I wanted (a somewhat larger mesh envelope). In any case, putting some of the points near the edge of the convex hull at the top of the data frame does seem to fix the issue. In my latest incarnations, I don't just use the convex hull points but also all positions within a certain distance of the convex hull, and this produces a mesh with no points outside or on the edge of the mesh.
-
This can affect model convergence if it means the Matérn range parameter is very small (or, in other cases, very big). The range parameter is a distance in the units of your coordinates. If the data span much wider than a UTM zone, you could use an equal-area projection for your area of interest, such as Albers, in km. There's an example with Snowy Owls across North America in the preprint, and there's an example in the Atlantic here. My understanding is you want lat_1 and lat_2 to be roughly 1/6 and 5/6 of the latitude range; lat_0 and lon_0 just control what is zero in the new coordinates.
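A hedged sketch of setting up such a projection with the sf package (the column names, example coordinates, and standard parallels below are all placeholders -- pick lat_1/lat_2 at roughly 1/6 and 5/6 of your latitude range and lat_0/lon_0 near the middle of your data):

library(sf)

# hypothetical data spanning latitudes 3 to 15
dat <- data.frame(lon = c(-45, -40, -35), lat = c(3, 9, 15), cpue = c(1, 2, 3))

# Albers equal-area projection in km for this hypothetical region
albers_km <- "+proj=aea +lat_1=5 +lat_2=13 +lat_0=9 +lon_0=-40 +units=km"

dat_sf <- st_transform(st_as_sf(dat, coords = c("lon", "lat"), crs = 4326), crs = albers_km)
xy <- st_coordinates(dat_sf)
dat$X <- xy[, 1]  # eastings in km
dat$Y <- xy[, 2]  # northings in km
head(dat)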
-
Hi,
I am just starting to familiarize myself with sdmTMB, which I am trying to use for a fishery CPUE standardization exercise (replacing the GAMM models developed here with spatio-temporal GLMM or GAMM models). I have a laundry list of basic questions, some of which are undoubtedly a bit naive, and I was wondering if other sdmTMB users could point me in the right direction. The list is below. Thanks in advance!

- parallelization: The models I am running involve quite a bit of data, so ideally I would use parallelization to speed them up. I asked ChatGPT how to use parallelization with sdmTMB and it said to execute TMB::openmp(NPROCS) and that it would all work out. I just want to confirm that this is the correct approach?
- random effects: In my current GAMM models implemented with mgcv, I have a random effect for fishing vessel implemented with s(vessel, bs = "re"). I am wondering if this approach is still valid in sdmTMB, or should I use GLM-style random effects + (1 | vessel)?
- model non-convergence: So far, my tests of sdmTMB have not gone too well, either because the models do not pass the sanity checks (suggesting "simplifying the model, adjusting the mesh, or adding priors") or because the model errors out with singularities. Presumably the models are over-determined, but there are lots of different ways I could simplify them, making it hard to know what to do next. Do you have suggestions for how to go about diagnosing these issues? E.g., what to simplify first (spatio-temporal effects, smooth effects in the formula, mesh, etc.)?
- prediction intervals: For my CPUE indices here, I have been using prediction intervals based on the approach developed here to get a better estimate of prediction uncertainty. Am I correct to understand that sdmTMB does not currently include computation of prediction intervals? Does the function get_index already include this sort of thing (e.g., covariance between prediction uncertainties and unexplained variance in index uncertainty)? If not, is there an equivalent of type = "lpmatrix" in the mgcv::predict.gam function?
- zeroing out the intercept for get_index: I notice that in examples using get_index, the intercept is zeroed out (e.g., density ~ 0 + ...). Why is this necessary/appropriate?

Thanks a bunch for the assistance, and I apologize in advance if any of these questions are off base...
Cheers,
David