Marginal survey indices? #314
-
In cases where a model has a covariate, say Is this possible using Thus far I have not managed to do this using the index functions in sdmTMB and have calculated the marginal index manually using something like the following (including center of gravity and depth): predict_output$data %>%
group_by(year, depth, X, Y) %>%
reframe(est = mean(est)) %>%
group_by(year) %>%
reframe(
Y = weighted.mean(Y, est),
X = weighted.mean(X, est),
depth = weighted.mean(depth, est),
est = sum(est)*grid_side_size^2
) And |
Beta Was this translation helpful? Give feedback.
Replies: 3 comments 4 replies
-
Gear type would usually be considered a catchability covariate and so you would predict for a single reference gear type. As long as gear type is not in the model as a spatially varying coefficient or interacting with some variable that is changing through time, the indexes that you get out of the various gear types should be the same, just shifted up or down for catchability. If the index was being used in a stock assessment, then typically you would either consider it a relative index and estimate catchability (in which case which gear type you pick shouldn't matter unless you are putting a prior on it) or you would treat it as an absolute index, in which case I assume you would want it to reflect some specific gear type that you think has 100% catchability. If you really did want to calculate an average index could you not include all gear types and then divide the total by the number of gear types? Again, I'm not sure why you would do this though and it would be much slower to calculate. Separately, it's still not clear how much we should trust the output of |
Beta Was this translation helpful? Give feedback.
-
Apologies for a long response time. Thanks for your response. I appreciate it as well as your effort with sdmTMB. This tool is taking us to the next level in stock assessment. I have multiple goals: 1) make a sdmTMB model to study the fine-scale distribution of the species over time and space in a manuscript. Here I would also 2) examine whether length binned indices correlate with fishing, climate and other fish indices, and finally 3) to make an index for the Greenland halibut assessment (which does not have to be the same than used in the manuscript). These indices will be passed into an assessment model and can be relative. I can send you the manuscript draft once I get it ready for feedback, if you wish (Eric and Jim are already on the list). This post has multiple diverging issues. Let's break them down: Marginal, conditional and summed indices Obviously, you are right, but I am just trying to wrap my head around this and hence keep on coming back to this. Using the example below, I understand now that the summed and marginal indices are simple sums and means of conditional indices, respectively. Confidence intervals differ, however, and one could not simply summate/average them (this is expected). Summed confidence intervals appear wider in the example than the ones calculated by # Index type example ####
library(sdmTMB)
library(tidyverse)
library(cowplot)
#>
#> Attaching package: 'cowplot'
#> The following object is masked from 'package:lubridate':
#>
#> stamp
tmp_dt <- pcod_2011 %>%
mutate(geartype = rep(letters[1:2], length.out = n()))
m <- sdmTMB(
density ~ 0 + geartype + poly(log(depth), 2),
data = tmp_dt,
time = "year",
mesh = pcod_mesh_2011,
family = delta_gamma()
)
nd <- qcs_grid %>%
replicate_df("year", unique(tmp_dt$year)) %>%
replicate_df("geartype", unique(tmp_dt$geartype))
p_summed <- predict(m, newdata = nd, return_tmb_object = TRUE,
type = "response")
p_conditional_a <- predict(m, newdata = nd %>% filter(geartype == "a"),
return_tmb_object = TRUE)
p_conditional_b <- predict(m, newdata = nd %>% filter(geartype == "b"),
return_tmb_object = TRUE)
indices <-
bind_rows(
get_index(p_summed, bias_correct = FALSE) %>%
mutate(type = "summed"),
get_index(p_conditional_a, bias_correct = FALSE) %>%
mutate(type = "conditional a"),
get_index(p_conditional_b, bias_correct = FALSE) %>%
mutate(type = "conditional b"),
p_summed$data %>%
group_by(year, depth, X, Y) %>%
reframe(est = mean(est)) %>%
group_by(year) %>%
reframe(est = sum(est)) %>%
mutate(lwr = NA, upr = NA, log_est = log(est), se = NA) %>%
mutate(type = "marginal")
) %>%
mutate(method = "predictions")
#> Bias correction is turned off.
#> It is recommended to turn this on for final inference.
#> Bias correction is turned off.
#> It is recommended to turn this on for final inference.
#> Bias correction is turned off.
#> It is recommended to turn this on for final inference.
## Fixed effect vs index ratio ####
fixef <-
lapply(seq_along(m$family$family), function(i) {
tidy(m, conf.int = TRUE, model = i) %>%
mutate(family = m$family$family[i],
link = m$family$link[i],
.before = 1)
}) %>%
bind_rows()
## Ratio between fixed effects for a and b:
(fixef %>%
filter(family == "binomial", term == "geartypeb") %>%
pull(estimate) %>%
m$family[[1]]$linkinv() *
fixef %>%
filter(family == "Gamma", term == "geartypeb") %>%
pull(estimate) %>%
m$family[[2]]$linkinv()) /
(fixef %>%
filter(family == "binomial", term == "geartypea") %>%
pull(estimate) %>%
m$family[[1]]$linkinv() *
fixef %>%
filter(family == "Gamma", term == "geartypea") %>%
pull(estimate) %>%
m$family[[2]]$linkinv())
#> [1] 0.8633095
## Ratio between index estimates for a and b:
indices %>%
filter(grepl("conditional", type)) %>%
group_by(year) %>%
reframe(ratio = est[type == "conditional b"] / est[type == "conditional a"])
#> # A tibble: 4 × 2
#> year ratio
#> <int> <dbl>
#> 1 2011 0.882
#> 2 2013 0.889
#> 3 2015 0.887
#> 4 2017 0.882
## Plot ####
p1 <- indices %>%
ggplot(
aes(x = year, y = est, ymin = lwr, ymax = upr, color = type,
fill = type)
) +
geom_ribbon(color = NA, alpha = 0.3) +
geom_path(size = 1) +
labs(color = "Index type", fill = "Index type") +
theme_classic()
#> Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
p2 <- bind_rows(
indices %>% filter(type %in% c("summed", "marginal")),
indices %>%
filter(grepl("conditional", type)) %>%
group_by(year) %>%
reframe(est = sum(est), lwr = sum(lwr), upr = sum(upr)) %>%
mutate(type = "summed",
method = "recalculated"),
indices %>%
filter(grepl("conditional", type)) %>%
group_by(year) %>%
reframe(est = mean(est), lwr = mean(lwr), upr = mean(upr)) %>%
mutate(type = "marginal",
method = "recalculated")
) %>%
ggplot(
aes(x = year, y = est, ymin = lwr, ymax = upr, color = method,
fill = method, linetype = method)
) +
geom_ribbon(color = NA, alpha = 0.3) +
geom_path(size = 1, alpha = 0.5) +
facet_wrap(~type, scales = "free_y") +
labs(color = "Method", fill = "Method") +
theme_classic()
cowplot::plot_grid(p1, p2, ncol = 1)
#> Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
#> -Inf
#> Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
#> -Inf Created on 2024-03-15 with reprex v2.1.0 Bias correction Unfortunately bias correction still leads to a crash in my high-resolution models on a server with 503G ram and 71 cores. As far as I understand, splitting by year is not an option, because I have gear type, season and depth as covariates in the model. These would get estimated differently for different years. Hence I'll probably need to manage without bias correction for now, but will need to get the indices bias corrected eventually. I examined the differences among the methods below. Index calculation methods Based on my experimentation with my high-resolution model, the method of choice to calculate the index leads to approximately similar central estimates (put aside the index type examined above), but the confidence intervals may differ. Based on experimentation with the example dataset supplied in sdmTMB, this is not always the case (see the example below). This shows that, as you say, one should do the indices with bias correction if possible. Currently in my case that is not possible, but I will continue experimenting. Year as fixed or spatiotemporal effect? My high-resolution model does not converge when using year as both fixed and spatiotemporal effect. This seems to be the case for the example dataset too. So, one has to choose. For my dataset, I would be inclined to use year as spatiotemporal effect, because there are large differences in, both, sampling coverage and distribution among years. I need to experiment using both options. So far, I have only used the spatiotemporal option. Maybe bias correction would work if using year as fixed effect, although I am afraid that would simplify the model too much. # Index calculation method example ####
library(sdmTMB)
library(tidyverse)
library(cowplot)
#>
#> Attaching package: 'cowplot'
#> The following object is masked from 'package:lubridate':
#>
#> stamp
## Set seed to make get_index_sims results consequent
set.seed(1)
## Model with year as spatiotemporal random field
m <- sdmTMB(
density ~ 0 + poly(log(depth), 2),
data = pcod_2011,
time = "year",
mesh = pcod_mesh_2011,
family = delta_gamma()
)
nd <- qcs_grid %>% replicate_df("year", unique(pcod_2011$year))
p <- predict(m, newdata = nd, return_tmb_object = TRUE, type = "response")
## Model with year as a fixed effect
m2 <- sdmTMB(
density ~ 0 + factor(year) + poly(log(depth), 2),
data = pcod_2011,
time = "year",
spatiotemporal = "off",
mesh = pcod_mesh_2011,
family = delta_gamma()
)
p2 <- predict(m2, newdata = nd, return_tmb_object = TRUE, type = "response")
## Compile the indices ####
indices <-
bind_rows(
bind_rows(
get_index(p, bias_correct = FALSE) %>%
mutate(type = "get_index(\nbias_correct = F)"),
get_index(p, bias_correct = TRUE) %>%
mutate(type = "get_index(\nbias_correct = T)"),
p$data %>%
group_by(year, depth, X, Y) %>%
reframe(est = mean(est)) %>%
group_by(year) %>%
reframe(est = sum(est)) %>%
mutate(lwr = NA, upr = NA, log_est = log(est), se = NA) %>%
mutate(type = "sum predict"),
get_index_sims(obj = predict(m, newdata = nd, nsim = 100)) %>%
mutate(type = "get_index_sims")
) %>%
mutate(year_effect = "spatiotemporal"),
bind_rows(
get_index(p2, bias_correct = FALSE) %>%
mutate(type = "get_index(\nbias_correct = F)"),
get_index(p2, bias_correct = TRUE) %>%
mutate(type = "get_index(\nbias_correct = T)"),
p2$data %>%
group_by(year, depth, X, Y) %>%
reframe(est = mean(est)) %>%
group_by(year) %>%
reframe(est = sum(est)) %>%
mutate(lwr = NA, upr = NA, log_est = log(est), se = NA) %>%
mutate(type = "sum predict"),
get_index_sims(obj = predict(m2, newdata = nd, nsim = 100)) %>%
mutate(type = "get_index_sims")
) %>%
mutate(year_effect = "fixed")
)
#> Bias correction is turned off.
#> It is recommended to turn this on for final inference.
#> We generally recommend using `get_index(..., bias_correct = TRUE)`
#> rather than `get_index_sims()`.
#> Bias correction is turned off.
#> It is recommended to turn this on for final inference.
#> We generally recommend using `get_index(..., bias_correct = TRUE)`
#> rather than `get_index_sims()`.
## Line plot ####
pl <- ggplot(
data = indices,
aes(x = year, y = est, ymin = lwr, ymax = upr, color = type, fill = type)
) +
geom_ribbon(color = NA, alpha = 0.3) +
geom_path(size = 1) +
geom_path(
data = indices %>%
mutate(type2 = type, year_effect2 = year_effect) %>%
dplyr::select(-type, -year_effect),
aes(x = year, y = est, color = type2, linetype = year_effect2),
inherit.aes = FALSE) +
scale_linetype_manual(values = c(2,3)) +
labs(color = "Index method", linetype = "Year effect", fill = "Index method") +
facet_grid(year_effect~type) +
theme_classic()
#> Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
## Point plot ####
pp1 <- indices %>%
group_by(type, year_effect) %>%
reframe(est_sd = sd(est), est = mean(est), lwr = mean(lwr), upr = mean(upr),
se = mean(se)) %>%
mutate(pr_est = 100*est/mean(est), pr_upr = 100*upr/mean(est),
pr_lwr = 100*lwr/mean(est)) %>%
ggplot(aes(type, pr_est, ymax = pr_upr, ymin = pr_lwr, color = year_effect)) +
geom_hline(yintercept = 100, color = "grey") +
geom_errorbar(width = 0.2, position = position_dodge(width = 0.3)) +
geom_point(position = position_dodge(width = 0.3)) +
labs(y = "Percentage of mean estimate", color = "Year effect", x = "Index method") +
theme_classic()
pp2 <- indices %>%
group_by(type, year_effect) %>%
reframe(est_sd = sd(est), est = mean(est), lwr = mean(lwr), upr = mean(upr),
se = mean(se)) %>%
mutate(pr_est = 100*est/mean(est), pr_upr = 100*upr/mean(est),
pr_lwr = 100*lwr/mean(est)) %>%
ggplot(aes(year_effect, pr_est, color = type)) +
geom_hline(yintercept = 100, color = "grey") +
geom_point(position = position_dodge(width = 0.1)) +
expand_limits(y = 106) +
labs(y = "Percentage of mean estimate", color = "Index method", x = "Year effect") +
theme_classic()
cowplot::plot_grid(pl, pp1, pp2, ncol = 1, labels = "auto")
#> Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
#> -Inf
#> Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
#> -Inf Created on 2024-03-15 with reprex v2.1.0 |
Beta Was this translation helpful? Give feedback.
-
A few quick answers:
That could be. That's not a general feature though. In fact, having both is probably the most common index standardization configuration. The > library(sdmTMB)
> m2 <- sdmTMB(
+ density ~ 0 + factor(year),
+ data = pcod,
+ time = "year",
+ spatiotemporal = "iid",
+ mesh = make_mesh(pcod, c("X", "Y"), cutoff = 10),
+ family = delta_gamma()
+ )
> sanity(m2)
✔ Non-linear minimizer suggests successful convergence
✔ Hessian matrix is positive definite
✔ No extreme or very small eigenvalues detected
✔ No gradients with respect to fixed effects are >= 0.001
✔ No fixed-effect standard errors are NA
✔ No standard errors look unreasonably large
✔ No sigma parameters are < 0.01
✔ No sigma parameters are > 100
✔ Range parameters don't look unreasonably large
You would fit with all your data, but in my example above you would predict and integrate over area with I haven't carefully worked through all the code above, but I don't think you would want to add the confidence intervals. You would have to add the variances/covariances from the covariance matrix to combine the uncertainty and get back to the standard error (or add them into the template and call ADREPORT on them or sample from them and add them or maybe even use tmbprofile to combine them, but that might be slow here). |
Beta Was this translation helpful? Give feedback.
Gear type would usually be considered a catchability covariate and so you would predict for a single reference gear type. As long as gear type is not in the model as a spatially varying coefficient or interacting with some variable that is changing through time, the indexes that you get out of the various gear types should be the same, just shifted up or down for catchability. If the index was being used in a stock assessment, then typically you would either consider it a relative index and estimate catchability (in which case which gear type you pick shouldn't matter unless you are putting a prior on it) or you would treat it as an absolute index, in which case I assume you would want it…