Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow draw() to plot on the response scale #79

Open
2 of 4 tasks
gavinsimpson opened this issue Jul 1, 2020 · 7 comments
Open
2 of 4 tasks

Allow draw() to plot on the response scale #79

gavinsimpson opened this issue Jul 1, 2020 · 7 comments
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@gavinsimpson
Copy link
Owner

gavinsimpson commented Jul 1, 2020

Implement the equivalent of the trans and shift arguments that mgcv:::plot.gam() has so that users can pass in a function (e.g. the inverse link function) and a constant (e.g. the intercept), to get plots on the response scale.

Should these be arguments to the evaluate_smooth() methods or something draw() does internally?

Either way, the confidence bands aren't going to include the uncertainty in the constant, so shift is purely to move the origin (y-axis) around. This is done in draw() and the various methods.

Want to have a simpler way for the user to express that they want the plots on the response scale. So need a new argument (response_scale?) that if set to TRUE, automatically set constant and fun to the correct values/functions such that we get values and intervals on the response scale. This will override anything passed to constant and fun obviously.

Should we include the overall uncertainty as we're bringing in the model constant here?

What about factor by smooths? These should really have the constant plus the parametric term for their level added as a constant.

Checklist:

@gavinsimpson gavinsimpson added the enhancement New feature or request label Jul 1, 2020
@gavinsimpson gavinsimpson self-assigned this Jul 1, 2020
gavinsimpson added a commit that referenced this issue Sep 21, 2020
gavinsimpson added a commit that referenced this issue Sep 21, 2020
  shortcut of `response = TRUE` or similar to make this easier to use.
@gavinsimpson gavinsimpson added this to the 0.5 milestone Sep 21, 2020
@gavinsimpson gavinsimpson modified the milestones: 0.5, 0.6 Jan 23, 2021
@gavinsimpson gavinsimpson modified the milestones: 0.6, 0.7 Jun 25, 2021
@gavinsimpson gavinsimpson modified the milestones: 0.7, 0.8 Feb 19, 2022
@barbmuhling
Copy link

Hello! Are there any plans to incorporate this option into smooth_estimates() any time soon? That would be super useful...

@gavinsimpson
Copy link
Owner Author

@barbmuhling the draw() method for smooth_estimates() has this (at least in the development version here on GH). If you mean to modify the values returned by smooth_estimates(), I think I won't be adding support for that as you can already do what you want using transform_fun() and a bit of dplyr fu:

library("mgcv")
library("gratia")
library("dplyr")

dat <- data_sim("eg1", n = 400, dist = "normal", scale = 2, seed = 2)
m1 <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3), data = dat, method = "REML")

my_const <- coef(m1)[1]

smooth_estimates(m1) |>
  add_confint() |>
  mutate(across(tidyselect::all_of(c("est", "lower_ci", "upper_ci")),
                .fns = \(x) x + my_const)) |>
  transform_fun(fun = \(x) x^2)

to apply any function you want to the estimate and credible interval. (This is what is used internally by the draw() methods.)

I could add something to transform_fun() to allow you to pass the constant in as well, to avoid the mutate() step?

@gavinsimpson gavinsimpson modified the milestones: 0.8, 0.9 Jan 26, 2023
@barbmuhling
Copy link

Ah, this will work perfectly, thanks so much!

@tkiff
Copy link

tkiff commented Dec 11, 2023

Hello Gavin, I came across this thread looking for help plotting by factors on top of each other. Apologies for the clunky code, but would this be the appropriate method?

Please let me know if this should go on stack overflow instread!

library("mgcv")
library("gratia")
library("dplyr")

dat <- data_sim("eg4", seed = 42)
m1 <- gam(y ~  fac + s(x2, by = fac), data = dat, method = "REML")

my_const <- coef(m1)[1]
my_constFac2 <- coef(m1)[2]
my_constFac3 <- coef(m1)[3]

parametric_effects(m1)

s2fac1 = smooth_estimates(m1) |>
  filter(smooth == "s(x2):fac1") |>
  add_confint() |>
  mutate(across(tidyselect::all_of(c("est", "lower_ci", "upper_ci")),
                .fns = \(x) x + my_const))

s2fac2 = smooth_estimates(m1) |>
  filter(smooth == "s(x2):fac2") |>
  add_confint() |>
  mutate(across(tidyselect::all_of(c("est", "lower_ci", "upper_ci")),
                .fns = \(x) x + my_const + my_constFac2))

s2fac3 =smooth_estimates(m1) |>
  filter(smooth == "s(x2):fac3") |>
  add_confint() |>
  mutate(across(tidyselect::all_of(c("est", "lower_ci", "upper_ci")),
                .fns = \(x) x + my_const + my_constFac3))

ggplot() +
  # Fac 1
  geom_line(data = s2fac1, aes(x = x2, est), color = "red") +
  geom_ribbon(data = s2fac1, aes(ymin = lower_ci, ymax = upper_ci, x = x2),
              alpha = 0.3, fill = "red") +
  # Fac 2
  geom_line(data = s2fac2, aes(x = x2, est), color = "black") +
  geom_ribbon(data = s2fac2, aes(ymin = lower_ci, ymax = upper_ci, x = x2),
              alpha = 0.3, fill = "black") +
  # Fac 3
  geom_line(data = s2fac3, aes(x = x2, est), color = "blue") +
  geom_ribbon(data = s2fac3, aes(ymin = lower_ci, ymax = upper_ci, x = x2),
              alpha = 0.3, fill = "blue")

@gavinsimpson
Copy link
Owner Author

gavinsimpson commented Dec 12, 2023

@tkiff That works, but only for simple examples like this with a single factor.

I think you're better off using data_slice() to create the data for covariates that you want to evaluate the model at, including the values of other covariates in the model, and then use fitted_values() to get the data you need to plot:

library("mgcv")
library("gratia")
library("dplyr")
library("ggplot2")

dat <- data_sim("eg4", seed = 42)
m1 <- gam(y ~  fac + s(x2, by = fac) + s(x0) + s(x1), data = dat, method = "REML")

ds <- data_slice(m1, x2 = evenly(x2), fac = evenly(fac))
fv <- fitted_values(m1, data = ds)

fv |>
  ggplot(aes(x = x2, y = .fitted, group = fac)) +
  geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci, fill = fac), alpha = 0.2) +
  geom_line(aes(colour = fac))

gratia-issue-79-plt1

Notice that data_slice() provides values for any covariates not specified but which are in the model. This forces you to be explicit about what you are conditioning on.

In this case, you can exclude the effects of the other smooths when generating fitted values

fv2 <- fitted_values(m1, data = ds, exclude = c("s(x0)", "s(x1)"))

fv2 |>
  ggplot(aes(x = x2, y = .fitted, group = fac)) +
  geom_ribbon(aes(ymin = .lower_ci, ymax = .upper_ci, fill = fac), alpha = 0.2) +
  geom_line(aes(colour = fac))

gratia-issue-79-plt2

There's no difference here because of how data_slice() finds typical values for the unmentioned coavriates.

@gavinsimpson
Copy link
Owner Author

@tkiff

Please let me know if this should go on stack overflow instread!

If you have questions about gratia you could ask them on SO but better might be to use the Discussion here on GitHub: https://github.com/gavinsimpson/gratia/discussions

@gavinsimpson gavinsimpson modified the milestones: 0.9, 0.10 Mar 11, 2024
@gavinsimpson
Copy link
Owner Author

This is essentially complete except for the two stretch goals

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants