Draft pool_means() and pool_comparisons() #378

Merged
strengejacke merged 26 commits into main from pool_functions on Feb 10, 2025

@strengejacke (Member) commented Feb 5, 2025

Fixes #151

@DominiqueMakowski WDYT? This adds pool functions for models fitted to imputed data, i.e. when dealing with missing data. Suggestions for other names are welcome, or even for a single function that checks the class of its input and automatically pools either means or contrasts.

estimate_means

library(modelbased)
data("nhanes2", package = "mice")

# regular
m <- lm(bmi ~ age + hyp + chl, data = nhanes2)
estimate_means(m, "age")
#> Estimated Marginal Means
#> 
#> age   |  Mean |   SE |         95% CI |  t(8)
#> ---------------------------------------------
#> 20-39 | 31.44 | 1.85 | [27.17, 35.71] | 16.98
#> 40-59 | 24.82 | 1.50 | [21.36, 28.28] | 16.55
#> 60-99 | 20.26 | 2.62 | [14.21, 26.31] |  7.72
#> 
#> Variable predicted: bmi
#> Predictors modulated: age
#> Predictors averaged: hyp, chl (1.9e+02)

# imputed and pooled
imp <- mice::mice(nhanes2, printFlag = FALSE)
predictions <- lapply(1:5, function(i) {
  m <- lm(bmi ~ age + hyp + chl, data = mice::complete(imp, action = i))
  estimate_means(m, "age")
})
pool_means(predictions)
#> Estimated Marginal Means
#> 
#> age   |  Mean |   SE |         95% CI | t(20)
#> ---------------------------------------------
#> 20-39 | 30.69 | 1.30 | [14.12, 47.26] | 23.54
#> 40-59 | 24.74 | 1.42 | [ 6.65, 42.82] | 17.31
#> 60-99 | 23.35 | 1.68 | [ 1.97, 44.74] | 14.98
#> 
#> Variable predicted: bmi
#> Predictors modulated: age
#> Predictors averaged: hyp, chl (1.9e+02)
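
For context, results from multiply imputed datasets are conventionally combined with Rubin's rules: average the estimates, then add the within-imputation and between-imputation variance. The base R sketch below applies those rules to a single estimate. It is an illustration under stated assumptions only: `pool_rubin()` and its arguments are hypothetical names, the input numbers are made up, and nothing here is claimed to be what pool_means() does internally.

```r
# Pool one estimate across m imputed datasets with Rubin's rules.
# `pool_rubin()` is a hypothetical helper, not part of modelbased or mice.
pool_rubin <- function(estimates, std_errors, ci = 0.95) {
  m <- length(estimates)
  stopifnot(m > 1, length(std_errors) == m)

  q_bar   <- mean(estimates)          # pooled point estimate
  within  <- mean(std_errors^2)       # average within-imputation variance
  between <- stats::var(estimates)    # between-imputation variance
  total   <- within + (1 + 1 / m) * between

  # Rubin's degrees of freedom; infinite if the estimates do not vary
  r  <- (1 + 1 / m) * between / within
  df <- if (between > 0) (m - 1) * (1 + 1 / r)^2 else Inf

  se   <- sqrt(total)
  crit <- stats::qt(1 - (1 - ci) / 2, df = df)
  data.frame(
    estimate = q_bar,
    se       = se,
    df       = df,
    ci_low   = q_bar - crit * se,
    ci_high  = q_bar + crit * se
  )
}

# Made-up estimates and standard errors from five imputations
pooled <- pool_rubin(
  estimates  = c(30.1, 31.0, 30.5, 30.9, 31.0),
  std_errors = c(1.20, 1.30, 1.25, 1.30, 1.20)
)
print(pooled)
```

The pooled standard error is never smaller than the square root of the average within-imputation variance, because the between-imputation term only adds uncertainty.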

estimate_contrasts

library(modelbased)
data("nhanes2", package = "mice")

# regular
m <- lm(bmi ~ age + hyp + chl, data = nhanes2)
estimate_contrasts(m, "age")
#> Marginal Contrasts Analysis
#> 
#> Level1 | Level2 | Difference |   SE |          95% CI |  t(8) |     p
#> ---------------------------------------------------------------------
#> 40-59  | 20-39  |      -6.62 | 2.30 | [-11.93, -1.31] | -2.88 | 0.021
#> 60-99  | 20-39  |     -11.18 | 3.36 | [-18.93, -3.43] | -3.33 | 0.010
#> 60-99  | 40-59  |      -4.56 | 2.89 | [-11.22,  2.10] | -1.58 | 0.153
#> 
#> Variable predicted: bmi
#> Predictors contrasted: age
#> Predictors averaged: hyp, chl (1.9e+02)
#> p-values are uncorrected.

# imputed and pooled
imp <- mice::mice(nhanes2, printFlag = FALSE)
predictions <- lapply(1:5, function(i) {
  m <- lm(bmi ~ age + hyp + chl, data = mice::complete(imp, action = i))
  estimate_contrasts(m, "age")
})
pool_contrasts(predictions)
#> Marginal Contrasts Analysis
#> 
#> Level1 | Level2 | Difference |   SE |          95% CI | t(20) |      p
#> ----------------------------------------------------------------------
#> 40-59  | 20-39  |      -5.22 | 2.02 | [ -9.43, -1.01] | -2.00 |  0.059
#> 60-99  | 20-39  |      -7.49 | 2.49 | [-12.67, -2.30] | -4.33 | < .001
#> 60-99  | 40-59  |      -2.27 | 3.00 | [ -8.51,  3.98] | -2.35 |  0.029
#> 
#> Variable predicted: bmi
#> Predictors contrasted: age
#> Predictors averaged: hyp, chl (2e+02)
#> p-values are uncorrected.

Created on 2025-02-05 with reprex v2.1.1
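
The single-function idea raised at the top of this comment could look roughly like the sketch below: one entry point inspects the class of the list elements and hands off to the matching pooling routine. Everything here is an assumption for illustration: `pool_estimates()` and the two stub workers are hypothetical names, and the class names "estimate_means" and "estimate_contrasts" are assumed rather than taken from this PR.

```r
# Hypothetical single entry point that chooses the pooling routine
# from the class of the first list element.
pool_estimates <- function(x, ...) {
  stopifnot(is.list(x), length(x) > 1)
  first <- x[[1]]
  if (inherits(first, "estimate_contrasts")) {
    pool_contrasts_stub(x, ...)
  } else if (inherits(first, "estimate_means")) {
    pool_means_stub(x, ...)
  } else {
    stop("Unsupported class: ", paste(class(first), collapse = "/"))
  }
}

# Stand-in workers: they only label the result so the dispatch is visible.
pool_means_stub <- function(x, ...) "pooled means"
pool_contrasts_stub <- function(x, ...) "pooled contrasts"

# Toy inputs that carry the assumed class names
means_list <- replicate(
  3, structure(list(), class = "estimate_means"), simplify = FALSE
)
contrasts_list <- replicate(
  3, structure(list(), class = "estimate_contrasts"), simplify = FALSE
)

print(pool_estimates(means_list))
print(pool_estimates(contrasts_list))
```

Checking the class with `inherits()` keeps the sketch short; a formal S3 generic would be an equally reasonable design.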

@strengejacke marked this pull request as ready for review February 5, 2025 19:14

@strengejacke (Member, Author)

Now also works for the prediction functions:

library(modelbased)
data("nhanes2", package = "mice")

# regular
m <- lm(bmi ~ age + hyp + chl, data = nhanes2)
estimate_expectation(m, by = "age")
#> Model-based Predictions
#> 
#> age   | Predicted |   SE |         95% CI
#> -----------------------------------------
#> 20-39 |     30.26 | 1.47 | [26.87, 33.65]
#> 40-59 |     23.64 | 1.75 | [19.59, 27.69]
#> 60-99 |     19.08 | 2.80 | [12.63, 25.53]
#> 
#> Variable predicted: bmi
#> Predictors modulated: age
#> Predictors controlled: hyp (no), chl (1.9e+02)

# imputed and pooled
imp <- mice::mice(nhanes2, printFlag = FALSE)
predictions <- lapply(1:5, function(i) {
  m <- lm(bmi ~ age + hyp + chl, data = mice::complete(imp, action = i))
  estimate_expectation(m, by = "age")
})
pool_means(predictions)
#> Model-based Predictions
#> 
#> age   | Predicted |   SE |         95% CI
#> -----------------------------------------
#> 20-39 |     29.01 | 1.65 | [25.78, 32.24]
#> 40-59 |     23.34 | 1.54 | [20.31, 26.36]
#> 60-99 |     23.13 | 2.11 | [18.99, 27.26]
#> 
#> Variable predicted: bmi
#> Predictors modulated: age
#> Predictors controlled: hyp (no), chl (1.9e+02)

@DominiqueMakowski WDYT? I think this is quite useful: imputing missing data is increasingly common, as is using marginal means and adjusted predictions, and being able to combine the two is really powerful.

@strengejacke (Member, Author)

I'm not sure pool_contrasts() and, especially, pool_means() are good names. Is there a better name for pool_means()?

@strengejacke (Member, Author)

Renamed pool_means() to pool_predictions().

@strengejacke merged commit f5396b2 into main Feb 10, 2025
20 of 25 checks passed
@strengejacke deleted the pool_functions branch February 10, 2025 07:49
Development

Successfully merging this pull request may close these issues.

visualisation_recipe with transformed responses