calibratedDML implements calibrated doubly robust estimators for causal
inference with categorical treatments. The package targets:
- mean potential outcomes, E[Y(a)]
- treatment-versus-control contrasts, E[Y(a)] - E[Y(control)]
- workflows that either fit nuisance models internally or start from cross-fitted nuisance estimates
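To make the two estimands concrete, the following is a minimal simulation sketch. The data-generating process, the `EFFECT` table, and the `potential_outcome` helper are all hypothetical and exist only for illustration; none of them come from the package. Because every potential outcome is simulated directly, E[Y(a)] and the contrasts can be read off as plain averages, which is exactly what is not possible with observed data.

```python
import random

random.seed(0)

# Hypothetical setup: three treatment levels, with 0 as the control level.
LEVELS = [0, 1, 2]
EFFECT = {0: 0.0, 1: 1.0, 2: 2.5}  # assumed shift in the outcome per arm


def potential_outcome(w, a, noise):
    """Y(a) for a unit with covariate w (illustrative model only)."""
    return 0.5 * w + EFFECT[a] + noise


n = 20000
sums = {a: 0.0 for a in LEVELS}
for _ in range(n):
    w = random.gauss(0.0, 1.0)
    noise = random.gauss(0.0, 1.0)
    # In a simulation every arm's potential outcome is visible for every unit.
    for a in LEVELS:
        sums[a] += potential_outcome(w, a, noise)

# Mean potential outcomes E[Y(a)] and contrasts E[Y(a)] - E[Y(control)].
mean_po = {a: sums[a] / n for a in LEVELS}
contrasts = {a: mean_po[a] - mean_po[0] for a in LEVELS if a != 0}
```

In real data only one potential outcome per unit is observed, which is why the estimands have to be recovered from nuisance models rather than from direct averages.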
Python and R interfaces are both included in this repository.
- Website: larsvanderlaan.github.io/calibratedDML
- Main paper: Doubly robust inference via calibration
- Companion package: ppi_aipw
The repository contains both the Python and R packages, plus tutorials, docs, and paper-reproduction material.
- src/calibrateddml/: Python package source
- src/calibratedDML.py: source-tree compatibility shim for legacy code
- R/, man/, vignettes/: R package source, reference docs, and R vignettes
- Python/tutorials/: Python tutorial notebooks and script mirrors
- docs/: built website pages for the project site
- examples/: small runnable examples
- tests/: Python and R package tests
- validation/: focused coverage and validation scripts for inference behavior
- paper_experiment_scripts/, paper_experiment_results/, paper_data/: paper reproduction code, outputs, and datasets
For Python package development, use src/ as the source of truth. The
top-level Python/ directory is for tutorial material and legacy compatibility
helpers, not the main packaged implementation.
Install from PyPI:
pip install calibratedDML
Optional extras:
pip install 'calibratedDML[gam]'
pip install 'calibratedDML[boosted]'
pip install 'calibratedDML[dev]'
The package name on PyPI is calibratedDML. For imports, use:
import calibrateddml
Install from GitHub:
remotes::install_github("Larsvanderlaan/calibratedDML")
The R package can work with built-in learners and can also integrate with
sl3 or SuperLearner when those packages are available.
from calibrateddml import CalibratedDML

# X: covariates, A: treatment labels, y: outcomes (supplied by the user)
fit = CalibratedDML(
    control_level=0,
    outcome_model="lasso",
    treatment_model="lasso",
    calibration_method="auto",
    random_state=123,
)
fit.fit(X, A, y)
fit.summary()
fit.confint()

Main Python entry points:
- CalibratedDML.fit(X, A, y, sample_weight=None)
- CalibratedDML.fit_from_nuisances(A, y, mu_mat, pi_mat, sample_weight=None, treatment_levels=None)
Common result accessors:
- summary()
- to_frame()
- confint()
Built-in Python model names:
- mean
- linear
- lasso
- random_forest
- gam
- boosted_trees
- auto
Core installs support mean, linear, lasso, and random_forest. The
gam option requires pygam, and boosted_trees requires lightgbm.
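One way to find out which model names are usable in a given environment is to probe for the optional modules before fitting. The sketch below uses only the standard library. The `available_models` helper and the two constants are hypothetical and not part of the package; the mapping from model name to module is taken from the paragraph above. The `auto` option is left out because the text does not say what it depends on.

```python
from importlib.util import find_spec

# Model names that a core install supports, per the text above.
CORE_MODELS = ["mean", "linear", "lasso", "random_forest"]

# Optional model names and the extra module each one requires.
OPTIONAL_BACKENDS = {"gam": "pygam", "boosted_trees": "lightgbm"}


def available_models():
    """Return the model names whose dependencies can be imported here."""
    usable = list(CORE_MODELS)
    for name, module in OPTIONAL_BACKENDS.items():
        # find_spec returns None when a top-level module is not installed.
        if find_spec(module) is not None:
            usable.append(name)
    return usable


models = available_models()
```

A check like this lets a script fall back to a core learner instead of failing when an optional extra is missing.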
library(calibratedDML)

fit <- calibrated_dml(
  data = df,
  outcome = "Y",
  treatment = "A",
  covariates = c("W1", "W2", "W3"),
  control_level = 0,
  outcome_model = "lasso",
  treatment_model = "lasso",
  calibration_method = "auto"
)
summary(fit)
confint(fit)

Main R entry points:
- calibrated_dml(...)
- calibrated_dml_from_nuisances(...)
The R interface supports the same standard estimator class as Python, including
multi-arm treatment, direct nuisance input, and wald, bootstrap, and
jackknife inference.
Both interfaces support direct nuisance input.
- mu_mat should contain one column per treatment level for E[Y | A = a, W]
- pi_mat should contain one column per treatment level for P(A = a | W)
- nuisance estimates should usually be cross-fitted
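The layout of the two matrices can be illustrated with the textbook augmented inverse-probability-weighted (AIPW) formula. This is a hand-written sketch of that generic formula, without any calibration step, and is not the package's internal code. The `check_nuisances` and `aipw_means` helpers and the toy numbers are hypothetical.

```python
def check_nuisances(mu_mat, pi_mat, treatment_levels):
    """Basic shape checks: one column per level, propensity rows sum to one."""
    k = len(treatment_levels)
    for mu_row, pi_row in zip(mu_mat, pi_mat):
        if len(mu_row) != k or len(pi_row) != k:
            raise ValueError("expected one column per treatment level")
        if abs(sum(pi_row) - 1.0) > 1e-8:
            raise ValueError("each row of pi_mat should sum to 1")


def aipw_means(A, y, mu_mat, pi_mat, treatment_levels):
    """AIPW estimate of E[Y(a)] for each level.

    mu_mat[i][j] is the prediction of E[Y | A = level_j, W_i];
    pi_mat[i][j] is the prediction of P(A = level_j | W_i).
    """
    check_nuisances(mu_mat, pi_mat, treatment_levels)
    n = len(y)
    estimates = {}
    for j, level in enumerate(treatment_levels):
        total = 0.0
        for i in range(n):
            indicator = 1.0 if A[i] == level else 0.0
            # Outcome-model prediction plus an inverse-weighted residual.
            residual = y[i] - mu_mat[i][j]
            total += mu_mat[i][j] + indicator / pi_mat[i][j] * residual
        estimates[level] = total / n
    return estimates


# Toy inputs: four units, two treatment levels (0 = control).
A = [0, 1, 0, 1]
y = [1.2, 2.8, 2.0, 4.4]
mu_mat = [[1.0, 3.0], [1.0, 3.0], [2.0, 4.0], [2.0, 4.0]]
pi_mat = [[0.5, 0.5]] * 4

means = aipw_means(A, y, mu_mat, pi_mat, treatment_levels=[0, 1])
contrast = means[1] - means[0]
```

Column order matters: column j of both matrices has to refer to the same treatment level.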
Calibration sits between nuisance estimation and debiasing.
Standard calibrated DML supports:
- calibration_method = "auto"
- calibration_method = "isotonic"
- calibration_method = "smooth_isotonic"
- calibration_method = "none"
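As a rough illustration of what isotonic calibration does to a nuisance estimate, the sketch below fits a nondecreasing step function with the pool-adjacent-violators algorithm. It is a generic isotonic regression written for this example, assuming distinct scores; the `isotonic_fit` helper and the toy data are hypothetical, and how the package implements its "isotonic" and "smooth_isotonic" options is not shown here.

```python
def isotonic_fit(scores, targets):
    """Nondecreasing least-squares fit of targets on scores (PAVA)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    blocks = []  # each block holds [sum of targets, number of points]
    for i in order:
        blocks.append([targets[i], 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Every point in a block receives that block's mean.
    fitted = [0.0] * len(scores)
    pos = 0
    for total, count in blocks:
        for i in order[pos:pos + count]:
            fitted[i] = total / count
        pos += count
    return fitted


# Raw propensity predictions and observed treatment indicators (toy data).
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
treated = [0, 1, 0, 0, 1, 1]

calibrated = isotonic_fit(raw_scores, treated)
```

Here the three middle scores are pooled to the shared value 1/3, the observed treatment rate in that block, while the ordering of the original predictions is preserved.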
Inference options:
- inference = "jackknife" with jackknife_folds = 100 is the default for standard calibrated DML
- inference = "wald"
- inference = "bootstrap"
Practical guidance:
- Use the default jackknife intervals for standard calibrated DML.
- Use Wald when both nuisance estimators are consistent, even if one converges arbitrarily slowly.
- Use bootstrap when you want another valid resampling interval and can afford the extra computation.
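The difference between Wald and fold-based jackknife intervals can be sketched for the simplest case, a mean of per-unit scores. This is a deliberately simplified illustration: it only recomputes the mean with one fold left out, whereas a full procedure would presumably repeat more of the estimation pipeline for each fold. That is an assumption, not something the text states. The helper names and the simulated scores are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance


def wald_interval(scores, level=0.95):
    """Normal-approximation interval from the sample standard error."""
    est = mean(scores)
    se = math.sqrt(variance(scores) / len(scores))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return est - z * se, est + z * se


def jackknife_interval(scores, folds=100, level=0.95):
    """Delete-a-fold jackknife interval for the mean of the scores."""
    n = len(scores)
    est = mean(scores)
    total = sum(scores)
    leave_out = []
    for k in range(folds):
        held = scores[k::folds]  # fold k: every folds-th unit starting at k
        leave_out.append((total - sum(held)) / (n - len(held)))
    center = mean(leave_out)
    # Grouped jackknife variance: (G - 1) / G times the sum of squared deviations.
    var = (folds - 1) / folds * sum((t - center) ** 2 for t in leave_out)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = math.sqrt(var)
    return est - z * se, est + z * se


random.seed(1)
scores = [random.gauss(1.0, 2.0) for _ in range(2000)]

wald = wald_interval(scores)
jack = jackknife_interval(scores, folds=100)
jack_full = jackknife_interval(scores, folds=len(scores))
```

For a plain mean, the leave-one-out jackknife (`folds=len(scores)`) reproduces the Wald interval exactly, and the 100-fold version approximates it at lower cost.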
The repository also includes adaptive binary-treatment estimators through:
- Python: AdaptiveCalibratedDML
- R: adaptive_calibrated_dml()
Adaptive methods should be treated as experimental. They target the ATE through a learned and calibrated treatment-effect summary and have a narrower, more delicate inferential scope than standard calibrated DML. Adaptive estimation always uses isotonic calibration internally.
Documented adaptive modes:
- mode = "calibrated_rlearner"
- mode = "plugin"
For most users, CalibratedDML and calibrated_dml() remain the default
entry points.
Current release posture:
- Python package version: 0.1.0
- R package version: 0.1.0
- standard calibrated DML is the primary supported workflow
- adaptive binary-treatment methods are experimental