Vekteur/multi-output-conformal-regression

This repository accompanies the paper "A Unified Comparative Study with Generalized Conformity Scores for Multi-Output Conformal Regression".

It includes:

  • An implementation of several conformal methods for multi-output conformal regression.
  • Several base predictors (Multivariate Quantile Function Forecaster, Distributional Random Forest, Gaussian Mixture parametrized by a hypernetwork).
  • Metrics for marginal coverage, region size, and conditional coverage.
  • A large empirical study based on datasets gathered from the literature, all with multiple outputs.
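To make the items above concrete, here is a minimal, self-contained sketch of split conformal prediction for two outputs, with the two simplest metrics (marginal coverage and log region size). It is not the repository's implementation: the Euclidean-norm conformity score, the synthetic data, and the names `predict`, `conformal_quantile`, and `log_ball_volume` are assumptions chosen for illustration only.

```python
from math import lgamma, log, pi

import numpy as np


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic to use
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return float(np.sort(scores)[k - 1])


def log_ball_volume(radius, d):
    """Log-volume of a d-dimensional Euclidean ball with the given radius."""
    return (d / 2) * log(pi) - lgamma(d / 2 + 1) + d * log(radius)


rng = np.random.default_rng(0)
d = 2  # number of outputs


def sample(n):
    """Synthetic data: the outputs are the first d inputs plus Gaussian noise."""
    x = rng.uniform(-1, 1, size=(n, 3))
    y = x[:, :d] + 0.1 * rng.standard_normal((n, d))
    return x, y


def predict(x):
    """Hypothetical stand-in for a trained base predictor (point prediction)."""
    return x[:, :d]


x_cal, y_cal = sample(1000)
x_test, y_test = sample(2000)

alpha = 0.1  # target miscoverage level
# Conformity score: Euclidean distance between the output and the prediction.
cal_scores = np.linalg.norm(y_cal - predict(x_cal), axis=1)
q_hat = conformal_quantile(cal_scores, alpha)

# The prediction region for x is the ball of radius q_hat centred at predict(x).
test_scores = np.linalg.norm(y_test - predict(x_test), axis=1)
covered = test_scores <= q_hat  # coverage indicator per test point
marginal_coverage = covered.mean()
log_region_size = log_ball_volume(q_hat, d)

print(f"marginal coverage: {marginal_coverage:.3f}")
print(f"log region size:   {log_region_size:.3f}")
```

With this score every region is a ball of the same radius, so the region size does not adapt to the input. More flexible conformity scores can be used in place of the Euclidean distance while keeping the same calibration step.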

Datasets

All datasets except MEPS are directly available in this repository. See step 5 of the installation for downloading MEPS.

Refer to these repositories for more information on the datasets used in this study:

Example usage

The following example trains an MQF2 base predictor on one dataset, conformalizes it, and computes the coverage and log region size on a test batch.

from moc.configs.config import get_config
from moc.utils.run_config import RunConfig
from moc.models.mqf2.lightning_module import MQF2LightningModule
from moc.models.trainers.lightning_trainer import get_lightning_trainer
from moc.datamodules.real_datamodule import RealDataModule
from moc.metrics.metrics_computer import compute_coverage_indicator, compute_log_region_size
from moc.conformal.conformalizers import L_CP


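# Load the default configuration and run on CPU.
config = get_config()
config.device = 'cpu'
# Select the 'sf2' dataset from the 'mulan' group.
rc = RunConfig(config, 'mulan', 'sf2')
datamodule = RealDataModule(rc)
# p: input dimension, q: output dimension.
p, q = datamodule.input_dim, datamodule.output_dim
# Train the MQF2 base predictor.
model = MQF2LightningModule(p, q)
trainer = get_lightning_trainer(rc)
trainer.fit(model, datamodule)

alpha = 0.1  # target miscoverage level (nominal coverage 1 - alpha = 0.9)
# Conformalize the trained model using the calibration data.
conformalizer = L_CP(datamodule.calib_dataloader(), model)
# Evaluate on a single batch of test data.
test_batch = next(iter(datamodule.test_dataloader()))
x, y = test_batch
coverage = compute_coverage_indicator(conformalizer, alpha, x, y)
volume = compute_log_region_size(conformalizer, model, alpha, x, n_samples=100)
print(coverage)
print(volume)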

Installation

Prerequisites

  • Python (tested on 3.10.14)

Steps

  1. Clone the repository:
git clone https://github.com/Vekteur/multi-output-conformal-regression.git
cd multi-output-conformal-regression
  2. (Optional) Create and activate a Python virtual environment:
python -m venv venv
source venv/bin/activate
  3. Install Python dependencies:
pip install -r requirements.txt

for exact versions ensuring reproducibility, or

pip install -r requirements.in

for more flexibility.

  4. (Optional) To run Distributional Random Forests, install R (version 4.1 or higher is required; tested on 4.4.1). Open the R interpreter with the command R and run:
install.packages("drf")

Compilation will take a few minutes. Then run:

pip install --index-url https://test.pypi.org/simple/ drf==0.1
  5. (Optional) To run experiments on the MEPS dataset, download it according to these instructions, summarized below:
git clone https://github.com/yromano/cqr
cd cqr/get_meps_data/
Rscript download_data.R
python main_clean_and_save_to_csv.py
cd ../../
for id in 19 20 21; do mv "cqr/get_meps_data/meps_${id}_reg.csv" "data/feldman/meps_${id}.csv"; done
rm -rf cqr
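
After step 5, a quick check can confirm that the three MEPS files landed where the `mv` command above puts them. This is a small sketch, not part of the repository: the helper name `missing_meps_files` is hypothetical, and only the `data/feldman/meps_{19,20,21}.csv` paths come from the commands above.

```python
from pathlib import Path

MEPS_IDS = (19, 20, 21)  # panel ids used in the download step


def missing_meps_files(data_dir="data/feldman"):
    """Return the expected MEPS CSV paths that are absent or empty."""
    root = Path(data_dir)
    expected = [root / f"meps_{i}.csv" for i in MEPS_IDS]
    return [p for p in expected if not p.is_file() or p.stat().st_size == 0]


if __name__ == "__main__":
    # Run from the repository root so the relative path resolves.
    missing = missing_meps_files()
    if missing:
        print("Missing MEPS files:", ", ".join(str(p) for p in missing))
    else:
        print("All MEPS files are in place.")
```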

Reproducing the results

To generate the figures for toy datasets, run toy_experiments.ipynb.

To compute the results of the paper:

python run.py name="full" device="cuda" repeat_tuning=10

or use device="cpu" if you don't have a GPU.
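
If the same launch script has to work on machines with and without a GPU, the device argument can be chosen automatically. The sketch below is an illustration under assumptions: `pick_device` and `build_command` are hypothetical helpers, and only `run.py` and its `name`, `device`, and `repeat_tuning` arguments are taken from the command above.

```python
import importlib.util
import sys


def pick_device() -> str:
    """Return "cuda" when PyTorch is installed and detects a GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


def build_command(device: str, name: str = "full", repeat_tuning: int = 10) -> list[str]:
    """Build the argument list for the run.py invocation shown above."""
    return [
        sys.executable,
        "run.py",
        f"name={name}",
        f"device={device}",
        f"repeat_tuning={repeat_tuning}",
    ]


if __name__ == "__main__":
    cmd = build_command(pick_device())
    # To launch the experiments, pass cmd to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```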

To generate the figures based on these results, run analysis.ipynb.
