Doubt

Bringing back uncertainty to machine learning.



A Python package that adds prediction intervals to the predictions of machine learning models, so that their uncertainty can be quantified.

Installation

If you do not already have HDF5 installed, start by installing it. On macOS this can be done with sudo port install hdf5 once MacPorts has been installed. On Ubuntu you can get HDF5 with sudo apt-get install python-dev python3-dev libhdf5-serial-dev. After that, you can install doubt with pip:

pip install doubt

Features

  • Bootstrap wrapper for all Scikit-Learn models
    • Can also be used to calculate usual bootstrapped statistics of a dataset
  • Quantile Regression for all generalised linear models
  • Quantile Regression Forests
  • A uniform dataset API, with 24 regression datasets and counting
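
To make the idea of a bootstrapped statistic concrete, here is a minimal standard-library sketch of a percentile bootstrap interval for the mean of a small dataset. It does not use doubt at all: the function name bootstrap_interval, its parameters and the toy data are hypothetical, and the package's own implementation may differ.

```python
import random
import statistics


def bootstrap_interval(data, statistic=statistics.mean, n_boots=1000,
                       uncertainty=0.05, seed=0):
    """Percentile-bootstrap interval for `statistic` over `data`.

    Hypothetical helper for illustration; not part of the doubt API.
    """
    rng = random.Random(seed)
    n = len(data)
    # Recompute the statistic on n_boots resamples drawn with replacement.
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_boots)
    )
    # Read off the empirical quantiles uncertainty/2 and 1 - uncertainty/2.
    lower_idx = int((uncertainty / 2) * (n_boots - 1))
    upper_idx = int((1 - uncertainty / 2) * (n_boots - 1))
    return statistic(data), (estimates[lower_idx], estimates[upper_idx])


# Toy data, made up for this example.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
point, (lower, upper) = bootstrap_interval(data)
print(point, (lower, upper))
```

The return shape (a point estimate followed by a lower and upper bound) mirrors the tuples shown in the Quick Start, but that resemblance is a choice made for this sketch.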

Quick Start

If you already have a model in Scikit-Learn, you can wrap it in a Boot to get prediction intervals alongside its predictions:

>>> from sklearn.linear_model import LinearRegression
>>> from doubt import Boot
>>> from doubt.datasets import PowerPlant
>>>
>>> X, y = PowerPlant().split()
>>> clf = Boot(LinearRegression())
>>> clf = clf.fit(X, y)
>>> clf.predict([10, 30, 1000, 50], uncertainty=0.05)
(481.9203102126274, array([473.43314309, 490.0313962 ]))
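
For intuition about what a bootstrap wrapper can do under the hood, the following standard-library sketch refits a one-feature linear model on resampled data and adds a resampled residual to each prediction. This is a generic bootstrap prediction interval written under simplifying assumptions, not a description of how Boot is implemented; fit_line, bootstrap_predict and the synthetic data are all hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Guard against a degenerate resample in which every x is identical.
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def bootstrap_predict(xs, ys, x_new, n_boots=500, uncertainty=0.05, seed=0):
    """Return (prediction, (lower, upper)) for a single new input."""
    rng = random.Random(seed)
    slope, intercept = fit_line(xs, ys)
    point = slope * x_new + intercept
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    n = len(xs)
    draws = []
    for _ in range(n_boots):
        # Refit on a resample of the rows, drawn with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        b_slope, b_intercept = fit_line([xs[i] for i in idx],
                                        [ys[i] for i in idx])
        # Add a resampled residual so the spread reflects noise as well
        # as model variance.
        draws.append(b_slope * x_new + b_intercept + rng.choice(residuals))
    draws.sort()
    lower = draws[int((uncertainty / 2) * (n_boots - 1))]
    upper = draws[int((1 - uncertainty / 2) * (n_boots - 1))]
    return point, (lower, upper)


# Synthetic data: y = 2x + 1 plus Gaussian noise (made up for this example).
noise = random.Random(1)
xs = list(range(20))
ys = [2 * x + 1 + noise.gauss(0, 0.5) for x in xs]
point, (lower, upper) = bootstrap_predict(xs, ys, x_new=10)
print(point, (lower, upper))
```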

Alternatively, you can use one of the standalone models with uncertainty outputs. For instance, a QuantileRegressionForest:

>>> from doubt import QuantileRegressionForest as QRF
>>> from doubt.datasets import Concrete
>>> import numpy as np
>>>
>>> X, y = Concrete().split()
>>> clf = QRF(max_leaf_nodes=8)
>>> clf.fit(X, y)
>>> clf.predict(np.ones(8), uncertainty=0.25)
(16.933590347847982, array([ 8.93456428, 26.0664534 ]))
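
The uncertainty argument in the examples above is most naturally read as the total probability left outside the interval, split evenly between both tails, so that uncertainty=0.05 would correspond to a 95% interval. That reading is an assumption here; the package documentation is the authority. The sketch below shows the arithmetic on a plain list of samples, with hypothetical helper names and no dependency on doubt.

```python
def interval_levels(uncertainty):
    """Map an uncertainty level to (lower, upper) quantile levels.

    Assumes the tail mass is split evenly across both ends.
    """
    return uncertainty / 2, 1 - uncertainty / 2


def empirical_quantile(sorted_values, q):
    """Quantile of pre-sorted values, with linear interpolation."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def empirical_interval(samples, uncertainty):
    """Interval covering the central 1 - uncertainty share of the samples."""
    values = sorted(samples)
    lower_q, upper_q = interval_levels(uncertainty)
    return (empirical_quantile(values, lower_q),
            empirical_quantile(values, upper_q))


# With the integers 0..100, uncertainty=0.25 keeps the central 75%.
samples = list(range(101))
lower, upper = empirical_interval(samples, uncertainty=0.25)
print(lower, upper)  # 12.5 87.5
```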

Citation

@inproceedings{mougannielsen2023monitoring,
  title={Monitoring Model Deterioration with Explainable Uncertainty Estimation via Non-parametric Bootstrap},
  author={Mougan, Carlos and Nielsen, Dan Saattrup},
  booktitle={AAAI Conference on Artificial Intelligence},
  year={2023}
}
