This repository holds the implementation of an additive regressor, i.e. a model of the form:

$$\hat{y} = F(\mathbf{x}) = \sum_{j=1}^{p} f_j(x_j)$$

where each $f_j$ is a one-dimensional function of a single feature $x_j$. The functions $f_j$ are fit so that only a sparse subset of the features contributes to the model.
This is achieved by combining the boosting algorithm with a modification of mRMR (minimum Redundancy Maximum Relevance) feature selection at each boosting iteration.
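The selection step can be illustrated with a minimal sketch of an mRMR-style score at one boosting iteration: each feature is rated by its relevance to the current residual minus its redundancy with the features already selected. This is an assumption-laden illustration (absolute Pearson correlation as the relevance/redundancy measure, and the `mrmr_score` helper is hypothetical), not the package's actual implementation:

```python
import numpy as np


def mrmr_score(X, residual, selected):
    """Hypothetical mRMR-style score: relevance to the residual minus
    mean redundancy with already-selected features."""
    n_features = X.shape[1]
    scores = np.empty(n_features)
    for j in range(n_features):
        # Relevance: absolute correlation between feature j and the residual.
        relevance = abs(np.corrcoef(X[:, j], residual)[0, 1])
        # Redundancy: mean absolute correlation with the selected features.
        if selected:
            redundancy = np.mean(
                [abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) for k in selected]
            )
        else:
            redundancy = 0.0
        scores[j] = relevance - redundancy
    return scores


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] + np.sin(X[:, 1]) + 0.1 * rng.normal(size=200)

residual = y - y.mean()  # residual after fitting a constant model
scores = mrmr_score(X, residual, selected=[])
best = int(np.argmax(scores))  # feature picked at this boosting step
```

In a full boosting loop, a one-dimensional learner would then be fit on the chosen feature against the residual, and the residual updated before the next selection.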
The package is not yet available on PyPI, but it can be installed from GitHub with pip:

```bash
pip install git+https://github.com/thesis-jdgs/additive-sparse-boost-regression.git
```
The regressor is implemented in `asboostreg.py`, and exposes the familiar `fit` and `predict` methods from scikit-learn:
```python
from asboostreg import SparseAdditiveBoostingRegressor
from sklearn.datasets import load_diabetes

# load_boston was removed in scikit-learn 1.2; load_diabetes is used instead.
X, y = load_diabetes(return_X_y=True)

sparsereg = SparseAdditiveBoostingRegressor(
    learning_rate=0.01,
    n_estimators=10_000,
    l2_regularization=2.0,
    max_depth=6,
    row_subsample=0.632,
    random_state=0,
    n_iter_no_change=30,
)
sparsereg.fit(X, y)
y_pred = sparsereg.predict(X)
```
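Because the estimator follows the scikit-learn `fit`/`predict` contract, it should compose with the standard model-selection tooling. A sketch using `cross_val_score`, shown with scikit-learn's `GradientBoostingRegressor` as a stand-in estimator (only the interface matters here; `SparseAdditiveBoostingRegressor` could be dropped in the same way, assuming full API compatibility):

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

# Stand-in for any scikit-learn-compatible regressor with fit/predict.
model = GradientBoostingRegressor(
    learning_rate=0.01, n_estimators=500, random_state=0
)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(scores.mean())
```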
To inspect the general characteristics of the model, the `plot_model_information` method creates a Plotly figure:

```python
sparsereg.plot_model_information()
```
This creates a figure of the iteration history and the model complexity for each feature, like the following:
To inspect the predictions for a dataset `X`, you can use `explain(X)`:

```python
sparsereg.explain(X)
```
This creates a figure of the mean importance of each feature, and a plot of the fitted 1D regressor for each selected feature:
We can also decompose the predictions into their additive components with the `contribution_frame` method:

```python
sparsereg.contribution_frame(X)
```
This returns a pandas DataFrame with the additive component of each feature.
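The decomposition property itself is easy to illustrate: for an additive model, the per-feature contributions in such a frame, plus an intercept, reconstruct each prediction exactly. A self-contained toy example (the components and intercept below are made up for illustration, not produced by the package):

```python
import numpy as np
import pandas as pd

# Toy additive model: prediction = intercept + sum of per-feature components.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
components = {
    "f0": 2.0 * X[:, 0],       # linear component
    "f1": np.sin(X[:, 1]),     # smooth nonlinear component
    "f2": X[:, 2] ** 2,        # quadratic component
}
intercept = 1.5

contribution_df = pd.DataFrame(components)  # one column per feature
prediction = intercept + contribution_df.sum(axis=1).to_numpy()

# Summing a row of contributions and the intercept recovers the prediction.
assert np.allclose(prediction, intercept + contribution_df.to_numpy().sum(axis=1))
```

This row-wise additivity is what makes such models directly interpretable: each column of the frame shows exactly how much a feature moved the prediction for each sample.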