Bringing Scikit-learn decision trees to Excel
With this Python package, one can make a trained machine learning model accessible to others without having to deploy it as a service. More specifically, one can export a Scikit-learn decision tree or random forest model to an Excel workbook. All decision chains in the model are represented within a single table, and feature values can be entered to test the model and obtain an average prediction.
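To illustrate what an "average prediction" over a forest means, here is a minimal sketch using scikit-learn only. It averages the class probabilities of the individual trees for one sample. Whether the generated workbook computes its average in exactly this way is an assumption; the dataset slice and `random_state` are chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier

# Illustrative setup: first four wine features, small forest.
wine = load_wine()
X = wine.data[:, :4]
forest = RandomForestClassifier(
    n_estimators=10,
    min_samples_leaf=2,
    random_state=0,
).fit(X, wine.target)

# Test one set of feature values against every tree in the ensemble.
sample = X[:1]
per_tree = np.array(
    [tree.predict_proba(sample)[0] for tree in forest.estimators_]
)

# Average the per-tree class probabilities.
averaged = per_tree.mean(axis=0)
print(averaged)
```

For a scikit-learn random forest classifier, this average matches what `forest.predict_proba(sample)` returns.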
Version: 0.1.1
- package level
  - export_to_xlsx() (main access point)
  - export_to_textfile() (alternative use)
    - detects maximum tree depth and applies this parameter
- helpers module
  - create_xlfile (project internal)
    - writes a DecisionTreeTable object to an Excel sheet
    - writes features and an initial value of 1 to front sheet
    - writes decision trees to 2nd sheet
- core module
  - class DecisionTreeTable (project internal)
    - a class that can be instantiated with a parsed text file
    - transforms and represents decision trees in a data structure
    - exposes properties to access info about the structure
    - exposes methods to get tests and results as indexed rows
    - handles classifier- and regressor-type decision trees
- TODO:
  - thorough testing (75%)
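The idea of representing decision chains as indexed rows can be sketched directly from a fitted scikit-learn tree. This is not the package's DecisionTreeTable implementation (which parses a text export); the helper `decision_chains` below is a hypothetical name that walks the public `tree_` arrays and returns one row of tests plus a result per leaf.

```python
from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier


def decision_chains(tree, feature_names):
    """Return one (conditions, leaf_value) row per leaf of a fitted tree."""
    t = tree.tree_
    rows = []

    def walk(node, conditions):
        # In scikit-learn, a leaf has identical (-1) left and right children.
        if t.children_left[node] == t.children_right[node]:
            rows.append((conditions, t.value[node][0].tolist()))
            return
        name = feature_names[t.feature[node]]
        threshold = float(t.threshold[node])
        # Left branch: feature <= threshold; right branch: feature > threshold.
        walk(t.children_left[node], conditions + [(name, "<=", threshold)])
        walk(t.children_right[node], conditions + [(name, ">", threshold)])

    walk(0, [])
    return rows


# Illustrative data: first four wine features, shallow tree.
wine = load_wine()
feature_names = wine.feature_names[:4]
X = wine.data[:, :4]
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, wine.target)

chains = decision_chains(tree, feature_names)
for conditions, value in chains:
    tests = " AND ".join(f"{n} {op} {thr:.3f}" for n, op, thr in conditions)
    print(tests, "->", value)
```

Each printed row is one complete chain of tests from the root to a leaf, which is the kind of row a spreadsheet can evaluate against a set of feature values.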
pip install sklearn2excel

Installing the package also installs scikit-learn and XlsxWriter.
from pathlib import Path
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
import sklearn2excel as s2e
# fetch Scikit-learn wine example data as
# sklearn.utils.Bunch object
# and prepare example model from
# sklearn.ensemble.RandomForestClassifier
# RandomForestRegressor or any classifier/regressor
# subtype of BaseDecisionTree could be used
bunch = s2e.get_data_target_and_features()
wine_data = bunch.data
wine_target = bunch.target
wine_features = bunch.feature_names[:4]
X = wine_data[wine_features]
y = LabelEncoder().fit_transform(wine_target)
clf_model = RandomForestClassifier(
    n_estimators=10,
    min_samples_leaf=2
).fit(X, y)
path_xlsx = Path.cwd() / "excel_output.xlsx"
path_txt = Path.cwd() / "text_output.txt"
# export model as text file with use of
# sklearn export function
# first param single or ensemble of decision trees
s2e.export_to_textfile(
    clf_model.estimators_,  # ensemble of decision trees
    path_txt,
    wine_features
)
# export model as Excel file
# features written to Front sheet with initial value 1.0
# decision trees written to 2nd sheet
s2e.export_to_xlsx(
    clf_model.estimators_,
    wine_features,
    path_xlsx
)

- Flit ~3.4
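For reference, the text export mentioned in the usage example can be previewed with scikit-learn alone. The sketch below assumes the "sklearn export function" is `sklearn.tree.export_text`; that is a guess, not something confirmed by this package. Because `export_text` truncates branches below its default `max_depth` of 10, the tree's actual depth is passed explicitly.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

# Illustrative setup: first four wine features, small forest.
wine = load_wine()
feature_names = list(wine.feature_names[:4])
X = wine.data[:, :4]
forest = RandomForestClassifier(
    n_estimators=10,
    min_samples_leaf=2,
    random_state=0,
).fit(X, wine.target)

# Export a single tree of the ensemble as indented text rules.
first_tree = forest.estimators_[0]
text = export_text(
    first_tree,
    feature_names=feature_names,
    max_depth=first_tree.get_depth(),  # avoid truncated branches
)
print(text)
```

Repeating this for every estimator in `forest.estimators_` gives a text representation of the whole ensemble.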
- 0.1.1
- FIX: XlsxWriter dependency corrected
- 0.1.0
- First proper release
- NEW: direct function export_to_xlsx()
- CHANGE: functions and class available at package-level
- 0.0.1
- Work in progress
Torbjørn Wikestad – @TWikestad – torbjorn.wikestad@gmail.com
Distributed under the MIT license. See LICENSE for more information.
https://github.com/tobisan5/github-link
- Fork it (https://github.com/tobisan5/sklearn2excel/fork)
- Create your feature branch (git checkout -b feature/fooBar)
- Commit your changes (git commit -am 'Add some fooBar')
- Push to the branch (git push origin feature/fooBar)
- Create a new Pull Request