DataCanvasIO/Cooka

Doc | Simplified Chinese

Cooka is a lightweight visual toolkit for managing datasets and designing model learning experiments through a web UI. It uses DeepTables and HyperGBM as experiment engines to perform feature engineering, neural architecture search, and hyperparameter tuning automatically.

DataCanvas AutoML Toolkit

Features overview

Through the web UI provided by Cooka, you can:

  • Add and analyze datasets
  • Design experiments
  • View experiment progress and results
  • Use trained models
  • Export experiments to Jupyter notebooks

The supported machine learning algorithms are:

  • XGBoost
  • LightGBM
  • CatBoost

The supported neural networks are:

  • WideDeep
  • DeepFM
  • xDeepFM
  • AutoInt
  • DCN
  • FGCNN
  • FiBiNet
  • PNN
  • AFM
  • ...

The supported search algorithms are:

  • Evolution
  • MCTS (Monte Carlo Tree Search)
  • ...

The supported feature engineering methods, provided by scikit-learn and featuretools, are:

  • Scaler

    • StandardScaler
    • MinMaxScaler
    • RobustScaler
    • MaxAbsScaler
    • Normalizer
  • Encoder

    • LabelEncoder
    • OneHotEncoder
    • OrdinalEncoder
  • Discretizer

    • KBinsDiscretizer
    • Binarizer
  • Dimension Reduction

    • PCA
  • Feature derivation

    • featuretools
  • Missing value filling

    • SimpleImputer

The search space can also be extended to support more feature engineering methods and modeling algorithms.
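To make the scaler entries in the list above more concrete, here is a minimal plain-Python sketch of the arithmetic that scikit-learn's StandardScaler and MinMaxScaler apply to a single numeric column. The function names `standard_scale` and `min_max_scale` are hypothetical helpers written for illustration. They are not part of Cooka or scikit-learn, and this is not how Cooka invokes the transformers internally.

```python
from statistics import mean, pstdev


def standard_scale(values):
    """Center a column to zero mean and scale it to unit variance.

    Uses the population standard deviation, which is what
    scikit-learn's StandardScaler computes.
    """
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # A constant column is left centered but unscaled.
        spread = 1.0
    return [(v - center) / spread for v in values]


def min_max_scale(values):
    """Rescale a column linearly into the range [0, 1]."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # A constant column maps to all zeros.
        span = 1.0
    return [(v - low) / span for v in values]


if __name__ == "__main__":
    column = [2, 4, 6]
    print(standard_scale(column))
    print(min_max_scale(column))
```

In an experiment, the search decides which of the listed transformers to apply, so a sketch like this is only useful for understanding what each option does to the data.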

Installation

Using pip

Python 3.6 or later is required. On CentOS, install the required system packages first. Then install Cooka with pip:

pip install --upgrade pip
pip install cooka

Start the web server:

cooka server

Then open http://<your_ip>:8000 in your browser to use Cooka.

By default, the Cooka configuration file is located at ~/.config/cooka/cooka.py. To generate a template:

mkdir -p ~/.config/cooka/
cooka generate-config > ~/.config/cooka/cooka.py
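The same two steps can be scripted. The sketch below is one possible way to do it from Python, assuming the `cooka` executable is on the PATH and that `cooka generate-config` prints the template to standard output, as the shell redirect above implies. The helper names `cooka_config_path` and `generate_config` are hypothetical and not part of Cooka.

```python
import shutil
import subprocess
from pathlib import Path


def cooka_config_path(home=None):
    """Return the default config location, ~/.config/cooka/cooka.py."""
    base = Path(home) if home is not None else Path.home()
    return base / ".config" / "cooka" / "cooka.py"


def generate_config(home=None):
    """Write a config template unless one already exists.

    Mirrors `mkdir -p` followed by `cooka generate-config > cooka.py`.
    """
    target = cooka_config_path(home)
    if target.exists():
        # Never overwrite a config the user may have edited.
        return target
    if shutil.which("cooka") is None:
        raise RuntimeError("the cooka executable was not found on PATH")
    target.parent.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(
        ["cooka", "generate-config"],
        check=True,
        capture_output=True,
        text=True,
    )
    target.write_text(result.stdout)
    return target


if __name__ == "__main__":
    print(cooka_config_path())
```

Unlike the shell redirect, this version leaves an existing configuration file untouched, which may be preferable when the file has already been customized.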

Using Docker

Launch a Cooka docker container:

docker run -ti -p 8888:8888 -p 8000:8000 -p 9001:9001 -e COOKA_NOTEBOOK_PORTAL=http://<your_ip>:8888 datacanvas/cooka:latest

Open http://<your_ip>:8000 in your browser to use Cooka.
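Because the host address appears both in the docker command and in the browser URL, it can help to derive both from a single value. The following sketch assembles the same command shown above. The helper names `cooka_docker_command` and `cooka_url` are hypothetical, and 192.0.2.10 is a placeholder documentation address to be replaced with your own.

```python
import shlex


def cooka_docker_command(host_ip, image="datacanvas/cooka:latest"):
    """Build the argument list for the docker run command shown above."""
    cmd = ["docker", "run", "-ti"]
    # Publish each container port on the same host port.
    for port in (8888, 8000, 9001):
        cmd += ["-p", f"{port}:{port}"]
    # The notebook portal must be reachable from the user's browser.
    cmd += ["-e", f"COOKA_NOTEBOOK_PORTAL=http://{host_ip}:8888", image]
    return cmd


def cooka_url(host_ip, port=8000):
    """Return the address of the Cooka web UI."""
    return f"http://{host_ip}:{port}"


if __name__ == "__main__":
    host = "192.0.2.10"  # placeholder address, replace with your own
    print(" ".join(shlex.quote(part) for part in cooka_docker_command(host)))
    print(cooka_url(host))
```

The argument list can be passed directly to `subprocess.run` if you prefer to launch the container from a script rather than copy the printed command.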

Citation

If you use Cooka in your research, please cite us as follows:

Haifeng Wu, Jian Yang. Cooka: A lightweight and visual AutoML system. https://github.com/DataCanvasIO/Cooka, 2021. Version 0.1.x

@misc{cooka,
  author={Haifeng Wu and Jian Yang},
  title={{Cooka}: {A lightweight and visual AutoML system}},
  howpublished={https://github.com/DataCanvasIO/Cooka},
  note={Version 0.1.x},
  year={2021}
}

DataCanvas

Cooka is an open-source project created by DataCanvas.