Official repository of yaml-ML Python package.

GFaure9/yaml-ML


Your whole ML pipeline in one YAML file!

yaml_ml streamlines machine learning workflows by letting you define data preprocessing, model training, and evaluation in one YAML file. Automate your ML pipeline with minimal code.

Important

Disclaimer: this is the very first version of the package. It is still under development. Use it at your own risk.

Table of Contents

Quickstart

  1. Installation
  2. Usage
  3. Docs

Usage Example

Dependencies

Tests

About the framework

⏩ Quickstart

1. Installation

Create a virtual environment (e.g. with conda), activate it and upgrade pip:

conda create --name yaml_ml_env python=3.11
conda activate yaml_ml_env
pip install --upgrade pip

Then install the package:

pip install yaml-ml

2. Usage

With one configuration file

First, create a YAML configuration file (see docs). Then activate the environment where yaml_ml is installed and run:

python -m yaml_ml --cfg path/to/your/config/yaml/file

With multiple configuration files

To test different configurations, create one YAML file per configuration and put them all in a single folder. To launch the corresponding pipelines in parallel using multiprocessing with N worker processes, run:

python -m yaml_ml --cfg path/to/your/configs/folder --n_processes N 

Note

If the --n_processes argument is omitted, the pipelines are launched sequentially.
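If you want to launch yaml_ml from a Python script rather than a shell, the two commands above can be assembled programmatically. The sketch below is only an illustration: the helper names `build_yaml_ml_command` and `run_yaml_ml` are hypothetical and not part of the package. It uses only the `--cfg` and `--n_processes` flags shown above and assumes yaml_ml is installed in the interpreter that runs the script.

```python
import subprocess
import sys
from typing import List, Optional


def build_yaml_ml_command(cfg: str, n_processes: Optional[int] = None) -> List[str]:
    """Build the argument list for `python -m yaml_ml`.

    Hypothetical helper: `cfg` is a path to one YAML file or to a folder of
    YAML files; `n_processes` is left out to keep the sequential behaviour.
    """
    # sys.executable keeps the call inside the currently active environment.
    cmd = [sys.executable, "-m", "yaml_ml", "--cfg", cfg]
    if n_processes is not None:
        if n_processes < 1:
            raise ValueError("n_processes must be a positive integer")
        cmd += ["--n_processes", str(n_processes)]
    return cmd


def run_yaml_ml(cfg: str, n_processes: Optional[int] = None) -> int:
    """Run the command and return the process exit code."""
    return subprocess.run(build_yaml_ml_command(cfg, n_processes)).returncode
```

For example, `run_yaml_ml("path/to/your/configs/folder", n_processes=4)` would issue the same command as the multi-file invocation above with N = 4.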

3. Docs

Some guidelines about how to define a configuration file are given in the Configuration File Documentation.

All available options are consolidated in the Modules File.

You can also find examples of yaml_ml configuration files in the Examples Folder and a template file template_cfg.yaml.

📖 Usage Example: Step-by-Step

Check out explanations of a complete usage example here.


🔗 Dependencies

yaml_ml is built mainly on scikit-learn tools: https://scikit-learn.org/stable/.

By default, installing yaml_ml will also install:

If you do not want these additional libraries, you can install yaml_ml from source after commenting out the corresponding lines in requirements.txt. To do so, first clone the repo:

git clone https://github.com/GFaure9/yaml-ML.git

Then comment out the unwanted packages in the requirements file and, in your virtual environment, run:

cd ./yaml-ML
pip install -e .
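The commenting-out step can also be scripted before running `pip install -e .`. The following is a minimal sketch, not part of yaml_ml: the function name `comment_out_requirements` and the package name used in the usage note are hypothetical, and it assumes one requirement per line with the package name at the start of the line.

```python
import re
from typing import Iterable, List


def comment_out_requirements(lines: Iterable[str], unwanted: Iterable[str]) -> List[str]:
    """Return the requirement lines with the unwanted packages commented out.

    Hypothetical helper: names are compared case-insensitively, and lines
    that are already comments are left untouched.
    """
    unwanted_names = {name.lower() for name in unwanted}
    result = []
    for line in lines:
        # The package name is the leading run of characters before any
        # version specifier (==, >=, ...), extras bracket, or whitespace.
        match = re.match(r"[A-Za-z0-9_.\-]+", line.strip())
        name = match.group(0).lower() if match else ""
        if name in unwanted_names:
            result.append("# " + line)
        else:
            result.append(line)
    return result
```

To apply it, read requirements.txt with `readlines()`, pass the lines together with the names to disable (for instance a placeholder such as `["optional-lib"]`), and write the result back with `writelines()`. The sketch does not treat `-` and `_` as equivalent in package names, so spell each name exactly as it appears in the file.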

✅ Tests

If you cloned the repo and installed the package from source (pip install -e .), you can check that everything works before using it by running:

cd ./tests
python test_yaml_ml.py

At the end, you should get something like:

Ran 4 tests in 120.840s

OK

🧩 About the yaml_ml framework...

yaml_ml has a modular architecture, designed to make it easy to integrate new models and data preprocessing techniques as needed. Feel free to fork the project and extend the list of available ML models or preprocessing methods by "plugging in" your favorite ones, following the package's architecture.

