This repository was archived by the owner on Nov 5, 2022. It is now read-only.

ML Pipeline Generator is a tool for generating end-to-end pipelines composed of GCP components so that any customer can easily migrate their local ML models onto GCP and start realizing the benefits of the cloud quickly.

GoogleCloudPlatform/ml-pipeline-generator-python

AI Pipelines

AI Pipelines is a tool for generating end-to-end pipelines composed of GCP components so that any customer can easily migrate their local ML models onto GCP and start realizing the benefits of the cloud quickly. Currently, ML pipelines are difficult for customers to implement and can take weeks, if not months, even when working with experienced Googlers.

The following ML frameworks will be supported:

  1. TensorFlow (TF)
  2. Scikit-learn (SKL)
  3. XGBoost (XGB)

Initially, only Kubeflow Pipelines (KFP) will be considered for orchestrating ML pipelines built using various Cloud AI Platform (CAIP) features. Orchestration using Cloud Composer (CC) may be in scope in the future.

The full project plan can be found here.

Setup

GCP credentials

gcloud auth login
gcloud auth application-default login
gcloud config set project [PROJECT_ID]

Python environment

python3 -m venv venv
source ./venv/bin/activate
pip install -r requirements.txt
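
Before running the demos, it can help to confirm that the interpreter in use is the one from the virtual environment created above. The following is a minimal sketch, not part of this repository; the function name `in_virtualenv` is hypothetical. It relies on the standard-library behavior that `sys.prefix` differs from `sys.base_prefix` inside a `venv` environment.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv.

    Inside an environment created with `python3 -m venv`, sys.prefix points
    at the environment directory while sys.base_prefix points at the base
    Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Using virtual environment at {sys.prefix}")
    else:
        print("No virtual environment active; run `source ./venv/bin/activate` first.")
```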

Config file

Update the information in config.yaml.

CAIP Demo

This demo uses the scikit-learn model in examples/sklearn/user_model.py to create a training module to run on CAIP.

python -m examples.sklearn.demo

The demo reads the config file and generates bin/run.train.sh along with the trainer/ code. Then run bin/run.train.sh to train locally, or bin/run.train.sh cloud to train on CAIP.
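
The two ways of invoking the generated script can also be driven from Python, for example from a notebook or a larger automation script. This is a rough sketch under stated assumptions: the helper names `build_train_command` and `run_training` are hypothetical, and running the script through `bash` is an assumption about how the generated file is meant to be executed. Only the script path and the `cloud` argument come from the text above.

```python
import subprocess
from pathlib import Path
from typing import List

# Path of the script generated by the demo, as described above.
TRAIN_SCRIPT = Path("bin/run.train.sh")


def build_train_command(cloud: bool = False) -> List[str]:
    """Build the argv for a local run, or for a CAIP run when cloud=True."""
    # Assumption: the generated script is a shell script runnable with bash.
    command = ["bash", TRAIN_SCRIPT.as_posix()]
    if cloud:
        command.append("cloud")
    return command


def run_training(cloud: bool = False) -> None:
    """Run the generated training script and raise if it exits non-zero."""
    if not TRAIN_SCRIPT.exists():
        raise FileNotFoundError(
            f"{TRAIN_SCRIPT} not found; run `python -m examples.sklearn.demo` first."
        )
    subprocess.run(build_train_command(cloud), check=True)
```

Calling `run_training()` trains locally and `run_training(cloud=True)` submits to CAIP. Keeping the command construction in its own function makes it possible to inspect the command without launching a job.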

KFP Demo

This demo uses the scikit-learn model in examples/sklearn/user_model.py to create a Kubeflow pipeline.

python -m examples.kfp.demo
python -m orchestration.pipeline

This compiles a pipeline that can be uploaded to Kubeflow.

Cleanup

Delete the generated files by running bin/cleanup.sh.

Tests

The tests use unittest, Python's built-in unit testing framework. Running python -m unittest performs test discovery to find all tests within this project. Tests can be run at a more granular level by passing a directory to test discovery. Read more about unittest here.

python -m unittest
