ML Pipeline Development Guideline

This document describes the guidelines for contributing to the ML pipeline project. See the main page for instructions on how to deploy an ML pipeline system.

ML pipeline deployment

The Pipeline system is included in Kubeflow. See the Getting Started Guide for how to deploy it with Kubeflow.

Build Image

GKE

To run on GKE, the Docker images must be uploaded to a remote Docker registry that the cluster can pull from, such as GCR.

To build the API server image and upload it to GCR:

# Run in the repository root directory
$ docker build -t gcr.io/<your-gcp-project>/api-server:latest -f backend/Dockerfile .
# Push to GCR
$ gcloud auth configure-docker
$ docker push gcr.io/<your-gcp-project>/api-server:latest

To build the scheduled workflow controller image and upload it to GCR:

# Run in the repository root directory
$ docker build -t gcr.io/<your-gcp-project>/scheduledworkflow:latest -f backend/Dockerfile.scheduledworkflow .
# Push to GCR
$ gcloud auth configure-docker
$ docker push gcr.io/<your-gcp-project>/scheduledworkflow:latest

To build the viewer CRD controller image and upload it to GCR:

# Run in the repository root directory
$ docker build -t gcr.io/<your-gcp-project>/viewer-crd-controller:latest -f backend/Dockerfile.viewercontroller .
# Push to GCR
$ gcloud auth configure-docker
$ docker push gcr.io/<your-gcp-project>/viewer-crd-controller:latest

To build the persistence agent image and upload it to GCR:

# Run in the repository root directory
$ docker build -t gcr.io/<your-gcp-project>/persistenceagent:latest -f backend/Dockerfile.persistenceagent .
# Push to GCR
$ gcloud auth configure-docker
$ docker push gcr.io/<your-gcp-project>/persistenceagent:latest

To build the frontend image and upload it to GCR:

# Run in the repository root directory
$ docker build -t gcr.io/<your-gcp-project>/frontend:latest -f frontend/Dockerfile .
# Push to GCR
$ gcloud auth configure-docker
$ docker push gcr.io/<your-gcp-project>/frontend:latest
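The five build-and-push sequences above differ only in the image name and the Dockerfile path, so they can be driven from one loop. The following is a minimal sketch, not a script shipped with the repository: the `PROJECT`, `TAG`, and `DRY_RUN` variables and the `run` and `build_and_push_all` helpers are hypothetical names introduced for illustration. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: build and push all five images in one pass.
# PROJECT, TAG and DRY_RUN are hypothetical variables, not part of the repository.
set -euo pipefail

PROJECT="${PROJECT:-your-gcp-project}"
TAG="${TAG:-latest}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# image-name:Dockerfile pairs, copied from the commands above
IMAGES="
api-server:backend/Dockerfile
scheduledworkflow:backend/Dockerfile.scheduledworkflow
viewer-crd-controller:backend/Dockerfile.viewercontroller
persistenceagent:backend/Dockerfile.persistenceagent
frontend:frontend/Dockerfile
"

# Print a command, and execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

build_and_push_all() {
  # The credential helper only needs to be registered once for all pushes.
  run gcloud auth configure-docker
  for entry in $IMAGES; do
    name="${entry%%:*}"        # text before the first colon
    dockerfile="${entry#*:}"   # text after the first colon
    ref="gcr.io/${PROJECT}/${name}:${TAG}"
    run docker build -t "$ref" -f "$dockerfile" .
    run docker push "$ref"
  done
}

build_and_push_all
```

Run it from the repository root. Once the printed commands look right, set `DRY_RUN=0` and `PROJECT` to your GCP project to execute them.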

Minikube

Minikube can use your local Docker images, so you don't need to upload them to a remote repository.

For example, to build the API server image:

$ docker build -t ml-pipeline-api-server -f backend/Dockerfile .
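Depending on the Minikube driver in use, a locally built image may only be visible to the cluster if it is built against Minikube's own Docker daemon. The sketch below shows one way to point the current shell at that daemon using `minikube docker-env`. The helper name `use_minikube_docker` is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch: make the docker CLI in this shell talk to Minikube's Docker daemon.
# use_minikube_docker is a hypothetical helper name, not part of the repository.

use_minikube_docker() {
  if ! command -v minikube >/dev/null 2>&1; then
    echo "minikube not found in PATH" >&2
    return 1
  fi
  # `minikube docker-env` prints export statements (DOCKER_HOST and related
  # variables); eval applies them to the current shell only.
  eval "$(minikube docker-env)"
}
```

After sourcing this, running `use_minikube_docker` followed by the `docker build` command above from the repository root should place the image in the daemon that Minikube schedules pods from.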

Unit test

API server

Run the unit tests for the API server:

cd backend/src/ && go test ./...

Frontend

TODO: add instructions

DSL

pip install ./dsl/ --upgrade && python ./dsl/tests/main.py
pip install ./dsl-compiler/ --upgrade && python ./dsl-compiler/tests/main.py

Integration test & E2E test

Check this page for more details.

Troubleshooting

Q: How do I access the database directly?

You can inspect the MySQL database directly by running:

kubectl run -it --rm --image=mysql:5.6 --restart=Never mysql-client -- mysql -h mysql
mysql> use mlpipeline;
mysql> select * from jobs;

Q: How do I inspect the object store directly?

Minio provides its own UI to inspect the object store directly:

kubectl port-forward -n ${NAMESPACE} $(kubectl get pods -l app=minio -o jsonpath='{.items[0].metadata.name}' -n ${NAMESPACE}) 9000:9000
Then open http://localhost:9000 in a browser and log in with:

Access Key: minio
Secret Key: minio123

Q: I see an error about exceeding the GitHub rate limit when deploying the system. What can I do?

See the Ksonnet troubleshooting page.

Q: How do I check my API server log?

API server logs are located in the /tmp directory of the pod. To open a shell in the pod, run:

kubectl exec -it -n ${NAMESPACE} $(kubectl get pods -l app=ml-pipeline -o jsonpath='{.items[0].metadata.name}' -n ${NAMESPACE}) -- /bin/sh

Alternatively, print the pod's logs with:

kubectl logs -n ${NAMESPACE} $(kubectl get pods -l app=ml-pipeline -o jsonpath='{.items[0].metadata.name}' -n ${NAMESPACE})
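The troubleshooting commands above all repeat the same nested `kubectl get pods` lookup. A small wrapper can make them easier to read. This is a minimal sketch: `first_pod` is a hypothetical helper name, and it assumes, as the commands above do, that `NAMESPACE` is set and that the first pod matching the `app` label is the one you want.

```shell
#!/usr/bin/env bash
# Sketch: look up the name of the first pod that carries a given app label.
# first_pod is a hypothetical helper name, not part of the repository.

first_pod() {
  # $1 = value of the `app` label, e.g. ml-pipeline or minio
  if [ -z "${NAMESPACE:-}" ]; then
    echo "NAMESPACE is not set" >&2
    return 1
  fi
  kubectl get pods -n "$NAMESPACE" -l "app=$1" \
    -o jsonpath='{.items[0].metadata.name}'
}
```

With the helper defined, the log command becomes `kubectl logs -n "$NAMESPACE" "$(first_pod ml-pipeline)"`, and the Minio port-forward becomes `kubectl port-forward -n "$NAMESPACE" "$(first_pod minio)" 9000:9000`.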

Q: How do I check my cluster status if I am using Minikube?

Minikube provides a dashboard for the deployment:

minikube dashboard
