This directory contains tools for automatically deploying instances of Kubeflow in codelab test projects.
The scripts are intended to run on a Kubeflow cluster.
project: kf-codelab-admin
cluster: codelab-admin-v06
- This is a Kubeflow v0.6 cluster
- This is an attempt to work around the problems we are seeing with workload identity enabled clusters
- namespace: kubeflow-jlewi
  - Use this namespace because this has K8s secrets with the GCP service account credentials
cluster: codelab-admin
- This is a Kubeflow v0.7 cluster with workload identity
- We are seeing auth problems related to the GKE node metadata servers
bucket: gs://kf-codelab-admin
OAuth location: gs://kf-codelab-admin/test-project-iap.oauth.yaml
GSA:
In order for the K8s job to modify the codelab project, you need to grant OWNER privileges to the GCP service account used by the K8s job.
- We use the GSA
- Which is mapped to the default-editor KSA
If necessary, modify setup-codelab-project.yaml to configure how each Kubeflow instance will be deployed.
This YAML file defines a K8s job that serves as the template for each K8s job created to set up a Kubeflow instance.
This K8s job uses
to deploy Kubeflow -
The arguments of
control how Kubeflow is deployed and you may want to change them -
The most important parameters are:
- kfname: The name of the Kubeflow deployment
- kfctl_path: The URL of the kfctl binary to use to deploy Kubeflow
  - It's also possible to build kfctl from a specific commit, but that's slower
- kfctl_config: The URL of the KFDef manifest used for each deployment
- zone: The zone to deploy in
Modify bulk-deploy.yaml to configure a K8s job to run bulk deployment
Set the following command line arguments in the YAML file:
- --project-base-name: The base name of the codelab project (should end with a hyphen)
- --start-index: The start index for generating project names
- --end-index: The end index of the range (non-inclusive)
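To preview which projects a given range would touch, the half-open range can be sketched in shell. This is only an illustration: `project_names` is a hypothetical helper, `my-codelab-` is a made-up base name, and it assumes a project name is simply the base name followed by the index, which the actual scripts may format differently.

```shell
# Hypothetical helper: print the project names implied by a base name and a
# half-open index range [start, end). Assumes name = base name + index; the
# real scripts may format names differently.
project_names() (
  base="$1"
  i="$2"
  end="$3"
  while [ "$i" -lt "$end" ]; do
    printf '%s%s\n' "$base" "$i"
    i=$((i + 1))
  done
)

# Made-up base name: covers indices 0, 1 and 2, but not 3.
project_names "my-codelab-" 0 3
```

Because the end index is non-inclusive, consecutive batches such as 0 to 10 and 10 to 20 do not overlap.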
Launch a K8s job running bulk deploy
kubectl create -f bulk-deploy.yaml
- This job will launch one K8s job for each Kubeflow deployment
- All of the launched K8s jobs will have the same value for the group label
- The bulk-deploy job will wait for all of the jobs in the group to finish
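If you want to follow a batch by hand, the shared label makes it possible to select all of its jobs at once. The sketch below assumes the label key is literally `group` and that the jobs run in the kubeflow-jlewi namespace used by the kubectl example in this README; `watch_group` and the 30 minute timeout are illustrative choices, not part of the tooling.

```shell
# Namespace assumed from the kubectl example in this README; override it if
# your jobs run elsewhere.
NAMESPACE="${NAMESPACE:-kubeflow-jlewi}"

# Hypothetical helper: list the jobs that share a group label value, then
# block until they all report the Complete condition or the timeout expires.
watch_group() {
  # $1 is the value of the "group" label (the key name is an assumption).
  kubectl -n "$NAMESPACE" get jobs -l "group=$1"
  kubectl -n "$NAMESPACE" wait --for=condition=complete job -l "group=$1" --timeout=30m
}
```

Call it as `watch_group <group-value>`. A job that fails never reaches the Complete condition, so in that case the wait ends with a timeout rather than a success.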
Run a K8s job to check whether each Kubeflow deployment has an endpoint that is accessible
Set the following command line arguments in test-codelab-endpoints.yaml:
- --kfname: The name for Kubeflow deployments
- --project-base-name: The base name of the codelab project (should end with a hyphen)
- --start-index: The start index for generating project names
- --end-index: The end index of the range (non-inclusive)
Launch the job
kubectl create -f test-codelab-endpoints.yaml
- The job will print out which projects in the CSV file have accessible Kubeflow deployments
Modify delete-codelab-endpoints.yaml
Set the following command line arguments:
- --kfname: The name for Kubeflow deployments
- --project-base-name: The base name of the codelab project (should end with a hyphen)
- --start-index: The start index for generating project names
- --end-index: The end index of the range (non-inclusive)
Create the job
kubectl create -f delete-codelab-endpoints.yaml
Get all deploy jobs for a specific project
kubectl -n kubeflow-jlewi get pods -l project=${PROJECT} --sort-by=.metadata.creationTimestamp