Switching test to kubeflow deployment #351

Merged · 52 commits · Nov 29, 2018
Changes from 1 commit
1c7f2c9 · test · IronPan, Nov 21, 2018
213795b · fix · IronPan, Nov 21, 2018
47d5457 · fix · IronPan, Nov 21, 2018
0ea4e69 · fix · IronPan, Nov 21, 2018
e2f5ea1 · fix · IronPan, Nov 21, 2018
bf05e06 · fix · IronPan, Nov 21, 2018
b83c4c1 · update · IronPan, Nov 21, 2018
21f88d0 · cleanup · IronPan, Nov 21, 2018
8c3f83f · fix · IronPan, Nov 21, 2018
b78dc1b · coopy test · IronPan, Nov 21, 2018
1323ab4 · chmod · IronPan, Nov 21, 2018
ab93fe8 · fix · IronPan, Nov 21, 2018
39d1712 · fix · IronPan, Nov 21, 2018
f694670 · fix · IronPan, Nov 21, 2018
1974bd3 · fix · IronPan, Nov 21, 2018
61a8ef4 · fix · IronPan, Nov 21, 2018
33cad5b · fix · IronPan, Nov 21, 2018
97de358 · fix · IronPan, Nov 21, 2018
08a54ef · fix · IronPan, Nov 21, 2018
6afcc66 · fix · IronPan, Nov 21, 2018
fa492e7 · fix · IronPan, Nov 21, 2018
7bd127a · fix · IronPan, Nov 22, 2018
8b37fa2 · fix · IronPan, Nov 22, 2018
1d55b6f · fix · IronPan, Nov 22, 2018
0776bf6 · fix · IronPan, Nov 22, 2018
fae10e9 · fix · IronPan, Nov 22, 2018
f198ccb · fix · IronPan, Nov 22, 2018
ed04d33 · fix · IronPan, Nov 22, 2018
bafc268 · fix · IronPan, Nov 22, 2018
2ef2610 · fix · IronPan, Nov 22, 2018
41979ba · fix · IronPan, Nov 22, 2018
f449107 · fix · IronPan, Nov 22, 2018
1b86ad0 · fix · IronPan, Nov 22, 2018
7c19bc9 · fix · IronPan, Nov 22, 2018
84bbc39 · fix · IronPan, Nov 22, 2018
1beaae7 · fix · IronPan, Nov 22, 2018
bb6420d · update · IronPan, Nov 22, 2018
2fde6f4 · fix · IronPan, Nov 22, 2018
4a98037 · fix · IronPan, Nov 22, 2018
ab647bf · fix · IronPan, Nov 22, 2018
d2e073d · fix · IronPan, Nov 22, 2018
ed6b9e8 · fix · IronPan, Nov 22, 2018
da34b8d · fix · IronPan, Nov 22, 2018
75c89c9 · fix · IronPan, Nov 22, 2018
63c9832 · fix sample test · IronPan, Nov 22, 2018
5c73e01 · fix · IronPan, Nov 22, 2018
25a49d9 · fix · IronPan, Nov 22, 2018
a5a14dc · merge · IronPan, Nov 28, 2018
2701495 · merge · IronPan, Nov 28, 2018
94edfa8 · update image builder image · IronPan, Nov 28, 2018
2fdb5c7 · update script · IronPan, Nov 29, 2018
776034f · mount permission · IronPan, Nov 29, 2018
fix
IronPan committed Nov 21, 2018
commit ab93fe8901f4a632a8882a3eaaa2c259e65ddbbf
110 changes: 62 additions & 48 deletions test/presubmit-tests.sh
@@ -22,15 +22,13 @@ usage()
[--workflow_file the file name of the argo workflow to run]
[--test_result_bucket the gcs bucket that argo workflow stores the result to. Default is ml-pipeline-test]
[--test_result_folder the gcs folder that argo workflow store the result to. Always a relative directory to gs://<gs_bucket>/[PULL_SHA]]
[--cluster-type the type of cluster to use for the tests. One of: create-gke,none. Default is create-gke ]
[--timeout timeout of the tests in seconds. Default is 1800 seconds. ]
[-h help]"
}

PROJECT=ml-pipeline-test
TEST_RESULT_BUCKET=ml-pipeline-test
GCR_IMAGE_BASE_DIR=gcr.io/ml-pipeline-test/${PULL_PULL_SHA}
CLUSTER_TYPE=create-gke
TIMEOUT_SECONDS=1800

while [ "$1" != "" ]; do
@@ -44,9 +42,6 @@ while [ "$1" != "" ]; do
--test_result_folder ) shift
TEST_RESULT_FOLDER=$1
;;
--cluster-type ) shift
CLUSTER_TYPE=$1
;;
--timeout ) shift
TIMEOUT_SECONDS=$1
;;
@@ -71,58 +66,77 @@ echo "presubmit test starts"
gcloud auth activate-service-account --key-file="${GOOGLE_APPLICATION_CREDENTIALS}"
gcloud config set compute/zone us-central1-a
Review comment (Contributor): Why are we hard-coding the zone?
Reply (IronPan, Member, Author): Currently we only requested quota for this zone.
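The review exchange above suggests the zone could be parameterized rather than hard-coded. A minimal sketch of one way to do that with a bash default expansion; the `GCP_ZONE` variable name is an assumption for illustration, not part of the script:

```shell
#!/usr/bin/env bash
# Hypothetical parameterization of the hard-coded zone: fall back to the
# current default (us-central1-a, where quota was requested) unless the
# caller exports GCP_ZONE. Echoing the gcloud command instead of running it
# keeps this sketch runnable without GCP credentials.
ZONE="${GCP_ZONE:-us-central1-a}"
echo "gcloud config set compute/zone ${ZONE}"
```

Running it unmodified prints the current behavior; `GCP_ZONE=us-west1-b ./script.sh` would switch zones without editing the script.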
if [ "$CLUSTER_TYPE" == "create-gke" ]; then
echo "create test cluster"
TEST_CLUSTER_PREFIX=${WORKFLOW_FILE%.*}
TEST_CLUSTER=${TEST_CLUSTER_PREFIX//_}-${PULL_PULL_SHA:0:10}-${RANDOM}

function delete_cluster {
echo "Delete cluster..."
gcloud container clusters delete ${TEST_CLUSTER} --async
}
trap delete_cluster EXIT

gcloud config set project ml-pipeline-test
gcloud config set compute/zone us-central1-a
gcloud container clusters create ${TEST_CLUSTER} \
--scopes cloud-platform \
--enable-cloud-logging \
--enable-cloud-monitoring \
--machine-type n1-standard-2 \
--num-nodes 3 \
--network test \
--subnetwork test-1

gcloud container clusters get-credentials ${TEST_CLUSTER}
fi

kubectl config set-context $(kubectl config current-context) --namespace=default
echo "Add necessary cluster role bindings"
ACCOUNT=$(gcloud info --format='value(config.account)')
kubectl create clusterrolebinding PROW_BINDING --clusterrole=cluster-admin --user=$ACCOUNT
kubectl create clusterrolebinding DEFAULT_BINDING --clusterrole=cluster-admin --serviceaccount=default:default

echo "install argo"
ARGO_VERSION=v2.2.0
mkdir -p ~/bin/
export PATH=~/bin/:$PATH
curl -sSL -o ~/bin/argo https://github.com/argoproj/argo/releases/download/$ARGO_VERSION/argo-linux-amd64
chmod +x ~/bin/argo
kubectl create ns argo
kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo/$ARGO_VERSION/manifests/install.yaml
# Install ksonnet
KS_VERSION="0.11.0"
curl -LO https://github.com/ksonnet/ksonnet/releases/download/v${KS_VERSION}/ks_${KS_VERSION}_linux_amd64.tar.gz
tar -xzf ks_${KS_VERSION}_linux_amd64.tar.gz
chmod +x ./ks_${KS_VERSION}_linux_amd64/ks
mv ./ks_${KS_VERSION}_linux_amd64/ks /usr/local/bin/

# Install kubeflow
KUBEFLOW_MASTER=$(pwd)/kubeflow_master
git clone https://github.com/kubeflow/kubeflow.git ${KUBEFLOW_MASTER}

## Download latest release source code
KUBEFLOW_SRC=$(pwd)/kubeflow_latest_release
mkdir ${KUBEFLOW_SRC}
cd ${KUBEFLOW_SRC}
export KUBEFLOW_TAG=v0.3.1
curl https://raw.githubusercontent.com/kubeflow/kubeflow/${KUBEFLOW_TAG}/scripts/download.sh | bash

## Override the pipeline config with code from master
cp -r ${KUBEFLOW_MASTER}/kubeflow/pipeline ${KUBEFLOW_SRC}/kubeflow/pipeline
cp -r ${KUBEFLOW_MASTER}/kubeflow/argo ${KUBEFLOW_SRC}/kubeflow/argo

TEST_CLUSTER_PREFIX=${WORKFLOW_FILE%.*}
TEST_CLUSTER=$(echo $TEST_CLUSTER_PREFIX | cut -d _ -f 1)-${PULL_PULL_SHA:0:7}-${RANDOM}

export CLIENT_ID=${RANDOM}
export CLIENT_SECRET=${RANDOM}
KFAPP=$(pwd)/${TEST_CLUSTER}

function clean_up {
echo "Clean up..."
cd ${KFAPP}
${KUBEFLOW_SRC}/scripts/kfctl.sh delete all
}
# trap clean_up EXIT

${KUBEFLOW_SRC}/scripts/kfctl.sh init ${KFAPP} --platform gcp --project ${PROJECT}
cd ${KFAPP}
echo "********* see generate platform"
${KUBEFLOW_SRC}/scripts/kfctl.sh generate platform
echo "********* see apply platform"
${KUBEFLOW_SRC}/scripts/kfctl.sh apply platform
echo "********* see generate k8s"
${KUBEFLOW_SRC}/scripts/kfctl.sh generate k8s
echo "********* see apply k8s "

## Update pipeline component image
pushd ks_app
ks param set pipeline apiImage ${GCR_IMAGE_BASE_DIR}/api:${PULL_PULL_SHA}
ks param set pipeline persistenceAgentImage ${GCR_IMAGE_BASE_DIR}/persistenceagent:${PULL_PULL_SHA}
ks param set pipeline scheduledWorkflowImage ${GCR_IMAGE_BASE_DIR}/scheduledworkflow:${PULL_PULL_SHA}
ks param set pipeline uiImage ${GCR_IMAGE_BASE_DIR}/frontend:${PULL_PULL_SHA}
popd

echo "********* parameter done"

${KUBEFLOW_SRC}/scripts/kfctl.sh apply k8s

echo "********* apply done"

gcloud container clusters get-credentials ${TEST_CLUSTER}

echo "submitting argo workflow for commit ${PULL_PULL_SHA}..."
ARGO_WORKFLOW=`argo submit $(dirname $0)/${WORKFLOW_FILE} \
-p commit-sha="${PULL_PULL_SHA}" \
-p test-results-gcs-dir="${TEST_RESULTS_GCS_DIR}" \
-p cluster-type="${CLUSTER_TYPE}" \
-p api-image="${GCR_IMAGE_BASE_DIR}/api" \
-p frontend-image="${GCR_IMAGE_BASE_DIR}/frontend" \
-p scheduledworkflow-image="${GCR_IMAGE_BASE_DIR}/scheduledworkflow" \
-p persistenceagent-image="${GCR_IMAGE_BASE_DIR}/persistenceagent" \
-o name
`
echo "argo workflow submitted successfully"

DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" > /dev/null && pwd)"
source "${DIR}/check-argo-status.sh"
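The diff above derives the test cluster name from the workflow file name and the pull SHA using bash parameter expansion. A runnable sketch of that derivation; the `WORKFLOW_FILE` and `PULL_PULL_SHA` values are illustrative samples, not from a real Prow job:

```shell
#!/usr/bin/env bash
# Sample inputs (illustrative only).
WORKFLOW_FILE="e2e_test_gke.yaml"
PULL_PULL_SHA="ab93fe8901f4a632a8882a3eaaa2c259e65ddbbf"

# Strip the file extension: e2e_test_gke.yaml -> e2e_test_gke
TEST_CLUSTER_PREFIX=${WORKFLOW_FILE%.*}

# Keep only the first underscore-separated token, then append a 7-char SHA
# prefix and a random suffix so concurrent runs get unique cluster names.
TEST_CLUSTER=$(echo "$TEST_CLUSTER_PREFIX" | cut -d _ -f 1)-${PULL_PULL_SHA:0:7}-${RANDOM}

echo "$TEST_CLUSTER"   # e.g. e2e-ab93fe8-12345
```

GKE cluster names must be short and hyphen-only, which is why the underscores are cut away before the name is assembled.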

@@ -22,13 +22,15 @@ usage()
[--workflow_file the file name of the argo workflow to run]
[--test_result_bucket the gcs bucket that argo workflow stores the result to. Default is ml-pipeline-test]
[--test_result_folder the gcs folder that argo workflow store the result to. Always a relative directory to gs://<gs_bucket>/[PULL_SHA]]
[--cluster-type the type of cluster to use for the tests. One of: create-gke,none. Default is create-gke ]
[--timeout timeout of the tests in seconds. Default is 1800 seconds. ]
[-h help]"
}

PROJECT=ml-pipeline-test
TEST_RESULT_BUCKET=ml-pipeline-test
GCR_IMAGE_BASE_DIR=gcr.io/ml-pipeline-test/${PULL_PULL_SHA}
CLUSTER_TYPE=create-gke
TIMEOUT_SECONDS=1800

while [ "$1" != "" ]; do
@@ -42,6 +44,9 @@ while [ "$1" != "" ]; do
--test_result_folder ) shift
TEST_RESULT_FOLDER=$1
;;
--cluster-type ) shift
CLUSTER_TYPE=$1
;;
--timeout ) shift
TIMEOUT_SECONDS=$1
;;
@@ -66,69 +71,58 @@ echo "presubmit test starts"
gcloud auth activate-service-account --key-file="${GOOGLE_APPLICATION_CREDENTIALS}"
gcloud config set compute/zone us-central1-a

# Install ksonnet
KS_VERSION="0.11.0"
curl -LO https://github.com/ksonnet/ksonnet/releases/download/v${KS_VERSION}/ks_${KS_VERSION}_linux_amd64.tar.gz
tar -xzf ks_${KS_VERSION}_linux_amd64.tar.gz
chmod +x ./ks_${KS_VERSION}_linux_amd64/ks
mv ./ks_${KS_VERSION}_linux_amd64/ks /usr/local/bin/

# Install kubeflow
KUBEFLOW_MASTER=$(pwd)/kubeflow_master
git clone https://github.com/kubeflow/kubeflow.git ${KUBEFLOW_MASTER}

## Download latest release source code
KUBEFLOW_SRC=$(pwd)/kubeflow_latest_release
mkdir ${KUBEFLOW_SRC}
cd ${KUBEFLOW_SRC}
export KUBEFLOW_TAG=v0.3.1
curl https://raw.githubusercontent.com/kubeflow/kubeflow/${KUBEFLOW_TAG}/scripts/download.sh | bash

## Override the pipeline config with code from master
cp -r ${KUBEFLOW_MASTER}/kubeflow/pipeline ${KUBEFLOW_SRC}/kubeflow/pipeline
cp -r ${KUBEFLOW_MASTER}/kubeflow/argo ${KUBEFLOW_SRC}/kubeflow/argo

TEST_CLUSTER_PREFIX=${WORKFLOW_FILE%.*}
TEST_CLUSTER=$(echo $TEST_CLUSTER_PREFIX | cut -d _ -f 1)-${PULL_PULL_SHA:0:7}-${RANDOM}

export CLIENT_ID=${RANDOM}
export CLIENT_SECRET=${RANDOM}
KFAPP=$(pwd)/${TEST_CLUSTER}

function clean_up {
echo "Clean up..."
cd ${KFAPP}
${KUBEFLOW_SRC}/scripts/kfctl.sh delete all
}
# trap clean_up EXIT

${KUBEFLOW_SRC}/scripts/kfctl.sh init ${KFAPP} --platform gcp --project ${PROJECT}
cd ${KFAPP}
${KUBEFLOW_SRC}/scripts/kfctl.sh generate platform
${KUBEFLOW_SRC}/scripts/kfctl.sh apply platform
${KUBEFLOW_SRC}/scripts/kfctl.sh generate k8s

## Update pipeline component image
pushd ks_app
ks param set pipeline apiImage ${GCR_IMAGE_BASE_DIR}/api:${PULL_PULL_SHA}
ks param set pipeline persistenceAgentImage ${GCR_IMAGE_BASE_DIR}/persistenceagent:${PULL_PULL_SHA}
ks param set pipeline scheduledWorkflowImage ${GCR_IMAGE_BASE_DIR}/scheduledworkflow:${PULL_PULL_SHA}
ks param set pipeline uiImage ${GCR_IMAGE_BASE_DIR}/frontend:${PULL_PULL_SHA}
popd

${KUBEFLOW_SRC}/scripts/kfctl.sh apply k8s

gcloud container clusters get-credentials ${TEST_CLUSTER}
if [ "$CLUSTER_TYPE" == "create-gke" ]; then
echo "create test cluster"
TEST_CLUSTER_PREFIX=${WORKFLOW_FILE%.*}
TEST_CLUSTER=${TEST_CLUSTER_PREFIX//_}-${PULL_PULL_SHA:0:10}-${RANDOM}

function delete_cluster {
echo "Delete cluster..."
gcloud container clusters delete ${TEST_CLUSTER} --async
}
trap delete_cluster EXIT

gcloud config set project ml-pipeline-test
gcloud config set compute/zone us-central1-a
gcloud container clusters create ${TEST_CLUSTER} \
--scopes cloud-platform \
--enable-cloud-logging \
--enable-cloud-monitoring \
--machine-type n1-standard-2 \
--num-nodes 3 \
--network test \
--subnetwork test-1

gcloud container clusters get-credentials ${TEST_CLUSTER}
fi

kubectl config set-context $(kubectl config current-context) --namespace=default
echo "Add necessary cluster role bindings"
ACCOUNT=$(gcloud info --format='value(config.account)')
kubectl create clusterrolebinding PROW_BINDING --clusterrole=cluster-admin --user=$ACCOUNT
kubectl create clusterrolebinding DEFAULT_BINDING --clusterrole=cluster-admin --serviceaccount=default:default

echo "install argo"
ARGO_VERSION=v2.2.0
mkdir -p ~/bin/
export PATH=~/bin/:$PATH
curl -sSL -o ~/bin/argo https://github.com/argoproj/argo/releases/download/$ARGO_VERSION/argo-linux-amd64
chmod +x ~/bin/argo
kubectl create ns argo
kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo/$ARGO_VERSION/manifests/install.yaml

echo "submitting argo workflow for commit ${PULL_PULL_SHA}..."
ARGO_WORKFLOW=`argo submit $(dirname $0)/${WORKFLOW_FILE} \
-p commit-sha="${PULL_PULL_SHA}" \
-p test-results-gcs-dir="${TEST_RESULTS_GCS_DIR}" \
-p cluster-type="${CLUSTER_TYPE}" \
-p api-image="${GCR_IMAGE_BASE_DIR}/api" \
-p frontend-image="${GCR_IMAGE_BASE_DIR}/frontend" \
-p scheduledworkflow-image="${GCR_IMAGE_BASE_DIR}/scheduledworkflow" \
-p persistenceagent-image="${GCR_IMAGE_BASE_DIR}/persistenceagent" \
-o name
`
echo "argo workflow submitted successfully"

DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" > /dev/null && pwd)"
source "${DIR}/check-argo-status.sh"
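Both versions of the script register a cleanup handler (`delete_cluster` or `clean_up`) with `trap ... EXIT` so the test cluster is torn down even when a test step fails. A self-contained sketch of that pattern; here the "cluster" is just a temp file so the example can run anywhere, whereas the real handlers call `gcloud container clusters delete` or `kfctl.sh delete all`:

```shell
#!/usr/bin/env bash
# Stand-in resource: a temp file instead of a GKE cluster.
TMP_MARKER=$(mktemp)

function clean_up {
  echo "Clean up..."
  rm -f "${TMP_MARKER}"
}
# The EXIT trap fires on normal exit AND when the script dies mid-run,
# which is what keeps presubmit runs from leaking test clusters.
trap clean_up EXIT

echo "doing work with ${TMP_MARKER}"
# When the script ends, clean_up runs and removes the marker file.
```

Commenting out the `trap` line (as the kfctl version of the script temporarily does with `# trap clean_up EXIT`) disables the teardown, which is handy for debugging a deployment but leaks the resource.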