Sample test improvement - using python fire to launch sample test #1897

Merged
81 commits merged on Aug 23, 2019
Changes from 31 commits
Commits
e0c7132
Remove redundant import.
numerology Jul 23, 2019
70972db
Simplify sample_test.yaml by using withItem syntax.
Jul 31, 2019
deb6cc0
Merge branch 'master' of https://github.com/kubeflow/pipelines
Jul 31, 2019
da9a42a
Simplify sample_test.yaml by using withItem syntax.
Jul 31, 2019
a203039
Merge branch 'master' of https://github.com/numerology/pipelines
Jul 31, 2019
81ccbf4
Change dict to str in withItems.
Jul 31, 2019
d94a945
Merge branch 'master' of https://github.com/kubeflow/pipelines
Jul 31, 2019
ee38eb1
Merge branch 'master' of https://github.com/kubeflow/pipelines
Aug 7, 2019
9b3ad8e
Merge branch 'master' of https://github.com/kubeflow/pipelines
Aug 8, 2019
4ba8f85
Merge branch 'master' of https://github.com/kubeflow/pipelines
Aug 8, 2019
2b39278
Merge branch 'master' of https://github.com/kubeflow/pipelines
Aug 9, 2019
3061791
Merge branch 'master' of https://github.com/kubeflow/pipelines
Aug 9, 2019
1cc64ec
Merge remote-tracking branch 'origin/master'
Aug 11, 2019
a42ac40
Merge branch 'master' of https://github.com/kubeflow/pipelines
Aug 12, 2019
2398017
Merge remote-tracking branch 'origin/master'
Aug 12, 2019
a0bd790
Merge branch 'master' of https://github.com/kubeflow/pipelines
Aug 13, 2019
433180c
Merge branch 'master' of https://github.com/kubeflow/pipelines
Aug 15, 2019
fcabe77
Merge branch 'master' of https://github.com/kubeflow/pipelines
Aug 15, 2019
dfc0b16
Add back coveralls.
Aug 15, 2019
fb734de
Squashed commit of the following:
Aug 16, 2019
2b6907e
WIP: rewrite sample test infra
Aug 16, 2019
55a8667
WIP: rewrite sample test infra
Aug 16, 2019
e063bc2
WIP: rewrite sample test infra
Aug 16, 2019
17348a8
Fixing
Aug 16, 2019
b77703b
WIP: add injection function in python launcher.
Aug 19, 2019
797b618
Refactor the logic of run_test.sh into sample_test_launcher.py
Aug 20, 2019
dcdefd3
Fix exit_code
Aug 20, 2019
aa12ff1
Fix imports and add todo
Aug 20, 2019
f85e10a
Squashed commit of the following:
Aug 20, 2019
34d25d4
Merge branch 'master' of https://github.com/kubeflow/pipelines into t…
Aug 20, 2019
de74804
Clean up.
Aug 20, 2019
b685c62
Fix launcher flag.
Aug 20, 2019
d1994d6
switch to sys.execxutable.
Aug 20, 2019
57763d0
Fix bugs
Aug 20, 2019
d3db0bc
Add copy blob util function
Aug 20, 2019
da3a5dd
Fix default image prefix.
Aug 20, 2019
950035f
Fix dependencies.
Aug 20, 2019
905610b
Fix gs op
Aug 20, 2019
1dab733
Lint
Aug 20, 2019
8c814ba
Fix papermill exec
Aug 20, 2019
ff26024
Init for componentTest
Aug 21, 2019
bc0d319
Debugging.
Aug 21, 2019
5dcc495
Try to fix the issue.
Aug 21, 2019
fffee19
Fix exit code
Aug 21, 2019
748e9d9
Another fix.
Aug 21, 2019
5e7a8c5
Squashed commit of the following:
Aug 21, 2019
c902070
Fix working dir
Aug 21, 2019
8ef3fa1
Fix exit code format
Aug 21, 2019
3a33af2
Merge branch 'master' of https://github.com/kubeflow/pipelines into t…
Aug 21, 2019
68a4089
Remove kfp notebook sample using tfx:oss components.
Aug 21, 2019
3de8843
Refactor to reduce potential dup code.
Aug 21, 2019
ad90563
Clear unused const.
Aug 21, 2019
fc6301a
Fix redundant check
Aug 21, 2019
23349b1
Add image injection for component test.
Aug 21, 2019
ad01f7d
Fix unused import
Aug 21, 2019
73b1409
Merge branch 'master' of https://github.com/kubeflow/pipelines into t…
Aug 21, 2019
f5a857d
Squash and merge from master
Aug 21, 2019
bed1135
Squashed commit of the following:
Aug 22, 2019
14dfd4e
# Conflicts:
Aug 22, 2019
a08fb85
Merge branch 'master' of https://github.com/kubeflow/pipelines into t…
Aug 22, 2019
e9b0cd8
Fix version for papermill
Aug 22, 2019
04546c5
Merge remote-tracking branch 'origin/test-improvement' into test-impr…
Aug 22, 2019
35ae6f5
Squashed commit of the following:
Aug 22, 2019
6fc6252
Merge branch 'master' of https://github.com/kubeflow/pipelines into t…
Aug 22, 2019
c5640bf
Fix google-cloud-storage and google fire version info.
Aug 22, 2019
d6a29b9
Delete run_test.sh
Aug 22, 2019
ad177e5
Merge from kfp master
Aug 23, 2019
0104c43
Squashed commit of the following:
Aug 23, 2019
364af5c
Merge branch 'master' of https://github.com/kubeflow/pipelines into t…
Aug 23, 2019
58937af
Fix check_notebook_results args.
Aug 23, 2019
8f21a36
Fix.
Aug 23, 2019
85f5391
Merge branch 'master' of https://github.com/kubeflow/pipelines into t…
Aug 23, 2019
e8694ab
Fix work dir path
Aug 23, 2019
9aed040
Fix global const naming convention
Aug 23, 2019
b8129b4
Remove redundant check.
Aug 23, 2019
7e627aa
Extract project name from result gcs dir.
Aug 23, 2019
0b3c323
Add error catching
Aug 23, 2019
2c07792
Name change.
Aug 23, 2019
911ceb3
Improve error catching.
Aug 23, 2019
55533a2
Improve error catching.
Aug 23, 2019
ee66780
naming correction.
Aug 23, 2019
3 changes: 2 additions & 1 deletion test/sample-test/Dockerfile
@@ -14,6 +14,7 @@ RUN pip3 install setuptools==40.5.0
RUN pip3 install papermill==0.16.1
RUN pip3 install ipykernel==5.1.0
RUN pip3 install google-api-python-client==1.7.0
RUN pip3 install fire

# Install python client, including DSL compiler.
COPY ./sdk/python /sdk/python
@@ -31,4 +32,4 @@ RUN ARGO_VERSION=v2.3.0 && curl -sSL -o /usr/local/bin/argo \
chmod +x /usr/local/bin/argo
ENV PATH $PATH:/usr/local/bin/argo

ENTRYPOINT ["/python/src/github.com/kubeflow/pipelines/test/sample-test/run_test.sh"]
ENTRYPOINT ["python3", "/python/src/github.com/kubeflow/pipelines/test/sample-test/sample_test_launcher.py", "sample_test"]
227 changes: 227 additions & 0 deletions test/sample-test/sample_test_launcher.py
@@ -0,0 +1,227 @@
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
This launcher module serves as the entry-point of the sample test image. It
decides which test to trigger based upon the arguments provided.
"""

import fire
import os
import papermill as pm
import subprocess
import uuid

from google.cloud import storage


# List of notebook samples' test names and corresponding file names.
NOTEBOOK_SAMPLES = {
'kubeflow_pipeline_using_TFX_OSS_components': 'KubeFlow Pipeline Using TFX OSS Components.ipynb',
'lightweight_component': 'Lightweight Python components - basics.ipynb',
'dsl_static_type_checking': 'DSL Static Type Checking.ipynb'
}


PROJECT_NAME = 'ml-pipeline-test'
PAPERMILL_ERR_MSG = 'An Exception was encountered at'


#TODO(numerology): Add unit-test for classes.
class SampleTest(object):
"""Launch a KFP sample_test provided its name.

Args:
test_name: name of the sample test.
input: The path of a pipeline package that will be submitted.
result: The path of the test result that will be exported.
output: The path of the test output.
namespace: Namespace of the deployed pipeline system. Default: kubeflow
"""

GITHUB_REPO = 'kubeflow/pipelines'
BASE_DIR = '/python/src/github.com/' + GITHUB_REPO
TEST_DIR = BASE_DIR + '/test/sample-test'

def __init__(self, test_name, test_results_gcs_dir, target_image_prefix,
namespace='kubeflow'):
self._test_name = test_name
self._results_gcs_dir = test_results_gcs_dir
# TODO(numerology): target_image_prefix seems to be only used for the
# post-submit check.
self._target_image_prefix = target_image_prefix
self._namespace = namespace
self._sample_test_result = 'junit_Sample%sOutput.xml' % self._test_name
self._sample_test_output = self._results_gcs_dir
self._work_dir = self.BASE_DIR + '/samples/core/' + self._test_name

self._run_test()

def check_result(self):
os.chdir(self.TEST_DIR)
subprocess.call([
'python3',

Contributor review comment: Nit: sys.executable is a bit better since it points to the working python binary.

Author: Thanks! Will do.

'run_sample_test.py',
'--input',
'%s/%s.yaml' % (self._work_dir, self._test_name),
'--result',
self._sample_test_result,
'--output',
self._sample_test_output,
'--testname',
self._test_name,
'--namespace',
self._namespace
])
print('Copy the test results to GCS %s/' % self._results_gcs_dir)
storage_client = storage.Client()
working_bucket = PROJECT_NAME

src_bucket = storage_client.get_bucket(working_bucket)
dest_bucket = src_bucket  # Currently copy to the same bucket.
src_bucket.copy_blob(
self._sample_test_result,
dest_bucket,
self._results_gcs_dir + '/' + self._sample_test_result)

def check_notebook_result(self):
# Workaround because papermill does not directly return exit code.
exit_code = 1 if PAPERMILL_ERR_MSG in \
open('%s.ipynb' % self._test_name).read() else 0
if self._test_name == 'dsl_static_type_checking':
subprocess.call([
'python3',
'check_notebook_results.py',
'--testname',
self._test_name,
'--result',
self._sample_test_result,
'--exit-code',
str(exit_code)
])
else:
subprocess.call([
'python3',
'check_notebook_results.py',
'--testname',
self._test_name,
'--result',
self._sample_test_result,
'--namespace',
self._namespace,
'--exit-code',
str(exit_code)
])

print('Copy the test results to GCS %s/' % self._results_gcs_dir)
storage_client = storage.Client()
working_bucket = PROJECT_NAME

src_bucket = storage_client.get_bucket(working_bucket)
dest_bucket = src_bucket  # Currently copy to the same bucket.
src_bucket.copy_blob(
self._sample_test_result,
dest_bucket,
self._results_gcs_dir + '/' + self._sample_test_result)

def _run_test(self):
if len(self._results_gcs_dir) == 0:
return 1

# variables needed for sample test logic.
input = '%s/%s.yaml' % (self._work_dir, self._test_name)
test_cases = [] # Currently, only capture run-time error, no result check.
sample_test_name = self._test_name + ' Sample Test'

os.chdir(self._work_dir)
print('Run the sample tests...')

# For presubmit check, do not do any image injection as for now.
# Notebook samples need to be papermilled first.
if self._test_name == 'kubeflow_pipeline_using_TFX_OSS_components':
bucket_prefix = 'gs://ml-pipeline-dataset/sample-test/taxi-cab-classification/'
pm.execute_notebook(
input_path='KubeFlow Pipeline Using TFX OSS Components.ipynb',
output_path='%s.ipynb' % self._test_name,
prepare_only=False,
parameters=dict(
EXPERIMENT_NAME='%s-test' % self._test_name,
OUTPUT_DIR=self._results_gcs_dir,
PROJECT_NAME=PROJECT_NAME,
BASE_IMAGE='%spusherbase:dev' % self._target_image_prefix,
TARGET_IMAGE='%spusher:dev' % self._target_image_prefix,
TARGET_IMAGE_TWO='%spusher_two:dev' % self._target_image_prefix,
KFP_PACKAGE='tmp/kfp.tar.gz',
DEPLOYER_MODEL='Notebook_tfx_taxi_%s' % uuid.uuid1(),
TRAINING_DATA=bucket_prefix + 'train50.csv',
EVAL_DATA=bucket_prefix + 'eval20.csv',
HIDDEN_LAYER_SIZE=10,
STEPS=50
)
)
self.check_notebook_result()
elif self._test_name == 'lightweight_component':
pm.execute_notebook(
input_path='Lightweight Python components - basics.ipynb',
output_path='%s.ipynb' % self._test_name,
prepare_only=False,
parameters=dict(
EXPERIMENT_NAME='%s-test' % self._test_name,
PROJECT_NAME=PROJECT_NAME,
KFP_PACKAGE='tmp/kfp.tar.gz',
)
)
self.check_notebook_result()
elif self._test_name == 'dsl_static_type_checking':
pm.execute_notebook(
input_path='DSL Static Type Checking.ipynb',
output_path='%s.ipynb' % self._test_name,
prepare_only=False,
parameters=dict(
KFP_PACKAGE='tmp/kfp.tar.gz',
)
)
self.check_notebook_result()
else:
subprocess.call(['dsl-compile', '--py', '%s.py' % self._test_name,
'--output', '%s.yaml' % self._test_name])
self.check_result()


class ComponentTest(SampleTest):
""" Launch a KFP sample test as component test provided its name.

Currently follows the same logic as sample test for compatibility.
"""
def __init__(self, test_name, input, result, output,
result_gcs_dir, target_image_prefix, dataflow_tft_image,
namespace='kubeflow'):
super().__init__(test_name, result_gcs_dir, target_image_prefix, namespace)
#TODO(numerology): finish this.
pass

def main():
"""Launches either KFP sample test or component test as a command entrypoint.

Usage:
python sample_test_launcher.py sample_test arg1 arg2 to launch sample test, and
python sample_test_launcher.py component_test arg1 arg2 to launch component
test.
"""
fire.Fire({
'sample_test': SampleTest,
'component_test': ComponentTest
})

if __name__ == '__main__':
main()
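
Taken together with the Dockerfile change, an invocation of the launcher outside the image would look roughly like the following (the bucket, directory, and image prefix are placeholders, not values from this PR; the test name is one of the notebook samples listed above):

python3 sample_test_launcher.py sample_test --test_name=lightweight_component --test_results_gcs_dir=gs://<results-bucket>/<dir> --target_image_prefix=gcr.io/<project>/ --namespace=kubeflow
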
28 changes: 26 additions & 2 deletions test/sample-test/utils.py
@@ -12,9 +12,12 @@
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import re
import subprocess

from minio import Minio
from junit_xml import TestSuite, TestCase
import subprocess

# Parse the workflow json to obtain the artifacts for a particular step.
# Note: the step_name could be the key words.
@@ -59,4 +62,25 @@ def run_bash_command(cmd):
output_string = output_bytes.decode('utf-8')
if error_bytes != None:
error_string = error_bytes.decode('utf-8')
return output_string, error_string
return output_string, error_string


def file_injection(file_in, file_out, subs):
"""Utility function that substitute several regex within a file by
corresponding string.

:param file_in: input file name.
:param file_out: output file name.
:param subs: dict, key is the regex expr, value is the substituting string.
"""
with open(file_in, 'rt') as fin:
with open(file_out, 'wt') as fout:
for line in fin:
tmp_line = line
for old, new in subs.items():
regex = re.compile(old)
tmp_line = re.sub(regex, new, tmp_line)

fout.write(tmp_line)

os.rename(file_out, file_in)
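
A hedged usage sketch of file_injection (the pipeline file name and image regex below are illustrative, not taken from this diff): the helper writes the substituted lines to a temporary file and then renames that file over the original, so the injection happens in place.

from utils import file_injection

# Illustrative only: swap a released component image for a locally built one
# inside a compiled pipeline YAML.
subs = {
    r'gcr\.io/ml-pipeline/ml-pipeline-dataflow-tft:\w+': 'gcr.io/my-project/dataflow-tft:dev',
}
file_injection('my_pipeline.yaml', 'my_pipeline.yaml.tmp', subs)
# my_pipeline.yaml now holds the rewritten spec; the .tmp file has been renamed onto it.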