[AWS SageMaker] Integ test to check CloudWatch logs print feature (#4056)

* Integ test for cw logs

* Update license file version to 0.5.3

* update version in yaml

* add changelog
akartsky authored Jul 9, 2020
1 parent c6754e3 commit 799db47
Showing 15 changed files with 43 additions and 13 deletions.
6 changes: 6 additions & 0 deletions components/aws/sagemaker/Changelog.md
@@ -4,6 +4,12 @@ The version of the AWS SageMaker Components is determined by the docker image ta
Repository: https://hub.docker.com/repository/docker/amazon/aws-sagemaker-kfp-components

---------------------------------------------
**Change log for version 0.5.3**
- Add static error string in case of error fetching logs

> Pull requests : [#4056](https://github.com/kubeflow/pipelines/pull/4056)

**Change log for version 0.5.2**
- Modified outputs to use newer `outputPath` syntax

2 changes: 1 addition & 1 deletion components/aws/sagemaker/THIRD-PARTY-LICENSES.txt
@@ -1,4 +1,4 @@
** Amazon SageMaker Components for Kubeflow Pipelines; version 0.5.2 --
** Amazon SageMaker Components for Kubeflow Pipelines; version 0.5.3 --
https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker
Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
** boto3; version 1.12.33 -- https://github.com/boto/boto3/
2 changes: 1 addition & 1 deletion components/aws/sagemaker/batch_transform/component.yaml
@@ -98,7 +98,7 @@ outputs:
- {name: output_location, description: 'S3 URI of the transform job results.'}
implementation:
container:
image: amazon/aws-sagemaker-kfp-components:0.5.2
image: amazon/aws-sagemaker-kfp-components:0.5.3
command: ['python3']
args: [
batch_transform.py,
4 changes: 4 additions & 0 deletions components/aws/sagemaker/common/_utils.py
@@ -31,6 +31,9 @@
import logging
logging.getLogger().setLevel(logging.INFO)

# this error message is used in integration tests
CW_ERROR_MESSAGE = 'Error in fetching CloudWatch logs for SageMaker job'

# Mappings are extracted from the first table in https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-algo-docker-registry-paths.html
built_in_algos = {
'blazingtext': 'blazingtext',
@@ -106,6 +109,7 @@ def print_logs_for_job(cw_client, log_grp, job_name):

logging.info('\n******************** End of CloudWatch logs for {} {} ********************\n'.format(log_grp, job_name))
except Exception as e:
logging.error(CW_ERROR_MESSAGE)
logging.error(e)
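
The hunk above logs a fixed marker string before the exception itself, so that a test can search the output for one stable substring regardless of the underlying error. A minimal sketch of that pattern follows. It assumes a hypothetical `fetch_logs` callable in place of the real CloudWatch client, and `print_logs_safely`, `failing_fetch`, and `capture_error_output` are illustrative names that are not part of this commit.

```python
import io
import logging

# Same static string as CW_ERROR_MESSAGE in the hunk above.
CW_ERROR_MESSAGE = 'Error in fetching CloudWatch logs for SageMaker job'


def print_logs_safely(fetch_logs):
    """Log each fetched line; on failure, log the static marker, then the exception."""
    try:
        for line in fetch_logs():
            logging.info(line)
    except Exception as e:
        logging.error(CW_ERROR_MESSAGE)
        logging.error(e)


def failing_fetch():
    # Hypothetical stand-in for a CloudWatch call that fails.
    raise RuntimeError('access denied')


def capture_error_output(fetch_logs):
    """Run print_logs_safely with a temporary handler and return the logged text."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    root = logging.getLogger()
    root.addHandler(handler)
    try:
        print_logs_safely(fetch_logs)
    finally:
        root.removeHandler(handler)
    return stream.getvalue()
```

Calling `capture_error_output(failing_fetch)` returns text containing both the marker and the original exception message, which is what the integration tests later in this commit search for.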


2 changes: 1 addition & 1 deletion components/aws/sagemaker/deploy/component.yaml
@@ -104,7 +104,7 @@ outputs:
- {name: endpoint_name, description: 'Endpoint name'}
implementation:
container:
image: amazon/aws-sagemaker-kfp-components:0.5.2
image: amazon/aws-sagemaker-kfp-components:0.5.3
command: ['python3']
args: [
deploy.py,
2 changes: 1 addition & 1 deletion components/aws/sagemaker/ground_truth/component.yaml
@@ -119,7 +119,7 @@ outputs:
- {name: active_learning_model_arn, description: 'The ARN for the most recent Amazon SageMaker model trained as part of automated data labeling.'}
implementation:
container:
image: amazon/aws-sagemaker-kfp-components:0.5.2
image: amazon/aws-sagemaker-kfp-components:0.5.3
command: ['python3']
args: [
ground_truth.py,
Original file line number Diff line number Diff line change
@@ -150,7 +150,7 @@ outputs:
description: 'The registry path of the Docker image that contains the training algorithm'
implementation:
container:
image: amazon/aws-sagemaker-kfp-components:0.5.2
image: amazon/aws-sagemaker-kfp-components:0.5.3
command: ['python3']
args: [
hyperparameter_tuning.py,
2 changes: 1 addition & 1 deletion components/aws/sagemaker/model/component.yaml
@@ -59,7 +59,7 @@ outputs:
- {name: model_name, description: 'The model name SageMaker created'}
implementation:
container:
image: amazon/aws-sagemaker-kfp-components:0.5.2
image: amazon/aws-sagemaker-kfp-components:0.5.3
command: ['python3']
args: [
create_model.py,
2 changes: 1 addition & 1 deletion components/aws/sagemaker/process/component.yaml
@@ -89,7 +89,7 @@ outputs:
- {name: output_artifacts, description: 'A dictionary containing the output S3 artifacts'}
implementation:
container:
image: amazon/aws-sagemaker-kfp-components:0.5.2
image: amazon/aws-sagemaker-kfp-components:0.5.3
command: ['python3']
args: [
process.py,
Original file line number Diff line number Diff line change
@@ -6,6 +6,7 @@
from utils import minio_utils
from utils import sagemaker_utils
from utils import s3_utils
from utils import argo_utils


@pytest.mark.parametrize(
@@ -82,4 +83,7 @@ def test_transform_job(
)
assert s3_utils.check_object_exists(s3_client, s3_data_bucket, file_key)

assert not argo_utils.error_in_cw_logs(workflow_json["metadata"]["name"]), \
('Found the CloudWatch error message in the log output. Check SageMaker to see if the job has failed.')

utils.remove_dir(download_dir)
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@
from utils import kfp_client_utils
from utils import minio_utils
from utils import sagemaker_utils

from utils import argo_utils

@pytest.mark.parametrize(
"test_file_dir",
@@ -85,4 +85,7 @@ def test_processingjob(
for output in process_response["ProcessingOutputConfig"]["Outputs"]:
assert processing_outputs[output["OutputName"]] == output["S3Output"]["S3Uri"]

assert not argo_utils.error_in_cw_logs(workflow_json["metadata"]["name"]), \
('Found the CloudWatch error message in the log output. Check SageMaker to see if the job has failed.')

utils.remove_dir(download_dir)
Original file line number Diff line number Diff line change
@@ -4,6 +4,7 @@
from utils import kfp_client_utils
from utils import minio_utils
from utils import sagemaker_utils
from utils import argo_utils


@pytest.mark.parametrize(
@@ -77,4 +78,7 @@ def test_trainingjob(
else:
assert f"dkr.ecr.{region}.amazonaws.com" in training_image

assert not argo_utils.error_in_cw_logs(workflow_json["metadata"]["name"]), \
('Found the CloudWatch error message in the log output. Check SageMaker to see if the job has failed.')

utils.remove_dir(download_dir)
Original file line number Diff line number Diff line change
@@ -2,7 +2,16 @@


def print_workflow_logs(workflow_name):
output = utils.run_command(
f"argo logs {workflow_name} -n {utils.get_kfp_namespace()}"
)
output = get_workflow_logs(workflow_name)
print(f"workflow logs:\n", output.decode())

def find_in_logs(workflow_name, sub_str):
logs = get_workflow_logs(workflow_name).decode()
return logs.find(sub_str) >= 0

def get_workflow_logs(workflow_name):
return utils.run_command(f"argo logs {workflow_name} -n {utils.get_kfp_namespace()}")

def error_in_cw_logs(workflow_name):
ERROR_MESSAGE = 'Error in fetching CloudWatch logs for SageMaker job'
return find_in_logs(workflow_name, ERROR_MESSAGE)
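
The helpers above shell out to `argo logs`, so they need a live cluster to run. The sketch below shows the same substring check with the log source injected as a parameter, which makes the logic runnable on its own. The `get_logs` parameter and the two fake log functions are assumptions added for illustration and do not appear in this commit; `sub_str in logs` is used as an equivalent of `logs.find(sub_str) >= 0`.

```python
CW_ERROR_MESSAGE = 'Error in fetching CloudWatch logs for SageMaker job'


def find_in_logs(get_logs, workflow_name, sub_str):
    """Return True if sub_str occurs anywhere in the decoded workflow logs."""
    logs = get_logs(workflow_name).decode()
    return sub_str in logs


def error_in_cw_logs(get_logs, workflow_name):
    """Return True if the static CloudWatch error marker appears in the logs."""
    return find_in_logs(get_logs, workflow_name, CW_ERROR_MESSAGE)


def fake_clean_logs(workflow_name):
    # Hypothetical log output for a run whose CloudWatch logs printed normally.
    return b'INFO:root:Training job completed\n'


def fake_failed_logs(workflow_name):
    # Hypothetical log output for a run where fetching CloudWatch logs failed.
    return (
        b'ERROR:root:Error in fetching CloudWatch logs for SageMaker job\n'
        b'ERROR:root:access denied\n'
    )
```

In a test, `assert not error_in_cw_logs(fake_clean_logs, "some-workflow")` mirrors the assertions added to the transform, processing, and training tests in this commit, where the workflow name comes from `workflow_json["metadata"]["name"]`.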
2 changes: 1 addition & 1 deletion components/aws/sagemaker/train/component.yaml
@@ -104,7 +104,7 @@ outputs:
- {name: training_image, description: 'The registry path of the Docker image that contains the training algorithm'}
implementation:
container:
image: amazon/aws-sagemaker-kfp-components:0.5.2
image: amazon/aws-sagemaker-kfp-components:0.5.3
command: ['python3']
args: [
train.py,
2 changes: 1 addition & 1 deletion components/aws/sagemaker/workteam/component.yaml
@@ -36,7 +36,7 @@ outputs:
- {name: workteam_arn, description: 'The ARN of the workteam.'}
implementation:
container:
image: amazon/aws-sagemaker-kfp-components:0.5.2
image: amazon/aws-sagemaker-kfp-components:0.5.3
command: ['python3']
args: [
workteam.py,