-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update SageMaker components and sample pipeline #1682
Update SageMaker components and sample pipeline #1682
Conversation
Hi @carolynwang. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
parser.add_argument('--input_filter', type=str, required=False, help='A JSONPath expression used to select a portion of the input data to pass to the algorithm.', default='') | ||
parser.add_argument('--output_filter', type=str, required=False, help='A JSONPath expression used to select a portion of the joined dataset to save in the output file for a batch transform job.', default='') | ||
parser.add_argument('--join_source', choices=['None', 'Input', ''], required=False, help='Specifies the source of the data to join with the transformed data.', default='None') | ||
parser.add_argument('--instance_type', choices=['ml.m4.xlarge', 'ml.m4.2xlarge', 'ml.m4.4xlarge', 'ml.m4.10xlarge', 'ml.m4.16xlarge', 'ml.m5.large', 'ml.m5.xlarge', 'ml.m5.2xlarge', 'ml.m5.4xlarge', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: We need to consider how to better maintain this list. It could easily out of sync once SM rollout new instances
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to fetch supported instances using sagemaker api? like the way you get built-in algorithm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isnt a quick search through the sagemaker documentation
request['AlgorithmSpecification']['TrainingImage'] = args['image'] | ||
request['AlgorithmSpecification'].pop('AlgorithmName') | ||
else: | ||
# TODO: determine if users can make custom algorithm resources that have the same name as built-in algorithm names |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is custom algorithm supported here? Ideally, user should use BYOC or first party algorithms. No more options should be exposed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Custom algorithm is referring to algorithms that users create as an algorithm resource in SageMaker. The API has an AlgorithmName param for these custom algos and algos from AWS marketplace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Have you tried to create a custom algorithm using exist first-party algorithm name? I do think their API object is different or same name can not be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just tried it, created an algo with the name 'K-Means' and it worked so I need to adjust this implementation
/cc @Jeffwan |
/retest |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Jeffwan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
See the rationale at https://github.com/hadolint/hadolint/wiki/DL3042 In addition to space saving, it also ensures that pip doesn't install from local cache (especially in case of a rebuild) and reduces the chance of creating bugs during development.
Summary
@Jeffwan @mbaijal @gautamkmr @kalyc
Added/Updated Dockerfiles for individual components
Updates to HPO component
Updates to train, create model, deploy, and batch transform components
Add HPO and new parameters to MNIST classification sample pipeline
Testing
Updated components functionality
This change is