Feature: sidecar for ContainerOp #879

Merged
merged 15 commits into from
Mar 28, 2019

@eterna2 eterna2 commented Feb 28, 2019

Motivation

I am using kubeflow pipelines to run simulation experiments, and because of the complexity of the simulation setup, it requires supporting containers which I might want to parameterize.

Status

Waiting for Review

  • all unit tests are working
  • updated to work with latest master HEAD
  • TODO?: more comprehensive test on generating complex template with PipelineParams everywhere.

New or updated Features

This PR adds the following features to kfp package's dsl and compiler modules:

  • allows PipelineParam for all valid attributes/properties (both string serialized and object instance) in dsl.ContainerOp
  • allows sidecars for workflow template (via dsl.ContainerOp.add_sidecar) with dsl.Sidecar class
  • allows parameterization of container property for argo workflow template (via dsl.ContainerOp.container)
    (i.e. you can set any valid attributes/properties for a k8s V1Container object)

NOTE: All changes are completely backward compatible, with appropriate pending deprecation warnings.

Example

from kfp import dsl
from kubernetes.client.models import V1EnvVar

@dsl.pipeline(
    name='foo',
    description='hello world')
def foo_pipeline(tag: str, pull_image_policy: str):

    # any attribute can be parameterized (either a serialized string or an actual PipelineParam)
    op = dsl.ContainerOp(name='foo',
                         # '%s' is required for the tag to be interpolated
                         image='busybox:%s' % tag,
                         # pass in sidecars list
                         sidecars=[dsl.Sidecar('print', 'busybox:latest', command='echo "hello"')],
                         # pass in k8s container kwargs
                         container_kwargs={'env': [V1EnvVar('foo', 'bar')]})

    # set `imagePullPolicy` property for `container` with `PipelineParam`
    op.container.set_image_pull_policy(pull_image_policy)

    # add sidecar with parameterized image tag
    # sidecar follows the argo sidecar swagger spec
    op.add_sidecar(dsl.Sidecar('redis', 'redis:%s' % tag).set_image_pull_policy('Always'))

TLDR changes

This PR will make the following changes:

  • Created Container and Sidecar classes,
    which inherit from V1Container

    • updated both classes with methods to update k8s properties
  • Updated ContainerOp

    • added kwargs: container_kwargs and sidecars
    • added attr _container (and getter container) which holds the Container instance
    • added PendingDeprecationWarning for container properties and methods
    • changed inputs to scan through all qualified attributes in ContainerOp to get all PipelineParam
  • Updated PipelineParam

    • added pattern attr to hold regex pattern extracted from serialized string
    • added method to recursively extract PipelineParam from any object
  • ContainerOp to template

    • moved _op_to_template to its own module
    • added method to recursively replace PipelineParam (and its serialized form) with the appropriate inputs.parameters.%s
  • Added appropriate unit test for updates
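
The recursive extraction step listed above can be illustrated with a minimal sketch. It does not use kfp: `Param`, `extract_params`, and the `{{param:NAME}}` placeholder syntax are hypothetical stand-ins chosen for this illustration, not the classes or the serialization format this PR implements.

```python
import re
from typing import Any, List, NamedTuple


class Param(NamedTuple):
    """Hypothetical stand-in for a pipeline parameter (not kfp's PipelineParam)."""
    name: str


# Assumed placeholder syntax for a serialized parameter: "{{param:NAME}}".
_PATTERN = re.compile(r"\{\{param:([A-Za-z0-9_-]+)\}\}")


def extract_params(obj: Any) -> List[Param]:
    """Recursively collect parameters from strings, containers and object attributes."""
    found: List[Param] = []

    def visit(value: Any) -> None:
        if isinstance(value, Param):
            found.append(value)
        elif isinstance(value, str):
            # A serialized parameter may be embedded anywhere inside a string.
            found.extend(Param(name) for name in _PATTERN.findall(value))
        elif isinstance(value, dict):
            for item in value.values():
                visit(item)
        elif isinstance(value, (list, tuple, set)):
            for item in value:
                visit(item)
        elif hasattr(value, "__dict__"):
            # Plain objects: walk their instance attributes.
            for item in vars(value).values():
                visit(item)

    visit(obj)
    # De-duplicate while preserving first-seen order.
    return list(dict.fromkeys(found))


spec = {"image": "busybox:{{param:tag}}", "env": [Param("policy"), "x-{{param:tag}}"]}
params = extract_params(spec)
```

Here `params` holds each distinct parameter once, whether it appeared as an object or embedded in a string.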



@k8s-ci-robot
Contributor

Hi @eterna2. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Contributor

@animeshsingh animeshsingh left a comment

I am trying to understand the use case more. The idea here is that in a single step, we can launch multiple sidecar containers, synchronously or asynchronously? Why not kick off instances of another pipeline from a step, if the step requires replicated efforts?

@eterna2
Contributor Author

eterna2 commented Mar 1, 2019

@animeshsingh

Asynchronously. The sidecars are services rather than jobs (i.e. mq brokers, db, rest services, ...)

I am running a simulation job that is reliant on a few other services. If I provision these services as parallel steps, I would need to add additional scripts to send a termination signal to these services when the simulation job completes.
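
To make the shape concrete, here is a rough sketch of how a step with service sidecars could be laid out as an Argo-style template, where `sidecars` sits next to the main `container` in the same pod. The helper `build_template` and the image names are hypothetical, and the dict only approximates the template structure; it is not the output of the compiler in this PR.

```python
from typing import Dict, List, Optional


def build_template(name: str, image: str,
                   sidecars: Optional[List[Dict]] = None) -> Dict:
    """Build a simplified Argo-style template dict (illustrative only)."""
    template: Dict = {"name": name, "container": {"image": image}}
    if sidecars:
        # Sidecars share the pod with the main container, so no separate
        # pipeline step (and no manual termination script) is involved.
        template["sidecars"] = [dict(sidecar) for sidecar in sidecars]
    return template


simulation = build_template(
    "simulation",
    "my-simulation:latest",  # hypothetical image name
    sidecars=[
        {"name": "broker", "image": "rabbitmq:3"},
        {"name": "db", "image": "redis:5"},
    ],
)
```

With this layout the services live and die with the step's pod, which is the behaviour the parallel-step alternative would need extra termination scripts to reproduce.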

@ukclivecox
Contributor

Is it worth adding an example or test to illustrate how a sidecar can be added?

@eterna2
Contributor Author

eterna2 commented Mar 5, 2019

@cliveseldon
Sure. I will add an @example as part of the pydoc comment for ContainerOp.

Contributor

@animeshsingh animeshsingh left a comment

Should we have a test example to test the functionality?



class ContainerOp(ContainerBase):
    """Represents an op implemented by a docker container image."""
Contributor

Given docker is being phased out, may make sense to just call these 'container image'

Contributor

Given docker is being phased out, may make sense to just call these 'container image'

Can you please provide some link documenting that?

Contributor

well @Ark-kun - it's phased out from IBM's official Kubernetes service
https://www.ibm.com/blogs/bluemix/2019/01/ibm-cloud-kubernetes-service-supports-containerd/

Other providers are also discussing moving away - btw GCP launched this in beta in Nov as well
https://cloud.google.com/blog/products/containers-kubernetes/containerd-available-for-beta-testing-in-google-kubernetes-engine

@texasmichelle
Member

👍 to adding sidecar support. Thank you everyone for working on this!

@eterna2 eterna2 force-pushed the eterna2/kfp-containerOps-sidecar branch from 256e846 to 1f97e2f Compare March 13, 2019 11:38
… well as sidecars with Sidecar class. ContainerOp accepts PipelineParam in any valid k8 properties.
@eterna2 eterna2 force-pushed the eterna2/kfp-containerOps-sidecar branch from 1f97e2f to 383a4f5 Compare March 13, 2019 11:43
@hongye-sun
Contributor

/lgtm
/approve

Travis has an outage right now. Since the e2e test is passing, I will add the approve label; feel free to merge once the Travis build passes.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hongye-sun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hongye-sun
Contributor

/test kubeflow-pipeline-sample-test

def _validate_cpu_string(self, cpu_string):
    "Validate a given string is valid for cpu request or limit."

    if isinstance(cpu_string, _pipeline_param.PipelineParam):
Contributor

Could we extract this block of code as a separate internal function because it has been used by many functions here?
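
One possible shape for that refactoring is a small decorator that bypasses validation when the value is a parameter placeholder. This is only a sketch: `PipelineParam` below is a local stand-in rather than the kfp class, `skip_if_pipeline_param` is a hypothetical name, and the `float` fallback for plain numeric CPU values is an assumption added for the example.

```python
import re
from functools import wraps


class PipelineParam:
    """Local stand-in for kfp's PipelineParam, for illustration only."""

    def __init__(self, name):
        self.name = name


def skip_if_pipeline_param(validator):
    """Skip validation for placeholders whose value is only known at run time."""
    @wraps(validator)
    def wrapper(value):
        if isinstance(value, PipelineParam):
            return None
        return validator(value)
    return wrapper


@skip_if_pipeline_param
def validate_cpu_string(cpu_string):
    """Validate that a string is usable as a cpu request or limit."""
    # Millicpu form such as "500m".
    if re.match(r'^[0-9]+m$', cpu_string) is not None:
        return
    # Assumed fallback: a plain number such as "2" or "0.5".
    try:
        float(cpu_string)
    except ValueError:
        raise ValueError('Invalid cpu string: %r' % cpu_string)
```

The same decorator could then wrap the memory and GPU validators, so the `isinstance` check lives in one place.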

return value if isinstance(value, list) else [value]


def create_and_append(current_list: Union[List[T], None], item: T) -> List[T]:
Contributor

If this is internally used in this module, could we rename it such that it starts with an underscore?

    self.image_pull_policy = image_pull_policy
    return self

def add_port(self, container_port):
Contributor

Maybe add a TODO to support PipelineParam for the other configurations besides CPU, memory, etc.

Contributor Author

It is already supported - in the compiler._op_to_template there is a recursive method to replace PipelineParam and its serialized form into argo variables.
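
As a rough illustration of that idea, independent of the actual code in compiler._op_to_template: the sketch below walks a nested structure and rewrites serialized placeholders into Argo input variables. The `{{param:NAME}}` placeholder syntax and the function name `to_argo_variables` are assumptions made for this example, not kfp's real serialization format.

```python
import re
from typing import Any

# Assumed placeholder syntax for a serialized parameter: "{{param:NAME}}".
_PLACEHOLDER = re.compile(r"\{\{param:([A-Za-z0-9_-]+)\}\}")


def to_argo_variables(obj: Any) -> Any:
    """Recursively rewrite placeholders as {{inputs.parameters.NAME}}."""
    if isinstance(obj, str):
        return _PLACEHOLDER.sub(r"{{inputs.parameters.\1}}", obj)
    if isinstance(obj, dict):
        return {key: to_argo_variables(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [to_argo_variables(item) for item in obj]
    # Numbers, booleans, None and other leaves pass through untouched.
    return obj


container = {
    "image": "redis:{{param:tag}}",
    "args": ["--port", "{{param:port}}"],
    "replicas": 1,
}
rendered = to_argo_variables(container)
```

Because the walk is generic, any field that holds a placeholder (ports, env values, resource limits) is covered without per-field handling.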


if re.match(r'^[0-9]+m$', cpu_string) is not None:
    return
# util functions
Contributor

Let's move the general utility functions into a separate module.

Contributor

This is more container_op related but I think it would be helpful to define a decorator that helps for the future deprecation warnings.
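
A decorator along these lines might look like the sketch below. The names `pending_deprecation`, `Container`, and `LegacyOp` are hypothetical and only demonstrate the pattern with the standard `warnings` module; they are not part of kfp.

```python
import warnings
from functools import wraps


def pending_deprecation(replacement):
    """Mark a method as pending deprecation and point callers at its replacement."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                '%s is pending deprecation; use %s instead.'
                % (func.__name__, replacement),
                PendingDeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


class Container:
    """Hypothetical container holder."""

    def __init__(self):
        self.image_pull_policy = None

    def set_image_pull_policy(self, policy):
        self.image_pull_policy = policy
        return self


class LegacyOp:
    """Hypothetical op that forwards an old method to its container."""

    def __init__(self):
        self.container = Container()

    @pending_deprecation('container.set_image_pull_policy')
    def set_image_pull_policy(self, policy):
        self.container.set_image_pull_policy(policy)
        return self
```

Old call sites keep working and return the op for chaining, while the warning tells users where the method is moving.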

@hongye-sun
Contributor

/test kubeflow-pipeline-sample-test

@hongye-sun
Contributor

/test kubeflow-pipeline-sample-test

@hongye-sun
Contributor

/hold cancel

@hongye-sun
Contributor

@eterna2

Thanks a lot for contributing this PR. It helps unlock many of our use cases (not just sidecars) for k8s resource orchestration.

@vicaire
Contributor

vicaire commented Mar 29, 2019

+1. Thank you @eterna2

Ark-kun added a commit to Ark-kun/pipelines that referenced this pull request Mar 29, 2019
Ark-kun added a commit that referenced this pull request Apr 4, 2019
Linchin pushed a commit to Linchin/pipelines that referenced this pull request Apr 11, 2023
Migrate xgboost-operator repo to new test-infra
magdalenakuhn17 pushed a commit to magdalenakuhn17/pipelines that referenced this pull request Oct 22, 2023
9 participants