SDK/Compiler - Added the ability to apply a function to all ops in a pipeline #1209

Conversation

@Ark-kun (Contributor) commented Apr 23, 2019

They can be used to apply some functions (e.g. to add secrets) to all pipeline ops.


@gaoning777 (Contributor)

Would you mind explaining why we need to add the transformers?

@Ark-kun (Contributor, Author) commented Apr 24, 2019

> Would you mind explaining why we need to add the transformers?

"They can be used to apply some functions (e.g. to add secrets) to all pipeline ops."

So it's basically apply_to_all_ops.
Some people want to simplify their pipelines a bit and stop applying the same functions many times.
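For context, the repetitive pattern that this is meant to replace looks roughly like the sketch below. The op names and images are made up; gcp.use_gcp_secret('user-gcp-sa') is the helper already discussed in this thread, and op.apply() is the existing per-op mechanism.

from kfp import dsl, gcp

@dsl.pipeline(name='example-pipeline')
def example_pipeline():
    # Today every op needs the exact same call:
    train = dsl.ContainerOp(name='train', image='gcr.io/example/train')
    train.apply(gcp.use_gcp_secret('user-gcp-sa'))

    evaluate = dsl.ContainerOp(name='evaluate', image='gcr.io/example/evaluate')
    evaluate.apply(gcp.use_gcp_secret('user-gcp-sa'))

    deploy = dsl.ContainerOp(name='deploy', image='gcr.io/example/deploy')
    deploy.apply(gcp.use_gcp_secret('user-gcp-sa'))

With a pipeline-wide op transformer, those three identical .apply() calls collapse into a single registration.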

@gaoning777 (Contributor)

I think the exposed interface should be in the DSL instead of the compiler. In other words, this interface would ideally be exposed in the DSL. WDYT?

@Ark-kun (Contributor, Author) commented Apr 25, 2019

> I think the exposed interface should be in the DSL instead of the compiler. In other words, this interface would ideally be exposed in the DSL. WDYT?

That's a bit problematic, because the ops are only created during compilation.
I'll think a bit more about doing it some other way, but this option seems to be a cheap way to tackle a practical issue.

Generally, my goal here is to move the non-portable and repetitive things like attaching secrets out of the pipeline function so that it can be portable across clouds.

P.S. I'll remove the template modifiers, as they expose Argo.

@Ark-kun (Contributor, Author) commented May 14, 2019

I've removed the template_transformers so now only op_transformers remain.

@Ark-kun (Contributor, Author) commented May 14, 2019

An example PR that would have been trivial with this: #749

It could have just been a simple switch:

from kfp import gcp, onprem

transformers = []
if environment == 'gcp':
  transformers.append(gcp.use_gcp_secret('user-gcp-sa'))
else:
  transformers.append(onprem.mount_pvc(pvc_name, 'local-storage', pvc_mount_path))
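As a sketch of how such a list could then be applied once rather than per op, a single combined transformer along these lines would do; apply_all is a made-up name, and op.apply(f) simply calls f(op):

def apply_all(op):
    # Run every environment-specific transformer from the list above on the op.
    for transform in transformers:
        op.apply(transform)
    return op

That one function is then the only thing that needs to be registered for the whole pipeline.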

@Ark-kun changed the title from "SDK/Compiler - Added op and template transformers" to "SDK/Compiler - Added the ability to apply a function to all ops in a pipeline" on May 14, 2019
@Ark-kun (Contributor, Author) commented May 15, 2019

@gaoning777 Can you please take a look again?

@Ark-kun (Contributor, Author) commented May 17, 2019

Gentle ping
/cc @gaoning777

@jinchihe (Member)

@Ark-kun This is a great PR. Just want to confirm: do we need to consider the case where the user runs dsl-compile --py xxx.py --output xxx.yaml? There the op_transformers cannot be set, so it seems we need to expose an option for op_transformers in dsl-compile?

And one quick question here: you used environment to switch between gcp and onprem; where does environment come from? I think it's set in the function header of the pipeline Python DSL file, right? Personally I think we should have a better way to let users define the variable that switches between use cases. Thanks a lot!

@gaoning777 (Contributor) commented May 20, 2019

> I think the exposed interface should be in the DSL instead of the compiler. In other words, this interface would ideally be exposed in the DSL. WDYT?

> That's a bit problematic, because the ops are only created during compilation.

How about passing the transformers to PipelineConf? I'm more inclined to expose this as a DSL interface instead of a compiler option.

> I'll think a bit more about doing it some other way, but this option seems to be a cheap way to tackle a practical issue.

> Generally, my goal here is to move the non-portable and repetitive things like attaching secrets out of the pipeline function so that it can be portable across clouds.

> P.S. I'll remove the template modifiers, as they expose Argo.

@Ark-kun (Contributor, Author) commented May 20, 2019

> How about passing the transformers to PipelineConf?

Hmm. I guess it will solve some of the problems. Thanks for the suggestion.
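For reference, a sketch of what the PipelineConf-based usage could look like. The method and parameter names here (add_op_transformer, pipeline_conf) are taken from how the SDK later exposed this and may not match this PR's code exactly.

import kfp.compiler
from kfp import dsl, gcp

@dsl.pipeline(name='example-pipeline')
def example_pipeline():
    dsl.ContainerOp(name='step', image='alpine')

conf = dsl.PipelineConf()
conf.add_op_transformer(gcp.use_gcp_secret('user-gcp-sa'))

# The transformer then runs on every op when the pipeline is compiled.
kfp.compiler.Compiler().compile(example_pipeline, 'example_pipeline.yaml', pipeline_conf=conf)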

@Ark-kun (Contributor, Author) commented May 20, 2019

> do we need to consider the case where the user runs dsl-compile --py xxx.py --output xxx.yaml? There the op_transformers cannot be set, so it seems we need to expose an option for op_transformers in dsl-compile?

You're right. This is an important use case. (I keep forgetting about it since I just run the pipeline files directly instead of using the compiler.)

> And one quick question here: you used environment to switch between gcp and onprem; where does environment come from?

I was thinking about either having a constant in the pipeline program code or reading the value from an environment variable.
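For illustration, the environment-variable option could be as small as this; the variable name PIPELINE_ENV is made up:

import os

# Falls back to 'gcp' when the variable is not set.
environment = os.environ.get('PIPELINE_ENV', 'gcp')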

@jinchihe (Member)

@Ark-kun One more question: do you plan to handle the case where we want to apply the function to all steps except the first one? For example, the first step creates the PVC, but the following steps need to apply the PVC mount function. Thanks a lot!

@gaoning777 (Contributor)

For more customized scenarios, why not write a Python for loop to do it? For example:

ops = []  # the ops you want to modify
for op in ops:
  op.apply(gcp.use_gcp_secret('user-gcp-sa'))

Then you can choose any op in the ops list.
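A concrete (and entirely hypothetical) version of that, matching the PVC scenario raised above: the step that creates the PVC is left untouched and only the downstream ops get the mount applied. The op names and images are placeholders; onprem.mount_pvc is the existing kfp helper.

from kfp import dsl, onprem

@dsl.pipeline(name='pvc-example')
def pvc_pipeline():
    pvc_name = 'my-pvc'
    pvc_mount_path = '/mnt/data'

    # The first step creates the PVC, so it must not get the mount applied.
    create_pvc = dsl.ContainerOp(name='create-pvc', image='example/create-pvc')

    train = dsl.ContainerOp(name='train', image='example/train').after(create_pvc)
    evaluate = dsl.ContainerOp(name='evaluate', image='example/evaluate').after(train)

    # Apply the mount only to the downstream ops.
    for op in [train, evaluate]:
        op.apply(onprem.mount_pvc(pvc_name, 'local-storage', pvc_mount_path))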

@Ark-kun (Contributor, Author) commented May 21, 2019

> do you plan to handle the case where we want to apply the function to all steps except the first one?

The transformer function can inspect the op and decide whether to modify it or not. For example, it can skip ops with a particular name, or all resource ops.
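A sketch of such a selective transformer; the 'create-pvc' op name is made up, and the secret helper is the one used throughout this thread:

from kfp import gcp

def add_secret_to_non_pvc_ops(op):
    # Skip the op that creates the PVC (matched by name here); every other op
    # gets the GCP secret attached. Returning the op keeps transformers composable.
    if op.name != 'create-pvc':
        op.apply(gcp.use_gcp_secret('user-gcp-sa'))
    return op

Registered as an op transformer, this function still runs on every op, but only modifies the ones that pass the check.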

@jinchihe (Member)

> For more customized scenarios, why not write a Python for loop to do it? For example:
>
> ops = []  # the ops you want to modify
> for op in ops:
>   op.apply(gcp.use_gcp_secret('user-gcp-sa'))
>
> Then you can choose any op in the ops list.

Yes, in fact I've already done that in PR #749 :-)

@Ark-kun (Contributor, Author) commented May 22, 2019

> How about passing the transformers to PipelineConf?

Done.
/cc @gaoning777

@gaoning777 (Contributor)

Thanks for the change.
/lgtm

@Ark-kun (Contributor, Author) commented May 23, 2019

/approve

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ark-kun


@k8s-ci-robot merged commit 2a9bbdf into kubeflow:master on May 23, 2019