Extend the DSL to implement the design of #801 #926
Conversation
Hi @elikatsis. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Force-pushed from c6e3304 to fc83d15
I force pushed to ensure commits are in proper chronological order.
/ok-to-test
Force-pushed from 0b03f0c to 5fef914
Thanks a lot for submitting the PR. I will review it shortly.
I left some comments on an alternative way to implement this: not specializing volume and snapshot in the DSL, and instead still relying on pipeline params to pass data and dependencies. I feel this pattern can be applied to more generic k8s resources, not just volumes.
```python
self.data_source_name = None
self.mode = None
self.deps = set()
self.k8s_resource = None
```
I really think it should have a base class named K8sResourceOp, in parallel with ContainerOp, to create general k8s resources. Volume and snapshot would be two specialized ones deriving from it. A lot of code here could be reused.
I think it's also reasonable to have a PR to refactor this in the future.
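To make the suggested hierarchy concrete, here is a minimal sketch. It is illustrative only: `K8sResourceOp` is the name proposed above, and the `deps`/`k8s_resource` fields mirror the snippet quoted earlier; all other details are hypothetical.

```python
# Illustrative sketch only -- not code from this PR.
class K8sResourceOp:
    """Base op: creates an arbitrary K8s resource from a resource spec."""
    def __init__(self, name, k8s_resource):
        self.name = name
        self.k8s_resource = k8s_resource  # full resource spec (dict or model)
        self.deps = set()                 # names of upstream ops

    def after(self, *ops):
        """Declare explicit dependencies, mirroring ContainerOp.after()."""
        self.deps.update(op.name for op in ops)
        return self


class VolumeOp(K8sResourceOp):
    """Specialization that builds a PersistentVolumeClaim spec."""
    def __init__(self, name, size, storage_class=None):
        spec = {"accessModes": ["ReadWriteMany"],
                "resources": {"requests": {"storage": size}}}
        if storage_class:
            spec["storageClassName"] = storage_class
        super().__init__(name, {
            "apiVersion": "v1",
            "kind": "PersistentVolumeClaim",
            "metadata": {"name": name},
            "spec": spec,
        })
```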
+1 on having a K8sResourceOp
> I really think it should have a base class named K8sResourceOp, in parallel with ContainerOp, to create general k8s resources. Volume and snapshot would be two specialized ones deriving from it. A lot of code here could be reused.
>
> +1 on having a K8sResourceOp
Hello @hongye-sun, @vicaire, thanks for the review!
This is a great suggestion. I am amending the design doc to include your comment and a suggestion on how we can do it. We have tested it with an initial implementation of the code and it works beautifully. Essentially, it gives the user a way to create a custom K8s resource from a resource spec, and have attributes of the K8s resource it creates be PipelineParams for the rest of the pipeline.
Let's continue the discussion in the design doc, I will submit an extension shortly.
The new design looks awesome and I really like it. It aligns with our vision of DSL and provides an elegant way to orchestrate arbitrary k8s resources. Let's continue our discussion on the design doc.
By the way, it's worth taking a look at #879. In that PR, @eterna2 followed the same pattern to make ContainerOp and Container (which inherits from V1Container) two different things, so that:
- We are able to expose the full k8s container spec from ContainerOp
- SideCar can inherit from Container
- (in the future) We can create a TFJob which contains a Container. Then, all the convenient APIs that we support in Container, like param parsing, volume mounting, etc., can simply be reused.
That PR also implements a more generic way to extract PipelineParams from any object as long as it follows the convention from swagger generated code. This effort should make ResourceOp implementation a lot easier.
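Roughly, the pattern from #879 separates the two concepts like this (a sketch assuming the kubernetes Python client; everything beyond the class names mentioned above is hypothetical):

```python
from kubernetes.client.models import V1Container


class Container(V1Container):
    """Full k8s container spec, plus DSL conveniences (param parsing, etc.)."""


class SideCar(Container):
    """Sidecars can inherit everything from Container."""


class ContainerOp:
    """A pipeline step *has a* Container, rather than *being* one."""
    def __init__(self, name, image):
        self.name = name
        self.container = Container(name=name, image=image)
```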
> The new design looks awesome and I really like it. It aligns with our vision of DSL and provides an elegant way to orchestrate arbitrary k8s resources. Let's continue our discussion on the design doc.
I think our visions for the DSL are aligned, it's just a matter of finding the most expressive/succinct way of describing our end goal. Your comments are very helpful in this regard. Looking forward to your comments on the design doc.
```
storage_class: The storage class to use for the dynamically created
    PVC (requires size or data_source)
data_source: The name of an existing VolumeSnapshot or a
    PipelineVolumeSnapshot object
```
Is it possible to make PipelineVolumeSnapshot derive from VolumeSnapshot? Why do you need to differentiate them?
+1. Is it possible to not have a concept of volume that is specific to Pipeline?
> Is it possible to make PipelineVolumeSnapshot derive from VolumeSnapshot? Why do you need to differentiate them?
>
> +1. Is it possible to not have a concept of volume that is specific to Pipeline?
I will be amending the design doc. Yes, we can have a volume be a descendant of a generic K8s volume resource, V1Volume(). Similarly for the snapshot. The thing this approach does not cover is the ability to deduce dependencies between steps. I will continue in the design doc on #801.
```
size: The size of the PVC which will be created
storage_class: The storage class to use for the dynamically created
    PVC (requires size or data_source)
data_source: The name of an existing VolumeSnapshot or a
```
Could you add that this is an alpha feature and requires being enabled in the k8s cluster?
Yup! Will add a comment in the final PR.
samples/volumes/example5.py
Outdated
description="The fifth example of the design doc." | ||
) | ||
def example5(rok_url): | ||
vol1 = dsl.PipelineVolume( |
This class mingles the volume spec and the volume-creating operation together, which is not ideal to me. Can we make them separate classes? For example:

- PipelineVolume serves as a data object which contains the volume spec.
- PersistentVolumeClaimOp is similar to ContainerOp and serves to create a PVC in the pipeline.
- PipelineVolume can be an output of PersistentVolumeClaimOp.

Later, we might refactor the code so that PipelineVolume derives from the k8s volume spec, and PersistentVolumeClaimOp derives from K8sResourceOp, which can be used to create general k8s resources.

WDYT?
> This class mingles the volume spec and the volume-creating operation together, which is not ideal to me. Can we make them separate classes? For example:
> - PipelineVolume serves as a data object which contains the volume spec.
> - PersistentVolumeClaimOp is similar to ContainerOp and serves to create a PVC in the pipeline.
> - PipelineVolume can be an output of PersistentVolumeClaimOp.
Yes! This was a great comment.
We went through this with @elikatsis and tried to iterate on this approach.
We have verified with actual code that this indeed improves the design even further.
Essentially what you are saying is:
Let's separate the data object, which refers to the data being passed from step to step, from the operation, the DAG step that creates the object.
This will work. First have a generic way to create a K8s resource, similar to ContainerOp, then inherit from it so we can have an Op that creates PVCs, then have this Op produce a data object that is the PipelineVolume, the volume itself, ready to be consumed by next steps.

> Later, we might refactor the code so that PipelineVolume derives from the k8s volume spec, and PersistentVolumeClaimOp derives from K8sResourceOp, which can be used to create general k8s resources.
> WDYT?

This was a great idea. I'll be amending the design doc, and we have verified that this works with real code.
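Under that split, user code would look roughly like this (a sketch using the `VolumeOp`/`.volume` naming that the rest of this thread converges on, and the `volumes=` argument this PR proposes for `ContainerOp`):

```python
import kfp.dsl as dsl


@dsl.pipeline(name="volume-split-example")
def volume_split_example():
    # The op: a DAG step that creates the PVC.
    vop = dsl.VolumeOp(name="create-volume",
                       resource_name="my-pvc",
                       size="1Gi")

    # The data object: the volume emitted by the op, consumed by a later step.
    dsl.ContainerOp(name="producer",
                    image="library/bash:4.4.23",
                    command=["sh", "-c"],
                    arguments=["echo hello > /data/file1"],
                    volumes={"/data": vop.volume})
```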
samples/volumes/example5.py
Outdated
command=["sh", "-c"], | ||
arguments=["cat /data/file*| gzip -c >/data/full.gz"], | ||
volumes={"/data": vol1} | ||
) |
If we make vol1 an output of a PvcOp, we can embed the output PipelineParam of the PvcOp in the volume spec. In the ContainerOp, we can parse the volumes for input params the same way we do for args and command. In this way, PipelineVolume can serve as a pure data object and we can rely solely on PipelineParam to manage data passing and dependencies.
> If we make vol1 an output of a PvcOp, we can embed the output PipelineParam of the PvcOp in the volume spec. In the ContainerOp, we can parse the volumes for input params the same way we do for args and command. In this way, PipelineVolume can serve as a pure data object and we can rely solely on PipelineParam to manage data passing and dependencies.
Yes, this works, but it has two disadvantages; let's continue the discussion in #801:

- It requires modifications to Argo to allow container templates to consume volumes that contain input parameters that must be evaluated before each container task in the DAG.
- It cannot capture the case where multiple container tasks use the same volume and we need to depend on all of them finishing before another task can consume [mount] the volume.
samples/volumes/example5.py
Outdated
image="library/bash:4.4.23", | ||
command=["sh", "-c"], | ||
arguments=["cat /data/file*| gzip -c >/data/full.gz"], | ||
volumes={"/data": vol1} |
It makes sense to me to support volumes in ContainerOp's constructor, but it's better to make it support a generic k8s volume spec here. PipelineVolume still has value: it hides the complexity of the k8s SDK code, and it may keep additional data beyond the k8s Volume, like subPath, to facilitate mounting input and output folders from a single volume.
> It makes sense to me to support volumes in ContainerOp's constructor, but it's better to make it support a generic k8s volume spec here. PipelineVolume still has value: it hides the complexity of the k8s SDK code, and it may keep additional data beyond the k8s Volume, like subPath, to facilitate mounting input and output folders from a single volume.
ACK. We must simplify the way ContainerOp instances can consume arbitrary V1Volume instances. If we make PipelineVolume a descendant of V1Volume, then things will work automagically, whether we create k8s V1Volume instances manually or we depend on a previous task in the pipeline emitting a PipelineVolume instance.
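For example, with `PipelineVolume` a descendant of `V1Volume`, both a hand-built `V1Volume` and a volume emitted by a previous task could go into the same dict (a sketch; the `volumes=` argument is the one this PR adds to `ContainerOp`):

```python
import kfp.dsl as dsl
from kubernetes import client as k8s_client

# A plain k8s volume, built manually from an existing PVC.
manual_vol = k8s_client.V1Volume(
    name="existing-data",
    persistent_volume_claim=k8s_client.V1PersistentVolumeClaimVolumeSource(
        claim_name="existing-pvc"))

vop = dsl.VolumeOp(name="create-volume", resource_name="new-pvc", size="1Gi")

dsl.ContainerOp(name="consumer",
                image="library/bash:4.4.23",
                command=["sh", "-c"],
                arguments=["ls /manual /pipeline"],
                # Both values are V1Volume instances; only the second one
                # carries a dependency on the VolumeOp that creates it.
                volumes={"/manual": manual_vol, "/pipeline": vop.volume})
```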
samples/volumes/example2.py
Outdated
image="library/bash:4.4.23", | ||
command=["sh", "-c"], | ||
arguments=["cat /mnt/file1 /mnt/file2"], | ||
volumes={"/mnt": vol_common.after(step1, step2)} |
I am not sure this code is really needed. Users can achieve the same effect with step3.after(step1, step2).
+1
> I am not sure this code is really needed. Users can achieve the same effect with step3.after(step1, step2).
This is the only comment we don't really agree with.
Apart from being a bit unfortunately named -- I'll open a separate issue for this -- I understand step3.after() is meant to be a last resort: a way to declare dependencies when there is no other way for the compiler to deduce them.
But what is great about using PipelineParam instances is that the compiler can detect dependencies based on how we use them, i.e., when we format strings based on them.
Similarly, this PR aims to make volumes a first-class way of passing GBs of objects between steps, thus creating dependencies implicitly.
PipelineParam instances are the only way to deduce dependencies now; we will soon have full artifacts [and we are happy to contribute to their implementation!], and volumes should likewise be such a way. So, yes, a PipelineVolume is a V1Volume, agreed, but it also brings in dependencies when used.
Let's talk more about this after I amend #801.
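A self-contained sketch of the volume-driven style under discussion (using this PR's `volumes=` argument and the `after()` expression from the sample above):

```python
import kfp.dsl as dsl


@dsl.pipeline(name="dependency-styles")
def dependency_styles():
    vop = dsl.VolumeOp(name="create-volume", resource_name="shared", size="1Gi")

    step1 = dsl.ContainerOp(name="step1", image="library/bash:4.4.23",
                            command=["sh", "-c"],
                            arguments=["echo 1 > /mnt/file1"],
                            volumes={"/mnt": vop.volume})
    step2 = dsl.ContainerOp(name="step2", image="library/bash:4.4.23",
                            command=["sh", "-c"],
                            arguments=["echo 2 > /mnt/file2"],
                            volumes={"/mnt": vop.volume})

    # Volume-driven dependency: step3 runs after both writers because the
    # volume it mounts carries those dependencies -- no explicit after() on
    # the op, no dummy PipelineParam.
    dsl.ContainerOp(name="step3", image="library/bash:4.4.23",
                    command=["sh", "-c"],
                    arguments=["cat /mnt/file1 /mnt/file2"],
                    volumes={"/mnt": vop.volume.after(step1, step2)})
```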
samples/volumes/example5.py
Outdated
volumes={"/data": vol2} | ||
) | ||
|
||
step2_snap = dsl.PipelineVolumeSnapshot( |
Since this feature is still in Alpha, I am worried that not every user can use it. Can we not specialize it in the DSL for now, just use the k8s spec to represent it, and provide a K8sResourceOp in the DSL to support creating it?
> Since this feature is still in Alpha, I am worried that not every user can use it. Can we not specialize it in the DSL for now, just use the k8s spec to represent it, and provide a K8sResourceOp in the DSL to support creating it?
We agree with creating a special Op to create resources.
But using the k8s resource manually does not work; it leads to lots of boilerplate.
Using a simple type in the DSL that inherits from this resource simplifies the pipelines significantly.
This is also what has gathered very positive end-user feedback, both in the design doc issue #801 and in discussions we've had with potential users in the wild. So I think it is best to keep it. We will add a warning that this will not work unless supported by the cluster.
```diff
@@ -1,5 +1,5 @@
 /*
- * Copyright 2018 Google LLC
+ * Copyright 2018-2019 Google LLC
```
One more overall comment: Is it possible to implement something that would work even if a pipeline step is used multiple times? In this case, one execution should not overwrite the data written by another.
For example, Ning is planning to add loop support: https://docs.google.com/document/d/12KHoEGe3o-i2WyzaU2JPXp3GQL3_BavGuh-KYBMUdQ8/edit?ts=5c5b2627#heading=h.bhvs46afvzxg
> One more overall comment: Is it possible to implement something that would work even if a pipeline step is used multiple times? In this case, one execution should not overwrite the data written by another.
> For example, Ning is planning to add loop support: https://docs.google.com/document/d/12KHoEGe3o-i2WyzaU2JPXp3GQL3_BavGuh-KYBMUdQ8/edit?ts=5c5b2627#heading=h.bhvs46afvzxg
I will amend the design doc in #801 so we use PipelineParam instances in the K8s resource specs referred to by PipelineVolume and PipelineVolumeSnapshot instances. This should make them usable in loops.
We don't have access to an implementation of loops yet, so we cannot verify our code, but we are confident this will work, based on our use of PipelineParam instances. Let's talk more about it in #801 if you feel there is something that could cause problems with your implementation.
Force-pushed from 5fef914 to 63ce0c1
Hello!
Thanks a lot for your contributions @vkoukis. Here are a couple of initial comments.
Since this is such a big change, I am asking @rileyjbauer to review the frontend part, and @hongye-sun to approve the DSL part (@hongye-sun will be back next week).
```python
step2_snap = dsl.VolumeSnapshotOp(
    name="create_snapshot_2",
    resource_name="snap2",
    volume=step2.volume
```
Does this sample focus on having one volume per output? Could we cover the case where the same volume is reused throughout the workflow, but a different directory within the volume is used for each output?
> Does this sample focus on having one volume per output? Could we cover the case where the same volume is reused throughout the workflow, but a different directory within the volume is used for each output?
We have modified the resourceops/volume_snapshotop_sequential.py sample, which covers exactly your desired case: one volume, used by all steps, each step storing its output in a different directory.
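Condensed, the shape of that sample is roughly the following (a sketch, not the sample verbatim; `step1.volume` follows the `stepN.volume` usage seen elsewhere in this thread):

```python
import kfp.dsl as dsl


@dsl.pipeline(name="volume-snapshotop-sequential")
def snapshot_sequential():
    vop = dsl.VolumeOp(name="create-volume", resource_name="vol1", size="1Gi")

    step1 = dsl.ContainerOp(name="step1-ingest", image="library/bash:4.4.23",
                            command=["sh", "-c"],
                            arguments=["mkdir /data/step1 && "
                                       "echo hello > /data/step1/file1"],
                            volumes={"/data": vop.volume})

    # Same volume, different directory per step; step1.volume carries the
    # dependency on step1.
    dsl.ContainerOp(name="step2-copy", image="library/bash:4.4.23",
                    command=["sh", "-c"],
                    arguments=["mkdir /data/step2 && "
                               "cp /data/step1/file1 /data/step2/file1"],
                    volumes={"/data": step1.volume})
```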
samples/resourceops/volumeop_dag.py
Outdated
image="library/bash:4.4.23", | ||
command=["sh", "-c"], | ||
arguments=["cat /mnt/file1 /mnt/file2"], | ||
volumes={"/mnt": vop.volume.after(step1, step2)} |
This "vop.volume.after" statement should not be needed if we use Argo's parameter passing functionality. Is there a way we could avoid exposing this feature and focus instead on using Argo's parameter passing functionality?
This "vop.volume.after" statement should not be needed if we use Argo's parameter passing functionality. Is there a way we could avoid exposing this feature and focus instead on using Argo's parameter passing functionality?
We already use Argo's parameter passing functionality for resource names in general, and volume and snapshot names in particular.
We already use the existing mechanism the compiler has for deducing dependencies based on PipelineParams: because the names of resources created by ResourceOps are output PipelineParams, any ContainerOp that needs to mount a volume will already take a dependency on the ResourceOp that is supposed to create it.
This mechanism, however, does not suffice. The problem is how to explain to the compiler that three ContainerOps have to use the same volume sequentially, because one step needs to access files that the previous step must have placed in the volume. In this case, there could be no PipelineParam being passed from step to step.
This is our rationale: a dsl.PipelineVolume is a K8s Volume that brings dependencies, exactly as a PipelineParam is a string that brings dependencies. Similarly, vop.volume.after(step1, step2, ...) is not a statement; it is an expression that returns a new dsl.PipelineVolume that has taken a dependency on step1 and step2.
KFP only supports PipelineParams for deducing dependencies now, but in a while we will also have artifacts. We think the platform should afford volumes the same first-class-citizen treatment.
This avoids cumbersome workarounds using dummy PipelineParams like the following:
```python
vop = dsl.VolumeOp(...)
step1 = dsl.ContainerOp(
    name="step_1",
    image="some_image_1",
    command=["sh", "-c"],
    arguments=["echo 1 > /mnt/file1 && "
               "some_processing_that_writes_to_the_volume"],
    file_outputs={"dummy": "/mnt/file1"},
    volumes={"/mnt": vop.volume}
)
step2 = dsl.ContainerOp(
    name="step_2",
    image="some_image_2",
    command=["sh", "-c"],
    arguments=["echo %s && some_processing_that_reads_from_volume"
               % step1.output],
    volumes={"/mnt": vop.volume}
)
```
In that case, these Argo parameters (KFP's PipelineParams) would have to be dummy data.
That, however, slows down the pipeline's execution, because Argo's executor has to retrieve and save these parameters.
It also requires the user to write additional, unneeded code, just for that parameter passing. I.e.:

- The user has to create a file, let's say /tmp/dummy
- Point to it in file_outputs
- Use it in another step's command/arguments (e.g. "echo %s" % previous_step.output; this will also be logged, but it will have no value)

All of the above led us to provide such a feature: driving dependencies from PipelineVolumes (wrappers of V1Volume).
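For contrast, a sketch of the same two steps with volume-driven dependencies and no dummy parameter (mirroring the snippet above; `step1.volume` follows this thread's usage):

```python
vop = dsl.VolumeOp(...)
step1 = dsl.ContainerOp(
    name="step_1",
    image="some_image_1",
    command=["sh", "-c"],
    arguments=["echo 1 > /mnt/file1 && "
               "some_processing_that_writes_to_the_volume"],
    volumes={"/mnt": vop.volume}
)
step2 = dsl.ContainerOp(
    name="step_2",
    image="some_image_2",
    command=["sh", "-c"],
    arguments=["some_processing_that_reads_from_volume"],
    # step1.volume is the same volume, now carrying a dependency on step_1;
    # no file_outputs, no dummy output to echo.
    volumes={"/mnt": step1.volume}
)
```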
Thank you @vkoukis. I am a bit hesitant about having multiple ways to specify dependencies between pipeline steps depending on how the data is stored (volume or not).
Would it be possible to only define dependencies between steps in one way? What about a way to enforce that step_2 should execute after step_1 that does not involve volumes?
Actually, there are already two ways of specifying dependencies between pipeline steps: one is using the after() method explicitly, while the other is to use a PipelineParam in any field of an Op.
Similarly, I assume that when KFP has support for artifacts, the compiler will derive dependencies based on their usage.
This is what makes PipelineParams powerful. I fail to see why the platform cannot afford volumes the same treatment. By not doing so, we are essentially limiting the user experience severely: if a user works with PipelineParams, they can write nice, succinct code; the proposed changes carry this functionality over so it also works when using volumes for data exchange. We have shown in the examples that the same kind of code can be written with volumes being the medium for passing information from step to step.
Essentially, PipelineVolumes should be of the same object "family" in the DSL: objects that the compiler uses to deduce dependencies.
name="step2_gunzip", | ||
image="library/bash:4.4.23", | ||
command=["gunzip", "-k", "/data/full.gz"], | ||
volumes={"/data": vop2.volume} |
It seems that these samples won't work with loops/recursion or whenever a step is reused multiple times. The same volume will be reused and the data overwritten.
Here are some examples/documentations for loop/recursion on which to validate the volume implementation:
https://www.kubeflow.org/docs/pipelines/sdk/dsl-recursion/
https://github.com/kubeflow/pipelines/blob/master/samples/basic/recursion.py
https://github.com/kubeflow/pipelines/blob/master/sdk/python/tests/compiler/testdata/recursive_while.py
> It seems that these samples won't work with loops/recursion or whenever a step is reused multiple times. The same volume will be reused and the data overwritten.

Actually, what is going to happen depends on the user. If the user wishes to use a single volume throughout the loop, for all of its iterations, then they should create it before the loop. Similarly, if they have explicitly placed a ResourceOp inside a loop, then they expect multiple resources to be created, one for each iteration, so they need to set the name of the resource accordingly, e.g., based on a PipelineParam value. This will ensure that each loop iteration creates a differently-named resource. Our mechanism supports all of this; it's up to the user to decide their policy.
Here is another example: if the resource has meaning specific to an iteration, e.g. for debugging, then the user should specify a PipelineParam as part of the resource_name argument.
Similarly, the user can choose what volume to mount on a container based on the value of flip_result in your recursive_while.py example.
Another way is for the user to provide a k8s_resource directly and fill in metadata.generateName, to allow K8s to determine the final name. No matter what they choose, they will be able to use the final name as an output PipelineParam of this ResourceOp, as in the sketch below.
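A minimal sketch of the per-iteration naming described above (assuming `resource_name` accepts strings built from `PipelineParam`s, as argued in this thread):

```python
import kfp.dsl as dsl


@dsl.pipeline(name="per-iteration-volume")
def per_iteration_volume(iteration):
    # resource_name embeds a PipelineParam, so each loop iteration creates a
    # differently-named PVC instead of reusing (and overwriting) a shared one.
    vop = dsl.VolumeOp(name="create-volume",
                       resource_name="vol-%s" % iteration,
                       size="1Gi")

    dsl.ContainerOp(name="worker", image="library/bash:4.4.23",
                    command=["sh", "-c"],
                    arguments=["echo %s > /data/out" % iteration],
                    volumes={"/data": vop.volume})
```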
Thank you @vkoukis. The current thinking was that we store data in a directory as follows:
/"subdirectory"/"argo-workflow-UUID"/"POD-UUID"/...
This avoids collision if the same step is executed multiple times (loop, recursion, reuse of a template).
To make it easy to get the workflow UID and the pod name, we use special Argo strings that get substituted by Argo as the workflow executes: {{workflow.uid}} and {{pod.name}}.
https://github.com/argoproj/argo/blob/master/docs/variables.md
Would it be possible for the DSL to automatically use these strings substituted by Argo to avoid the possibility of a collision?
I'm not sure I follow here.
Accessing the volume happens at the filesystem layer.
The task that runs inside the ContainerOp can [and should!] be able to write anywhere in the volume.
Exactly how to keep the results of different executions separate is actually user-defined policy.
Some users may be creating a new volume for each run; this is actually what we suggest our users do, because it is super easy and efficient to create new volumes/clones. Others may choose to work under a specific subdirectory, in which case they can always refer to {{workflow.uid}} when accessing the volume. A subPath mount may come in useful in this case (see the sketch below).
Essentially, where the data lives in the volume is user-defined. And they can always use PipelineParams to pass extra information, e.g., on the specific directories where they have placed results, from step to step.
This is also important so we can seed a pipeline from notebook data. It doesn't make sense to impose a specific /{{workflow.uid}} prefix for every single access, since the data may also have been created outside a pipeline context, from within a notebook.
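For users who do want per-run subdirectories, a sketch of that second policy (Argo substitutes `{{workflow.uid}}` in the arguments at runtime; everything else is illustrative):

```python
import kfp.dsl as dsl


@dsl.pipeline(name="per-run-subdirectory")
def per_run_subdirectory():
    vop = dsl.VolumeOp(name="create-volume", resource_name="shared", size="1Gi")

    # All runs share one volume; each run writes under its own workflow UID.
    dsl.ContainerOp(name="writer", image="library/bash:4.4.23",
                    command=["sh", "-c"],
                    arguments=["mkdir -p /data/{{workflow.uid}} && "
                               "echo result > /data/{{workflow.uid}}/out"],
                    volumes={"/data": vop.volume})
```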
(Just reviewing the changes under /frontend)
)
This is really great! Thank you for being so thorough!
```diff
 const child = templates.get(task.template);
 if (child) {
   if (child.nodeType === 'dag') {
     buildDag(graph, task.template, templates, alreadyVisited, nodeId);
-  } else if (child.nodeType === 'container' ) {
+  } else if (child.nodeType === 'container' || child.nodeType === 'resource') {
```
nit: remove extra space at beginning of line
Hi @rileyjbauer, thank you for taking the time to review the PR!
Sorry for that, I fixed it 🙂
```ts
  nodeColor = 'cornsilk';
} else if (child && child.nodeType === 'resource') {
  const isSnapshotIf = /apiVersion: snapshot.storage.k8s.io/;
  const isPVCIf = /apiVersion: v1\nkind: PersistentVolumeClaim/;
```
(most likely moot due to the above comment)

Is there any danger in changing these two regexes to the following?

```ts
const isSnapshotIf = /kind: VolumeSnapshot/;
const isPVCIf = /kind: PersistentVolumeClaim/;
```

If they do need to remain as they are, and assuming the manifest will be like the example provided in StaticGraphParserTest, this one should be:

```ts
const isPVCIf = /apiVersion: v1\\nkind: PersistentVolumeClaim/;
```
Actually, /kind: <kind>/ may also appear in other fields of a manifest.
E.g., a PVC created from a VolumeSnapshot has a field which contains /kind: VolumeSnapshot/, and vice versa.
So we might need such a regex when we refactor the UI, as you mentioned in another comment.
For now, I have removed them.
Ah, I see! Thanks for explaining
frontend/src/lib/WorkflowParser.ts
Outdated
```ts
let nodeColor;
const template = templates.get(node.templateName);
if (template && template.resource) {
  const isSnapshotIf = /apiVersion: snapshot.storage.k8s.io/;
```
(Also likely moot)

Same comment as in StaticWorkflowParser: any problem with changing these to just use kind?

```ts
const isSnapshotIf = /kind: VolumeSnapshot/;
const isPVCIf = /kind: PersistentVolumeClaim/;
```
See here.
frontend/src/lib/WorkflowParser.ts
Outdated
```ts
}

const node = workflow.status.nodes[nodeId];
let tmpl;
```
This for-loop can be replaced with:

```ts
tmpl = workflow.spec.templates.find(t => !!t && !!t.name && t.name === node.templateName);
```

Same comment below in getNodeManifest.
Nice one! Thank you!
frontend/src/lib/WorkflowParser.ts
Outdated
```ts
// Makes sure the workflow object contains the node and returns its
// volume mounts if any.
public static getNodeVolumeMounts(workflow: Workflow, nodeId: string): string[][] {
  type volumeInfo = string[][];
```
nit: This type isn't really needed. The same is true for the paramList above, but I thought it would just make the line

```ts
const inputsOutputs: [paramList, paramList] = [[], []];
```

a little nicer compared to

```ts
const inputsOutputs: [string[][], string[][]] = [[], []];
```

though I am open to disagreement about that, and to removing the paramList type as well. (The same goes for the manifestInfo type below.)
You are right; I had it for uniformity, but I have simplified it.
```ts
let nodeColor;
if (task.when) {
  nodeColor = 'cornsilk';
} else if (child && child.nodeType === 'resource') {
```
Sorry to ask, but do you mind reverting the changes around coloring in the StaticGraphParser and WorkflowParser?
I appreciate the work here, but the topic of colors in these graphs is still a bit contentious (we may end up undoing the conditional/cornsilk coloring as well), and in the runtime graph especially this can lead to confusing messaging in extreme cases (screenshot omitted).
I still think having some sort of visual cue that these nodes interact with the volumes would be beneficial, perhaps some icons + tooltips? But that discussion can/should happen after this PR gets merged, and you certainly needn't be the ones to do that later work.
I removed all the coloring.
We would be happy to join the future discussion on the UI enrichment, and of course contribute to it!
Thanks for the screenshots!
Here is the PR to Arrikto's branch arrikto#2. It just moves some code back where it was so that the diff is ~1200+ lines smaller. @elikatsis or @vkoukis only need one click on GitHub to merge it.
We'd like to merge this PR by the end of tomorrow. I want to mention that I'm not blocking this PR. I'd like arrikto#2 to be incorporated, but we'll merge this PR tomorrow regardless.
This way the diff is much smaller and more understandable. We can always split or refactor the file later. Refactorings should not be mixed with genuine changes.
I see that all the tests have succeeded. @Ark-kun, thank you once again for your proposal. It is totally reasonable and I merged it.
/lgtm
[APPROVAL NOTIFIER] This PR is APPROVED. This pull-request has been approved by: vicaire. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Thanks a lot for your contribution @elikatsis and @vkoukis, and apologies for the delays.
Thank you, @vicaire, @hongye-sun, @rileyjbauer, @Ark-kun, for the insightful comments, reviews, and help! We're really happy to see this PR being merged, and are looking forward to making follow-up PRs.
Thanks @vkoukis and @elikatsis for the high-quality PR. Really glad that it finally went through.
Hello,
This is a PR to implement all the features described in #801.
This PR:

- Introduces `PipelineVolume`s and `PipelineVolumeSnapshot`s. These are an abstraction for the user to manipulate `PVC`s and `VolumeSnapshot`s for data management purposes, also exposing dependencies according to their usage.
- Extends `ContainerOp`'s constructor to accept one more argument: `volumes`. It is a dict having filesystem paths as keys and `PipelineVolume`s or `PipelineVolumeSnapshot`s as values. We rename the pre-existing internal attribute `volumes` to `k8s_volumes`, to avoid misinterpretation.
- Has the compiler handle all objects (`ContainerOp`, `PipelineVolume`, `PipelineVolumeSnapshot`) using the existing `ops` dictionary. However, we also refer to objects separately based on their type, grouping objects for type-specific processing: `cops`, `vols`, `snaps`. Previously, `cops` was the only type of object in `ops`.
- Has the UI display `PipelineVolume` and `PipelineVolumeSnapshot` tasks differently:
  - `PipelineVolume` tasks which create new `PVC`s appear in a different color (`mistyrose`)
  - `PipelineVolumeSnapshot` tasks which create new `VolumeSnapshot`s appear in a different color (`honeydew`)
- Adds a `Volume Mounts` section in the UI which lists all mount points in a `ContainerOp` along with their mounted volumes.
- For `resource` templates, shows the full manifest which is to be submitted to the cluster.
- Adds examples in `samples/volumes/`. These examples correspond to the design of "Extend the DSL with support for Persistent Volumes and Snapshots" (#801).
- Adds tests in `sdk/python/tests/dsl/` and `sdk/python/tests/compiler`.
I only recently started learning about the frontend, so looking forward to your feedback on my tests 🙂
Related issues
Closes #801
Make sure you have applied the fixes of argoproj/argo-workflows#1238 and kubeflow/kubeflow#2556. This PR also depends on a version of Argo with the fix of argoproj/argo-workflows#1232 already merged. I have uploaded updated Argo images at `gcr.io/arrikto/argoexec:latest-elikatsis` and `gcr.io/arrikto/workflow-controller:latest-elikatsis` to help with the review process.