feat(backend): workflow validation. Fixes #3526. #3965
Conversation
Thanks for this work! BTW, the SDK performs the validation when working. The only issue I see is that we need to be able to update the backend Argo module when updating the manifests.
I am rather new to Golang (learning through open source) and don't know how to fix the issue that currently breaks the tests.
I have added the dependency to the go.mod but it doesn't seem to help ... :( Is there anyone who could give some advice on how to solve this, or point me to where I can read up on it? @IronPan, @rmgogogo, @Ark-kun Thanks for the help!
friendly ping @rmgogogo
You may ask @jingzhang36 or @IronPan.
I'm generally OK with this PR. @jingzhang36, leaving this to your LGTM.
How did you update go.mod? Was it done manually? Here are the suggested steps: https://github.com/kubeflow/pipelines/tree/master/backend#updating-build-files It's true that we haven't updated it for a long time.
@NikeNano could you add the new dependency on github.com/argoproj/argo/workflow/validate to BUILD.bazel in this directory? |
})

func setMetadata(wf *v1alpha1.Workflow) {
nit: addRuntimeWorkflowMetadata? to make tests easier to understand
Probably we can omit workflow, addRuntimeMetadata? WDYT?
We can set it directly on the test workflow and drop the method to clean up.
Hi @NikeNano, note the current change is against the goal we want to test. In real world, the workflows passed from KFP compiler do not have these metadata annotations. Therefore, always adding them beforehand makes tests deviate from the real world.
I understand we all want tests to be simpler, so they can be more readable. I think a better way to do this is to stop comparing all fields of a workflow in tests. Instead, we can reset fields we do not care about in some tests to empty before comparing with expected workflow. e.g. for this case, we can simply set annotations and labels to empty, and do not include them in the expected workflow.
Note that, for tests that specifically verify labels & annotations are correct, we still keep them to test what we want.
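A sketch of that reset-before-compare idea (hypothetical names; assuming the testify assert package these tests already rely on):

	// Blank out fields this test does not care about before comparing,
	// so the expected workflow does not need to carry them.
	got := runDetail.Workflow.DeepCopy()
	got.Annotations = nil
	got.Labels = nil
	assert.Equal(t, expectedWorkflow, got)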
Therefore, always adding them beforehand makes tests deviate from the real world.
Yes, good point; I will have to find another solution.
I see your point. The tricky part here is that the WorkflowRuntimeManifest is handled as a string, which means we would need to unmarshal it to a *v1alpha1.Workflow again, remove the metadata, convert it back to a string, and then set it on the runDetails. I think this adds more complexity than just adding it for now.
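For reference, a minimal sketch of that round trip (assuming the manifest is JSON-encoded and using the Argo v1alpha1 types; the helper name is made up):

	import (
		"encoding/json"

		"github.com/argoproj/argo/pkg/apis/workflow/v1alpha1"
	)

	// stripRuntimeMetadata unmarshals a manifest string, clears the
	// runtime metadata, and marshals it back to a string.
	func stripRuntimeMetadata(manifest string) (string, error) {
		var wf v1alpha1.Workflow
		if err := json.Unmarshal([]byte(manifest), &wf); err != nil {
			return "", err
		}
		wf.Annotations = nil
		wf.Labels = nil
		out, err := json.Marshal(&wf)
		if err != nil {
			return "", err
		}
		return string(out), nil
	}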
@@ -208,6 +219,19 @@ func createPipeline(name string) *model.Pipeline {
}}
}

func createRunExpectedWorkflow(params []v1alpha1.Parameter, labels map[string]string, annotations map[string]string, serviceAccount string) *v1alpha1.Workflow {
Do you think this is better than the original addMetadata version? I feel like that's simpler, because it sticks to the workflow struct in its interface.
I only had concerns regarding naming. The current implementation is good in a few minor points too, so overall I think the following are good:
- The method returns a new workflow via deepcopy (so that the method does not mutate its inputs)
- The method name makes tests readable -- we can tell it adds some metadata that is present on runtime workflows.
WDYT?
I'm sorry you had to change the entire test file multiple times. I think we can save some intermediate effort by sketching ideas and discussing them in comments, and then finishing the code once we agree with each other.
I rewrote it to use createRunExpectedWorkflow since all the tests related to CreateRun require the same setup, where the metadata is only a subset. I don't have strong feelings for either solution, but maybe bringing back addMetadata is easiest, and we can then take a separate look at the tests to improve their readability?
The problem with the current createRunExpectedWorkflow interface is that it has a fixed number of arguments. Whenever any test calls it, it has to supply all of them, and if we need to add a new argument in the future, we will have to update every test that calls createRunExpectedWorkflow.
Also, it creates another layer of abstraction over directly creating workflow structs, but the new abstraction doesn't provide much value; it only adds metadata by default.
Compared to this, the original addMetadata is a lot simpler. I was only asking for a more descriptive name.
Therefore, my proposal would be:
func withCreateRunMetadata(workflow *v1alpha1.Workflow) *v1alpha1.Workflow {
	// Copy first so the method does not mutate its input.
	workflow = workflow.DeepCopy()
	// ... add the runtime metadata here ...
	return workflow
}
(you can adjust the name if you have better ideas)
@NikeNano see above
Cool, I see your point that it is a simpler interface; I will update.
Force-pushed from 5c89a12 to 3210e7f
Thank you for the update!
/lgtm
There's only one minor comment left.
@@ -208,6 +219,14 @@ func createPipeline(name string) *model.Pipeline {
}}
}

func withCreateRunMetadata(wf *v1alpha1.Workflow) *v1alpha1.Workflow {
	template := wf.Spec.Templates[0]
nit: deepCopy the workflow before changing it? You are currently editing the input wf. Another option is to not add a return value, making it explicit that this method changes its input. It's up to you to decide.
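To make the two options concrete (a hedged sketch; bodies elided):

	// Option A: copy and return, leaving the caller's workflow untouched.
	func withCreateRunMetadata(wf *v1alpha1.Workflow) *v1alpha1.Workflow {
		wf = wf.DeepCopy()
		// ... set the runtime labels/annotations on the copy ...
		return wf
	}

	// Option B: mutate in place with no return value, so the side
	// effect on the input is explicit at the call site.
	func addRuntimeMetadata(wf *v1alpha1.Workflow) {
		// ... set the runtime labels/annotations directly on wf ...
	}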
I went back to addRuntimeMetadata and added the metadata to the input workflow; the circle is complete :)
Force-pushed from 383e14e to f246e9f
	template.Metadata.Annotations = map[string]string{"sidecar.istio.io/inject": "false"}
	template.Metadata.Labels = map[string]string{"pipelines.kubeflow.org/cache_enabled": "true"}
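Pieced together from the diff fragments above, the in-place helper would look roughly like this (note that Spec.Templates holds values, so the modified copy has to be written back; that write-back is my assumption and is not shown in the diff):

	func addRuntimeMetadata(wf *v1alpha1.Workflow) {
		template := wf.Spec.Templates[0]
		template.Metadata.Annotations = map[string]string{"sidecar.istio.io/inject": "false"}
		template.Metadata.Labels = map[string]string{"pipelines.kubeflow.org/cache_enabled": "true"}
		wf.Spec.Templates[0] = template
	}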
Asking before I change too much again: in order to reuse addRuntimeMetadata, it needs to live outside of the test file (or be duplicated). What is best practice in this case? Should we just add a test_util.go in order to make it exported?
Agree, we can add a test_util.go
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: Bobgy. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Currently, pipelines are not validated until they are run, which means it takes longer to test and upload pipelines. This PR (currently a draft) aims to fix this and resolves #3526.
The goal is to validate pipelines when they are uploaded and return any errors immediately.
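A minimal sketch of that upload-time check (the structural test below is a stand-in for the real call into github.com/argoproj/argo/workflow/validate, whose exact signature varies across Argo versions):

	import (
		"fmt"

		"github.com/argoproj/argo/pkg/apis/workflow/v1alpha1"
		"sigs.k8s.io/yaml"
	)

	// validateWorkflowManifest parses an uploaded pipeline manifest and
	// fails fast on an obviously malformed workflow, instead of waiting
	// until the pipeline is run.
	func validateWorkflowManifest(manifest []byte) (*v1alpha1.Workflow, error) {
		var wf v1alpha1.Workflow
		if err := yaml.Unmarshal(manifest, &wf); err != nil {
			return nil, fmt.Errorf("invalid workflow manifest: %v", err)
		}
		// Stand-in structural check; the PR delegates the real
		// validation to Argo's workflow/validate package.
		if len(wf.Spec.Templates) == 0 {
			return nil, fmt.Errorf("workflow %q has no templates", wf.Name)
		}
		return &wf, nil
	}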