Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pipeline version to job/run integration test so that job/run is c… #3270

Merged
merged 5 commits into from
Mar 16, 2020

Conversation

jingzhang36
Copy link
Contributor

@jingzhang36 jingzhang36 commented Mar 13, 2020

…reated from pipeline version

Related issue: #3247


This change is Reviewable

@jingzhang36
Copy link
Contributor Author

/assign @Bobgy
/cc @rmgogogo

@jingzhang36
Copy link
Contributor Author

/test kubeflow-pipeline-e2e-tes

Copy link
Contributor

@Bobgy Bobgy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small comment, otherwise looks good to me

ResourceReferences: []*job_model.APIResourceReference{
{Key: &job_model.APIResourceKey{Type: job_model.APIResourceTypeEXPERIMENT, ID: helloWorldExperiment.ID},
Relationship: job_model.APIRelationshipOWNER},
{Key: &job_model.APIResourceKey{Type: job_model.APIResourceTypePIPELINEVERSION, ID: helloWorldPipelineVersion.ID},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussion:
Shall we keep existing tests and add a new one testing using pipeline version?
So that we can keep test coverage for using pipelines in Job.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, we are going to deprecate the usage of pipeline id to create run/job. And all runs and jobs must be from some version. The to-be-deprecated approach to creating from pipeline is actually using its default version as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.
I'm curious why we are deprecating that usage, I thought we are going to be backward-compatible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a little bit of history. Once upon a time, we are trying to follow the practice of CMLE api. In that approach, a pipeline is allowed to contain no pipeline versions at all. Then we figure it might be more flexible and more extensible if we always go from versions. No matter we allow pipeline to have empty versions or not, creating from versions works. And moreover, we always have a default version, so from the UI, user open a pipeline, its details page actually shows its default version and when run is created from there, we naturally use default version to create run. And blahblah... and also, if we really really really just want pipeline, it should be specified in resource reference field instead of the pipeline spec field, since that resource reference field is meant and added to have a uniform way to specify dependencies between our objects/resources...

So in short, specifying pipeline in the pipeline spec is going to be deprecated for actually two reasons: (1) we want creation from pipeline versions (2) we want to use resource reference field.

Copy link
Contributor

@Bobgy Bobgy Mar 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so it will be supported to specify pipeline id from resource references? (is it already supported?)
So it's still a breaking change, but we have a migration path, is that right?

Do you have a tracking list for TODO items like updating SDK to not use the spec field?
e.g. here

spec = kfp_server_api.models.ApiPipelineSpec(
pipeline_id=pipeline_id,
workflow_manifest=pipeline_json_string,
parameters=api_params)
run_body = kfp_server_api.models.ApiRun(
pipeline_spec=spec, resource_references=resource_references, name=job_name)

Also, the pipeline spec field also accepts json string of the entire pipeline manifest to run. I think that usage is required for cloning runs. Will we deprecate that?
Looks like you have migrated all UI usages of pipeline_spec with pipeline_id, SGTM.

Copy link
Contributor Author

@jingzhang36 jingzhang36 Mar 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the SDK, two cases:
Our own script can be fixed by us
User script will get warning that pipeline id in pipeline spec.

Another important reason why we prefer to refer to version instead of pipeline is that default version can change. When user specify a pipeline, we use the default version since all real manifests are conceptually linked to versions. Later the default version changes, and if GetRun only give back the pipeline id used to create the run, user wouldn't know which version was used at the creation time and they'll have to rely the manifest associated with the run at creation time. It is ok but messy.

So even when we allow the user still uses pipeline id in pipeline spec, internally we'll still record the run with a reference to the default version at the creation time. Hence implement the TODO at

// TODO(jingzhang36): after FE is enabled to use pipeline version to create

Copy link
Contributor Author

@jingzhang36 jingzhang36 Mar 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, an opportunity for us to totally remove all legacies is to introduce breaking changes (that is, surface the internal change to the sdk client interface) at KFP 1.0......

Copy link
Contributor

@Bobgy Bobgy Mar 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree on these, just want to double check if you have put the following in your plan like:

  • adding deprecation notification in sdk methods that we will no longer support (I believe we should do this as early as possible, it's not far away from KFP 1.0's final release.)
  • confirm with the team that we will have a breaking change (I'm not sure if we all agreed in last design review, at least I didn't know we will make breaking change.)

So even when we allow the user still uses pipeline id in pipeline spec, internally we'll still record the run with a reference to the default version at the creation time. Hence implement the TODO at

For this part, it's totally reasonable, but it doesn't convince me we need to stop people from using pipeline id to create runs. Your mentioned recording pipeline version id at run creation time sounds like the right fix (and is not hacky at all).

Copy link
Contributor Author

@jingzhang36 jingzhang36 Mar 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things:

(1) the breaking change is optional, we can choose not to. That's my original plan (not to make the sdk client interface change). When we keep the SDK client interface the same, we only have a different implementation underhood (use resource reference instead of pipeline id in spec).

(2) the reason I talked about breaking change is IF we want it, we can leverage KF 1.0 as a proper time to do. Alternatively, as I suggested before, we put warning message without committing any change for a period of time. And the actual commit of the breaking change can wait until 2.0 or never....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed offline: we will fix sdk to workaround the breaking change

@jingzhang36
Copy link
Contributor Author

/test kubeflow-pipeline-e2e-test

@jingzhang36
Copy link
Contributor Author

/retest

@jingzhang36
Copy link
Contributor Author

/test kubeflow-pipeline-e2e-test

@jingzhang36
Copy link
Contributor Author

/test kubeflow-pipeline-sample-test

@jingzhang36
Copy link
Contributor Author

/test kubeflow-pipeline-e2e-test

@jingzhang36
Copy link
Contributor Author

/test kubeflow-pipeline-e2e-test

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign ironpan
You can assign the PR to them by writing /assign @ironpan in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor

@Bobgy Bobgy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Name: pipelineVersionName, Relationship: job_model.APIRelationshipCREATOR},
}
assert.Len(t, job.ResourceReferences, 2)
assert.Subset(t, job.ResourceReferences, resourceReferences)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a little searching, is this a better choice: https://godoc.org/github.com/stretchr/testify/assert#Assertions.ElementsMatch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@k8s-ci-robot k8s-ci-robot removed the lgtm label Mar 16, 2020
@Bobgy
Copy link
Contributor

Bobgy commented Mar 16, 2020

/lgtm

@@ -218,6 +218,7 @@ github.com/spf13/pflag v1.0.3 h1:zPAT6CGy6wXeQ7NtTnaTerfKOsV6V6F8agHXFiazDkg=
github.com/spf13/pflag v1.0.3/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4=
github.com/spf13/viper v1.3.0 h1:cO6QlTTeK9RQDhFAbGLV5e3fHXbRpin/Gi8qfL4rdLk=
github.com/spf13/viper v1.3.0/go.mod h1:ZiWeW+zYFKm7srdB9IoDzzZXaJaI5eL9QjNiN/DMA2s=
github.com/stretchr/objx v0.1.0 h1:4G4v2dO3VZwixGIRoQ5Lfboy6nUhCyYzaqnIAPPhYs4=
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, why do you need this change? Now this PR requires root OWNERS to approve.

Copy link
Contributor Author

@jingzhang36 jingzhang36 Mar 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems getting in after using subset or elementsmatch. Not manually added. But it seems that I can merge, so I'll just leave it there.

@jingzhang36 jingzhang36 merged commit d9bb1be into kubeflow:master Mar 16, 2020
Jeffwan pushed a commit to Jeffwan/pipelines that referenced this pull request Dec 9, 2020
kubeflow#3270)

* Add pipeline version to job/run integration test so that job/run is created from pipeline version

* Use set comparison to check array equation in test

* address comments to use elements match
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants