fix(sdk): Allow non-pythonic names for graph components' task's outputs. Fixes #4514. #4515

Udiknedormin · 2020-09-18T15:17:53Z

The pythonic-to-original output name lookup now falls back to the output's own name when the mapping has no entry for it. This is the solution suggested in #4514. The PR also adds tests that check the problem is resolved.

It seems this could be cherry-picked into the current release branch.

k8s-ci-robot · 2020-09-18T15:18:05Z

Hi @Udiknedormin. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Tomcli · 2020-09-18T16:12:06Z

/ok-to-test

Tomcli · 2020-09-18T16:27:20Z

/test kubeflow-pipelines-tfx-python36

Tomcli · 2020-09-18T16:41:15Z

It looks like the TFX test failed due to an older version of protobuf.
https://stackoverflow.com/questions/61922334/how-to-solve-attributeerror-module-google-protobuf-descriptor-has-no-attribu

numerology · 2020-09-18T17:49:31Z

Sent #4516 to fix.

Sorry for the inconvenience.

Tomcli · 2020-09-18T17:57:24Z

Thanks @numerology
@Udiknedormin can you rebase the PR once @numerology merged the fix? Thanks.

Ark-kun · 2020-09-18T23:33:03Z

sdk/python/kfp/components/_components.py

+        task_outputs_with_original_names = {
+            # component_bridge generates outputs under both pythonic and original name,
+            # so half of them are absent from pythonic_output_name_to_original
+            pythonic_output_name_to_original.get(pythonic_output_name, pythonic_output_name): output_value
+            for pythonic_output_name, output_value in task_outputs_with_pythonic_names.items()
+        }


I think this can be simpler:

Suggested change

-        task_outputs_with_original_names = {
-            # component_bridge generates outputs under both pythonic and original name,
-            # so half of them are absent from pythonic_output_name_to_original
-            pythonic_output_name_to_original.get(pythonic_output_name, pythonic_output_name): output_value
-            for pythonic_output_name, output_value in task_outputs_with_pythonic_names.items()
-        }
+        task_outputs_with_original_names = {output.name: task_obj.outputs[output.name] for output in task_component_spec.outputs or []}

output_name_to_pythonic, pythonic_output_name_to_original and task_outputs_with_pythonic_names are not needed then.
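
A minimal sketch of the simplification suggested above, assuming the task object exposes every output under its original name. `OutputSpec`, `ComponentSpec`, `Task`, and `outputs_by_original_name` are hypothetical stand-ins for illustration, not the SDK's own classes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class OutputSpec:
    """Hypothetical stand-in for one output entry of a component spec."""
    name: str


@dataclass
class ComponentSpec:
    """Hypothetical stand-in; `outputs` may be None, as in the suggestion."""
    outputs: Optional[List[OutputSpec]] = None


@dataclass
class Task:
    """Hypothetical stand-in for a task object with an `outputs` map."""
    outputs: Dict[str, Any] = field(default_factory=dict)


def outputs_by_original_name(task_obj: Task, task_component_spec: ComponentSpec) -> Dict[str, Any]:
    # Iterate over the declared outputs, not over task_obj.outputs, so any
    # extra pythonic aliases on the task are simply ignored.
    return {
        output.name: task_obj.outputs[output.name]
        for output in task_component_spec.outputs or []
    }


spec = ComponentSpec(outputs=[OutputSpec("Output 1"), OutputSpec("model")])
task = Task(outputs={"Output 1": "a", "output_1": "a", "model": "b"})
result = outputs_by_original_name(task, spec)
```

Because the comprehension is driven by the spec, no name mapping in either direction is required, which is why the three helper dictionaries become unnecessary.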

@Ark-kun
Was it ever a part of the contract for _container_task_constructor? It's not backwards-compatible, but it certainly would simplify things.

I wasn't sure about it, considering that _container_task_constructor can be overridden and I only know of two overrides:

_components._create_task_spec_from_component_and_arguments, which uses the original output names, as seen in TaskSpec._init_outputs

_component_bridge._create_container_op_from_component_and_arguments, which uses both, as shown in the issue

Would the following be ok?

    from typing import Callable, Mapping, TypeVar

    KT = TypeVar('KT')  # could be imported from typing, of course
    VT = TypeVar('VT')

    def get_maybe_remap(m: Mapping[KT, VT], key: KT, key_mapper: Callable[[KT], KT]) -> VT:
        # Prefer the key as given; otherwise fall back to the remapped key.
        if key in m:
            return m[key]
        else:
            return m[key_mapper(key)]

    (...)

    task_outputs_with_original_names = {
        output.name: get_maybe_remap(task_obj.outputs, output.name, output_name_to_pythonic.__getitem__)
        for output in task_component_spec.outputs or []
    }

It would remove the need for pythonic_output_name_to_original (i.e. reverse-mapping of output_name_to_pythonic) and task_outputs_with_pythonic_names, leaving just output_name_to_pythonic "just in case", for backward-compatibility. It's also reusable.
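
To make the fallback behaviour of the proposed helper concrete, here is a small usage sketch. The sample output names and values are made up for illustration; the helper is restated only so the example runs on its own.

```python
from typing import Callable, Mapping, TypeVar

KT = TypeVar("KT")
VT = TypeVar("VT")


def get_maybe_remap(m: Mapping[KT, VT], key: KT, key_mapper: Callable[[KT], KT]) -> VT:
    # Use the key directly when present, otherwise look up the remapped key.
    if key in m:
        return m[key]
    return m[key_mapper(key)]


# Illustrative data: one output whose original name is not a valid identifier.
output_name_to_pythonic = {"Output 1": "output_1"}

# A constructor that only exposes pythonic names still works via the remap...
only_pythonic = {"output_1": "value"}
from_pythonic = get_maybe_remap(only_pythonic, "Output 1", output_name_to_pythonic.__getitem__)

# ...and one that exposes original names is read directly, without remapping.
only_original = {"Output 1": "value"}
from_original = get_maybe_remap(only_original, "Output 1", output_name_to_pythonic.__getitem__)
```

Both lookups return the same value, which is the backward-compatibility property this variant was meant to keep.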

Was it ever a part of the contract for _container_task_constructor? It's not backwards-compatible, but it certainly would simplify things.

I think this should be the contract (original output names should be the keys in the task.outputs dictionary).
I think at this point it's still OK to make this change. The contract is private and I do not think anyone else has implemented the contract yet (also we're making a significantly more drastic change to it very soon).
We can always restore the behavior if we encounter an unfixable case in the future.

@Ark-kun Ok, if it can be changed, then I'll simplify it and add the explicit contract.

Ark-kun · 2020-09-20T06:26:43Z

/retest

Loading a container component from component.yaml creates outputs under both pythonic and original names. The graph component iterated over all outputs and applied the pythonic-to-original conversion to each of them. Names that are not identical to their pythonic versions raised a KeyError on the lookup table. This commit fixes the problem by using a default value for the lookup.
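
The failure and the fix described in this commit message can be reproduced in isolation. The sketch below is an illustration under stated assumptions: `to_pythonic` is a hypothetical sanitizer standing in for whatever the SDK uses, and the output names and values are invented.

```python
import re
from typing import Any, Dict


def to_pythonic(name: str) -> str:
    """Hypothetical sanitizer: non-word runs become '_', result is lowercased."""
    return re.sub(r"\W+", "_", name).strip("_").lower()


original_names = ["Output 1", "model"]
pythonic_output_name_to_original = {to_pythonic(n): n for n in original_names}

# The task exposes every output under both its pythonic and its original name.
task_outputs: Dict[str, Any] = {}
for n in original_names:
    task_outputs[to_pythonic(n)] = f"<value of {n}>"
    task_outputs[n] = f"<value of {n}>"


def remap_strict(outputs: Dict[str, Any]) -> Dict[str, Any]:
    # Old behaviour: every key is assumed to be a pythonic name.
    return {pythonic_output_name_to_original[k]: v for k, v in outputs.items()}


def remap_with_default(outputs: Dict[str, Any]) -> Dict[str, Any]:
    # Fixed behaviour: keys missing from the mapping are kept as they are.
    return {pythonic_output_name_to_original.get(k, k): v for k, v in outputs.items()}


try:
    remap_strict(task_outputs)
    strict_failed = False
except KeyError:
    strict_failed = True  # "Output 1" is not a key of the pythonic mapping

fixed = remap_with_default(task_outputs)
```

Names such as `model`, which are already pythonic, never triggered the error, so only components with names like `Output 1` were affected.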

Udiknedormin · 2020-09-21T12:19:38Z

@Tomcli

can you rebase the PR once @numerology merged the fix?

Done.

Ark-kun · 2020-09-21T20:29:19Z

/retest

Udiknedormin · 2020-09-22T17:59:40Z

@Ark-kun I added the contract and Protocol for both the ResolvedTask and ContainerTaskConstructor. I think it should be clear enough now. As a bonus, it should give a slightly better experience to users of type checkers.

Ark-kun · 2020-09-25T07:43:26Z

I added the contract and Protocol for both the ResolvedTask and ContainerTaskConstructor. I think it should be clear enough now.

I really appreciate your Protocol proposal, but is it possible to stage this work into two PRs: first just fix the bug and add the tests, and then do the refactoring in another PR?
Protocols are interesting, but we need to get them right. For example, the Task object interface is only expected to have an outputs map. There are no other expectations at the moment.
There is also an ongoing refactoring in this area which will change the ContainerTaskConstructor protocol.
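
As an illustration of how small such an interface would be, here is a minimal sketch of a structural type that requires only an outputs map. The names `TaskWithOutputs`, `FakeTask`, and `collect_outputs` are hypothetical and are not taken from the PR; `typing.Protocol` needs Python 3.8 or later.

```python
from typing import Any, Dict, Iterable, Mapping, Protocol


class TaskWithOutputs(Protocol):
    """Hypothetical protocol: the only requirement is an `outputs` mapping."""

    @property
    def outputs(self) -> Mapping[str, Any]:
        ...


class FakeTask:
    """Any class with an `outputs` attribute satisfies the protocol structurally."""

    def __init__(self, outputs: Dict[str, Any]) -> None:
        self.outputs = outputs


def collect_outputs(task: TaskWithOutputs, names: Iterable[str]) -> Dict[str, Any]:
    # Keys are the original output names, matching the contract discussed above.
    return {name: task.outputs[name] for name in names}


collected = collect_outputs(FakeTask({"Output 1": 1, "model": 2}), ["Output 1"])
```

Keeping the protocol this narrow means a later change to the constructor contract would not need to touch it.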

Udiknedormin · 2020-09-25T13:17:49Z

@Ark-kun Fine by me. Reverted to the minimal version.

Ark-kun · 2020-09-29T03:09:38Z

/lgtm
/approve

JFYI, the refactorings that I've mentioned are #3447 and #3448. I'll keep the idea of protocols in mind. Your previous code is available here: bef604c

k8s-ci-robot · 2020-09-29T03:09:48Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ark-kun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~sdk/OWNERS~~ [Ark-kun]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…ts. Fixes #4514. (#4515)

* add tests for pythonic and non-pythonic component outputs
* fix: graph for non-pythonic container output's names

  Loading container component from component.yaml creates both pythonic and original output names. Graph component iterated over all outputs, using pythonic-to-output conversion on all. If some of the names are not identical to their pythonic versions, they raised a KeyError on the lookup table. This commit fixes this problem by using default value for the lookup.
* remove depythonification of outputs - not needed anymore


k8s-ci-robot requested review from Ark-kun and numerology September 18, 2020 15:17

google-cla bot added the cla: yes label Sep 18, 2020

k8s-ci-robot added size/M needs-ok-to-test labels Sep 18, 2020

k8s-ci-robot added ok-to-test and removed needs-ok-to-test labels Sep 18, 2020

Ark-kun reviewed Sep 18, 2020

Udiknedormin added 2 commits September 21, 2020 14:16

add tests for pythonic and non-pythonic component outputs

aa5cf30

Udiknedormin force-pushed the fix/graph_component_pythonic_outputs branch from b343cdd to 368d29b September 21, 2020 12:18

remove depythonification of outputs - not needed anymore

5d80cb5

k8s-ci-robot added size/L and removed size/M labels Sep 22, 2020

Udiknedormin force-pushed the fix/graph_component_pythonic_outputs branch 3 times, most recently from dccbb46 to bef604c September 22, 2020 17:36

Udiknedormin force-pushed the fix/graph_component_pythonic_outputs branch from bef604c to 5d80cb5 September 25, 2020 13:16

k8s-ci-robot added size/M and removed size/L labels Sep 25, 2020

k8s-ci-robot assigned Ark-kun Sep 29, 2020

k8s-ci-robot added the lgtm label Sep 29, 2020

k8s-ci-robot added the approved label Sep 29, 2020

Ark-kun added the cherrypick-approved label (area OWNER approves to cherry pick this PR to current active release branch) Sep 29, 2020

k8s-ci-robot merged commit 0b31879 into kubeflow:master Sep 29, 2020

Bobgy added the cherrypicked label (cherry picked to release branch `release-x.y`) Oct 10, 2020
