-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(sdk): Allow non-pythonic names for graph components' task's outputs. Fixes #4514. #4515
fix(sdk): Allow non-pythonic names for graph components' task's outputs. Fixes #4514. #4515
Conversation
Hi @Udiknedormin. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
/test kubeflow-pipelines-tfx-python36 |
It looks like the TFX test failed due to an older version of protobuf. |
Sent #4516 to fix. Sorry for the inconvenience. |
Thanks @numerology |
task_outputs_with_original_names = { | ||
# component_bridge generates outputs under both pythonic and original name, | ||
# so half of them are absent from pythonic_output_name_to_original | ||
pythonic_output_name_to_original.get(pythonic_output_name, pythonic_output_name): output_value | ||
for pythonic_output_name, output_value in task_outputs_with_pythonic_names.items() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be simpler:
task_outputs_with_original_names = { | |
# component_bridge generates outputs under both pythonic and original name, | |
# so half of them are absent from pythonic_output_name_to_original | |
pythonic_output_name_to_original.get(pythonic_output_name, pythonic_output_name): output_value | |
for pythonic_output_name, output_value in task_outputs_with_pythonic_names.items() | |
} | |
task_outputs_with_original_names = {output.name: task_obj.outputs[output.name] for output in task_component_spec.outputs or []} |
output_name_to_pythonic
, pythonic_output_name_to_original
and task_outputs_with_pythonic_names
are not needed then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ark-kun
Was it ever a part of the contract for _container_task_constructor
? It's not backwards-compatible, but it certainly would simplify things.
I wasn't sure about it, considering that _container_task_constructor
can be overwritten and I only know of two overwrites:
_components._create_task_spec_from_component_and_arguments
--- which uses original outputs' names, as seen inTaskSpec._init_outputs
_component_bridghe._create_container_op_from_component_and_arguments
--- which uses both, as shown in the issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would the following be ok?
KT = TypeVar('KT') # could be imported from typing, of course
VT = TypeVar('VT')
def get_maybe_remap(m: Mapping[KT, VT], key: KT,
key_mapper: Callable[[KT], KT]) -> VT:
if key in m:
return m[key]
else:
return m[key_mapper(key)]
(...)
task_outputs_with_original_names = {
output.name: get_maybe_pythonic(task_obj.outputs, output.name,
output_name_to_pythonic.__getitem__)
for output in task_component_spec.outputs or []
}
It would remove the need for pythonic_output_name_to_original
(i.e. reverse-mapping of output_name_to_pythonic
) and task_outputs_with_pythonic_names
, leaving just output_name_to_pythonic
"just in case", for backward-compatibility. It's also reusable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was it ever a part of the contract for
_container_task_constructor
? It's not backwards-compatible, but it certainly would simplify things.
I think this should be the contract (original output names should be the keys in the task.outputs
dictionary).
I think at his point it's still OK it make this change. The contract is private and I do not think anyone else has implemented the contact yet (also we're making a significantly more drastic change to it very soon).
We can always restore the behavior is we encounter unfixable case in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ark-kun Ok, if it can be changed, then I'll simplify it and add the explicit contract.
/retest |
Loading container component from component.yaml creates both pythonic and original output names. Graph component iterated over all outputs, using pythonic-to-output conversion on all. If some of the names are not identical to their pythonic versions, they rised KeyError on the lookup table. This commit fixes this problem by using default value for the lookup.
b343cdd
to
368d29b
Compare
Done. |
/retest |
dccbb46
to
bef604c
Compare
@Ark-kun I added the contract and |
I really appreciate you Protocol proposal, but is it possible to stage this work into two PRs? To first just fix the bug and add the tests and then do the refactoring in another PR. |
bef604c
to
5d80cb5
Compare
@Ark-kun Fine by me. Reverted to the minimal version. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Ark-kun The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…ts. Fixes #4514. (#4515) * add tests for pythonic and non-pythonic component outputs * fix: graph for non-pythonic container output's names Loading container component from component.yaml creates both pythonic and original output names. Graph component iterated over all outputs, using pythonic-to-output conversion on all. If some of the names are not identical to their pythonic versions, they rised KeyError on the lookup table. This commit fixes this problem by using default value for the lookup. * remove depythonification of outputs - not needed anymore
…ts. Fixes kubeflow#4514. (kubeflow#4515) * add tests for pythonic and non-pythonic component outputs * fix: graph for non-pythonic container output's names Loading container component from component.yaml creates both pythonic and original output names. Graph component iterated over all outputs, using pythonic-to-output conversion on all. If some of the names are not identical to their pythonic versions, they rised KeyError on the lookup table. This commit fixes this problem by using default value for the lookup. * remove depythonification of outputs - not needed anymore
Pythonic-to-original output name mapping got a default value equal to the output's name. Solves #4514, this is the suggested solution described there. Also adds tests to check if the problem was resolved.
It seems it could be cherry-picked in the current release branch.