Non-pythonic names for container component's output breaks loading of graph components #4514

Udiknedormin · 2020-09-18T14:56:43Z

What steps did you take:

Define yaml for container component having an output with non-pythonic name (e.g. out-1).
Define yaml for graph component containing that container as its task and passing its output as the graph output.
Compile inside of pipeline.

Could be illustrated by the following test to component_bridge_tests.py:

def test_nonpythonic_container_output_handled_by_graph(self):
    component_a = textwrap.dedent('''\
      inputs: []
      outputs:
        - {name: out1, type: str}
      implementation:
        graph:
          tasks:
            some-container:
              arguments: {}
              componentRef:
                spec:
                  outputs:
                  - {name: out-1, type: str}
                  implementation:
                    container:
                      image: busybox
                      command: [bash, -c, 'mkdir -p "$(dirname "$0")"; date > "$0"', {outputPath: out-1}]
          outputValues:
            out1:
              taskOutput:
                taskId: some-container
                outputName: out-1
    ''')
    component_b = textwrap.dedent('''\
        inputs:
        - {name: in1, type: str}
        implementation:
          container:
            image: busybox
            command: [echo, {inputValue: in1}]
    ''')
    task_factory_a = load_component_from_text(component_a)
    task_factory_b = load_component_from_text(component_b)
    a_task = task_factory_a()
    b_task = task_factory_b(in1=a_task.outputs['out1'])

What happened:

Compiler fails with KeyError on the non-pythonic output name.

What did you expect to happen:

Should compile just fine, just like it does for pythonic names (e.g. out1 or out_1).

Environment:

Environment-independent.

KFP SDK version: 1.0.0

Anything else you would like to add:

From _component_bridge.py (for current master's HEAD: d4e7398917):

# Previously, ContainerOp had strict requirements for the output names, so we had to
# convert all the names before passing them to the ContainerOp constructor.
# Outputs with non-pythonic names could not be accessed using their original names.
# Now ContainerOp supports any output names, so we're now using the original output names.
# However to support legacy pipelines, we're also adding output references with pythonic names.
# TODO: Add warning when people use the legacy output names.
output_names = [output_spec.name for output_spec in component_spec.outputs or []] # Stabilizing the ordering
output_name_to_python = generate_unique_name_conversion_table(output_names, _sanitize_python_function_name)
for output_name in output_names:
    pythonic_output_name = output_name_to_python[output_name]
    # Note: Some component outputs are currently missing from task.outputs (e.g. MLPipeline UI Metadata)
    if pythonic_output_name not in task.outputs and output_name in task.outputs:
        task.outputs[pythonic_output_name] = task.outputs[output_name]

Looks like the container component defines its outputs by both original and pythonic names.

In _components.py, however, iteration goes through all inputs and pythonic_output_name_to_original is used to map the names to their original versions:

task_outputs_with_original_names = {
  pythonic_output_name_to_original[pythonic_output_name]: output_value
  for pythonic_output_name, output_value in task_outputs_with_pythonic_names.items()
}

Combining the two: if the component bridge is used:

Non-pythonic output names gets listed on the output list twice: by its original name, and by its pythonic name.
pythonic_output_name_to_original gets a mapping pythonic-name --> original-name.
Iteration on all outputs takes both variants.
Pythonic-name variant passes successfully, original-name variant causes KeyError.

Suggested solution:
Make pythonizing the value optional, by adding a default value (di.get(key, default_value) instead of di[key]) equal to the value got. In case of pythonized names, it will still result in the original name. In case of original names, it won't change them. The doubled key will get reduced due to being inside dictionary-constructor.

task_outputs_with_original_names = {
  pythonic_output_name_to_original.get(pythonic_output_name, pythonic_output_name): output_value
  for pythonic_output_name, output_value in task_outputs_with_pythonic_names.items()
}

/kind bug

The text was updated successfully, but these errors were encountered:

Ark-kun · 2020-09-18T23:35:52Z

Thank you for discovering, researching and fixing this issue.

…ts. Fixes #4514. (#4515) * add tests for pythonic and non-pythonic component outputs * fix: graph for non-pythonic container output's names Loading container component from component.yaml creates both pythonic and original output names. Graph component iterated over all outputs, using pythonic-to-output conversion on all. If some of the names are not identical to their pythonic versions, they rised KeyError on the lookup table. This commit fixes this problem by using default value for the lookup. * remove depythonification of outputs - not needed anymore

…ts. Fixes kubeflow#4514. (kubeflow#4515) * add tests for pythonic and non-pythonic component outputs * fix: graph for non-pythonic container output's names Loading container component from component.yaml creates both pythonic and original output names. Graph component iterated over all outputs, using pythonic-to-output conversion on all. If some of the names are not identical to their pythonic versions, they rised KeyError on the lookup table. This commit fixes this problem by using default value for the lookup. * remove depythonification of outputs - not needed anymore

k8s-ci-robot added the kind/bug label Sep 18, 2020

Udiknedormin mentioned this issue Sep 18, 2020

fix(sdk): Allow non-pythonic names for graph components' task's outputs. Fixes #4514. #4515

Merged

Ark-kun self-assigned this Sep 18, 2020

Ark-kun added the area/sdk/components label Sep 18, 2020

k8s-ci-robot closed this as completed in 0b31879 Sep 29, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Non-pythonic names for container component's output breaks loading of graph components #4514

Non-pythonic names for container component's output breaks loading of graph components #4514

Udiknedormin commented Sep 18, 2020

Ark-kun commented Sep 18, 2020 •

edited

Loading

Non-pythonic names for container component's output breaks loading of graph components #4514

Non-pythonic names for container component's output breaks loading of graph components #4514

Comments

Udiknedormin commented Sep 18, 2020

What steps did you take:

What happened:

What did you expect to happen:

Environment:

Anything else you would like to add:

Ark-kun commented Sep 18, 2020 • edited Loading

Ark-kun commented Sep 18, 2020 •

edited

Loading