Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Error in workflow compilation logic for special variable names #5511

Closed
2 tasks done
ddl-rliu opened this issue Jun 25, 2024 · 4 comments
Closed
2 tasks done

[BUG] Error in workflow compilation logic for special variable names #5511

ddl-rliu opened this issue Jun 25, 2024 · 4 comments
Labels
backlogged For internal use. Reserved for contributor team workflow. bug Something isn't working flytekit FlyteKit Python related issue

Comments

@ddl-rliu
Copy link
Contributor

ddl-rliu commented Jun 25, 2024

Describe the bug

Compile-time validation is inconsistent between workflow run and workflow re-run. Minimal repro:

Special attention on the lines: inputs=kwtypes(**{"my.value": float, "c": float}) and calculate_ellipse_area_shell(**{"my.value": a, "c": b})

import logging

from flytekit import ContainerTask, kwtypes, task, workflow

logger = logging.getLogger(__file__)

calculate_ellipse_area_shell = ContainerTask(
    name="ellipse-area-metadata-shell",
    input_data_dir="/var/inputs",
    output_data_dir="/var/outputs",
    inputs=kwtypes(**{"my.value": float, "c": float}),
    outputs=kwtypes(area=float, metadata=str),
    image="ghcr.io/flyteorg/rawcontainers-shell:v2",
    command=[
        "./calculate-ellipse-area.sh"
    ],
)

@task
def report_all_calculated_areas(
    area_shell: float,
    metadata_shell: str,
):
    logger.info(f"shell: area={area_shell}, metadata={metadata_shell}")

@workflow
def wf(a: float, b: float):
    area_shell, metadata_shell = calculate_ellipse_area_shell(**{"my.value": a, "c": b})

    # Report on all results in a single task to simplify comparison
    report_all_calculated_areas(
        area_shell=area_shell,
        metadata_shell=metadata_shell,
    )

Actual behavior:

  1. (Step 1) Run this workflow, it compiles correctly.
  2. (Step 2) Try to rerun the task, the compilation step fails with an error similar to:
    Failed to run the Node: Failed to create the execution: failed to compile workflow for [resource_type:WORKFLOW project:"x" domain:"development" name:".flytegen.ellipse-area-metadata-shell" version:"x"] with err failed to compile workflow with err Collected Errors: 1 Error 0: Code: SyntaxError, Node Id: ellipse-area-metadata-shell, Description: Failed to parse element [my.value], err [unexpected number of matches [2] [[[my my] [value value]]]].

Expected behavior

This is a tricky bug because there are two options.

Option 1: Special characters in task inputs names are NOT okay

Rationale: flyte inputs names are "like" python variable names, which should not use special characters like . (see my.value in the repro code

Expected behavior: The workflow should actually fail at compile time at step 1. (e.g. perform extra validation)

Option 2: Special characters in task inputs names ARE okay

Rationale: flyte inputs names are a string value, and have no restriction on them unlike in python. For example, a flyte inputs name like 0myvar starts with 0, which is invalid python. However, this kind of input is valid Flyte task definition, and users today may be doing so. Introducing extra validation will break these users.

Expected behavior: Rerunning the task should not lead to any errors (step 2).

Additional context to reproduce

No response

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@ddl-rliu ddl-rliu added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Jun 25, 2024
Copy link

runllm bot commented Jun 25, 2024

According to the Flyte documentation, Flyte supports defining task inputs and outputs using the kwtypes utility. However, the use of special characters in input names like . is not explicitly mentioned, which might be leading to unexpected behavior during workflow compilation.

Describe the bug

When defining a ContainerTask with input names containing special characters (e.g., my.value), the workflow compiles correctly initially but fails during rerun with a syntax error.

Expected behavior

There are two potential options for handling this:

  1. Special characters in task input names are NOT okay:

    • The workflow should fail at compile time if special characters are used in input names, ensuring that only valid names are allowed.
  2. Special characters in task input names are okay:

    • The workflow should handle special characters correctly and not fail during rerun, ensuring consistent behavior.

Reproduction Steps

  1. Define a ContainerTask with special characters in input names.
  2. Create a workflow that uses this task.
  3. Run the workflow (compiles correctly).
  4. Try to rerun the task (fails with a syntax error).

Example Code

import logging
from flytekit import ContainerTask, kwtypes, task, workflow

logger = logging.getLogger(__file__)

calculate_ellipse_area_shell = ContainerTask(
    name="ellipse-area-metadata-shell",
    input_data_dir="/var/inputs",
    output_data_dir="/var/outputs",
    inputs=kwtypes(**{"my.value": float, "c": float}),
    outputs=kwtypes(area=float, metadata=str),
    image="ghcr.io/flyteorg/rawcontainers-shell:v2",
    command=[
        "./calculate-ellipse-area.sh"
    ],
)

@task
def report_all_calculated_areas(
    area_shell: float,
    metadata_shell: str,
):
    logger.info(f"shell: area={area_shell}, metadata={metadata_shell}")

@workflow
def wf(a: float, b: float):
    area_shell, metadata_shell = calculate_ellipse_area_shell(**{"my.value": a, "c": b})

    report_all_calculated_areas(
        area_shell=area_shell,
        metadata_shell=metadata_shell,
    )

Actual Behavior

  • Step 1: Workflow compiles and runs correctly.
  • Step 2: Rerunning the task fails with a syntax error.

Sources:


This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.

Mention @RunLLM to ask follow-up questions. See runllm.com for more!

@pingsutw pingsutw added flytekit FlyteKit Python related issue and removed untriaged This issues has not yet been looked at by the Maintainers labels Jun 25, 2024
@pingsutw
Copy link
Member

agreed. Workflow should fail at compile time at step 1.

@eapolinario eapolinario added the backlogged For internal use. Reserved for contributor team workflow. label Jun 27, 2024
@ddl-rliu
Copy link
Contributor Author

ddl-rliu commented Jun 28, 2024

I believe this is resolved as per flyteorg/flytekit#2538 Thanks!

@pingsutw
Copy link
Member

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backlogged For internal use. Reserved for contributor team workflow. bug Something isn't working flytekit FlyteKit Python related issue
Projects
None yet
Development

No branches or pull requests

3 participants