Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support default values in typing.List[dataclass] and typing.Dict[str, dataclass] #2603

Merged
merged 53 commits into from
Aug 26, 2024

Conversation

mao3267
Copy link
Contributor

@mao3267 mao3267 commented Jul 23, 2024

Tracking issue

flyteorg/flyte#5569

Why are the changes needed?

We want to support typing.List[dataclass] and typing.Dict[dataclass] to fix the bug.

What changes were proposed in this pull request?

  1. Do convert recursively in JsonParamType to handle nested dataclasses.
  2. Fix errors while converting from literal back to python value with default value = None (type_engine.py)
  3. Unit tests for nested dataclasses

How was this patch tested?

import typing
from dataclasses_json import dataclass_json
from flytekit import task, workflow

from dataclasses import dataclass


@dataclass_json
@dataclass
class User:
    name: str = "default"
    github_handle: str = "default"


@task
def multiple_items_task(users: typing.List[User]) -> typing.List[User]:
    return users


@workflow
def multiple_items(users: typing.List[User]) -> typing.List[User]:
    return multiple_items_task(users=users)

@task
def multiple_items_dict_task(users: typing.Dict[str, User]) -> typing.Dict[str, User]:
    return users

@workflow
def multiple_items_dict(users: typing.Dict[str, User]) -> typing.Dict[str, User]:
    return multiple_items_dict_task(users=users)

Simple Tests in CLI:

Dict[str, dataclass]

image

List[dataclass]

image

Run on remote

image image

Unit test

A new unit test test_nested_dataclass_type is added to check if it supports nested dataclass types and default values.

Setup process

git clone https://github.com/mao3267/flytekit.git
git checkout handle-dataclass-default-value
make setup && pip install -e .

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

None

Docs link

TODO

Copy link

codecov bot commented Aug 2, 2024

Codecov Report

Attention: Patch coverage is 91.30435% with 2 lines in your changes missing coverage. Please review.

Project coverage is 71.67%. Comparing base (54f0a46) to head (cd8877f).
Report is 1 commits behind head on master.

Files Patch % Lines
flytekit/interaction/click_types.py 87.50% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #2603       +/-   ##
===========================================
+ Coverage   39.16%   71.67%   +32.51%     
===========================================
  Files         187      187               
  Lines       19236    19221       -15     
  Branches     4019     2785     -1234     
===========================================
+ Hits         7533    13776     +6243     
+ Misses      11470     4770     -6700     
- Partials      233      675      +442     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mao3267 mao3267 changed the title [WIP] Support default values in list[dataclass] and dict[dataclass] Support default values in list[dataclass] and dict[dataclass] Aug 2, 2024
mao3267 and others added 28 commits August 2, 2024 21:34
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
* truncate sagemaker agent outputs

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix tests and update agent output

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* lint

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix test

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add idempotence token to workflow

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix type

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix mixin

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* modify output handler

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* make the dictionary deterministic

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* nit

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

---------

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
…tionable (flyteorg#2594)

Signed-off-by: Fabio Grätz <fabiogratz@googlemail.com>
Co-authored-by: Fabio Grätz <fabiogratz@googlemail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
…-auth credentials in auth interceptor (flyteorg#2591)

Signed-off-by: Fabio Grätz <fabiogratz@googlemail.com>
Co-authored-by: Fabio Grätz <fabiogratz@googlemail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
* validate idempotence token length in subsequent tasks

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove redundant param

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add tests

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

---------

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
…teorg#2602)

* eliminate redundant literal conversion for  type

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add test

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* lint

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add isclass check

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

---------

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
* add nim plugin

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* move nim to inference

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* import fix

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix port

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add pod_template method

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add containers

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* update

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* clean up

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove cloud import

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix extra config

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove decorator

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add tests, update readme

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add env

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add support for lora adapter

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* minor fixes

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add startup probe

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* increase failure threshold

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove ngc secret group

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* move plugin to flytekit core

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix docs

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove hf group

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* modify podtemplate import

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix import

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix ngc api key

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix tests

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix formatting

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* lint

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* docs fix

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* docs fix

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* update secrets interface

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add secret prefix

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* fix tests

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add urls

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add urls

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove urls

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* minor modifications

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* remove secrets prefix; add failure threshold

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add hard-coded prefix

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* add comment

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* make secrets prefix a required param

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* move nim to flytekit plugin

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* update readme

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* update readme

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

* update readme

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>

---------

Signed-off-by: Samhita Alla <aallasamhita@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Thomas J. Fan <thomasjpfan@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
)

Signed-off-by: aditya7302 <aditya7302@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Co-authored-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Jan Fiedler <jan@union.ai>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: JackUrb <jack@datologyai.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
* set up array node

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* wip array node task wrapper

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* support function like callability

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* temp check in some progress on python func wrapper

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* only support launch plans in new array node class for now

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* add map task array node implementation wrapper

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* ArrayNode only supports LPs for now

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* support local execute for new array node implementation

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* add local execute unit tests for array node

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* set exeucution version in array node spec

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* check input types for local execute

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* remove code that is un-needed for now

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* clean up array node class

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* improve naming

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* clean up

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* utilize enum execution mode to set array node execution path

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* default execution mode to FULL_STATE for new array node class

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* support min_successes for new array node

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* add map task wrapper unit test

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* set min successes for array node map task wrapper

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* update docstrings

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* Install flyteidl from master in plugins tests

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>

* lint

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* clean up min success/ratio setting

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* lint

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

* make array node class callable

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>

---------

Signed-off-by: Paul Dittamo <pvdittamo@gmail.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Co-authored-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Co-authored-by: Future-Outlier <eric901201@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
* Added alt prefix head to FlyteFile.new_remote

Signed-off-by: pryce-turner <pryce.turner@gmail.com>

* Added get_new_path method to FileAccessProvider, fixed new_remote method of FlyteFile

Signed-off-by: pryce-turner <pryce.turner@gmail.com>

* Updated tests and added new path creator to FlyteFile/Dir new_remote methods

Signed-off-by: pryce-turner <pryce.turner@gmail.com>

* Improved docstrings, fixed minor path sep bug, more descriptive naming, better test

Signed-off-by: pryce-turner <pryce.turner@gmail.com>

* Formatting

Signed-off-by: pryce-turner <pryce.turner@gmail.com>

---------

Signed-off-by: pryce-turner <pryce.turner@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
* Remove use of multiprocessing from the OAuth client

Signed-off-by: Robert Deaton <robert.deaton@freenome.com>

* Lint

Signed-off-by: Robert Deaton <robert.deaton@freenome.com>

---------

Signed-off-by: Robert Deaton <robert.deaton@freenome.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
@mao3267
Copy link
Contributor Author

mao3267 commented Aug 7, 2024

cc @pingsutw Please help me review. In this PR, we have supported list[dataclass], dict[dataclass], and more nested dataclass types. However, since we haven't supported the conversion of dataclass with Flyte types members, now we also don't support other nested dataclass with Flyte Types.

from dataclasses import dataclass

@dataclass
class Datum:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! could we also add a few more tests?

  1. Another dataclass in the Datum
  2. List of dataclass in the Datum
  3. Flyte type in the dataclass (expect to see an error?)
  4. Add a nested dataclass in this test

# In python 3.7, 3.8, DataclassJson will deserialize Annotated[StructuredDataset, kwtypes(..)] to a dict,
# so here we convert it back to the Structured Dataset.
from flytekit.types.structured import StructuredDataset

if python_val is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when will python_val be None here? could we add a small test for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When dataclass members of type typing.Dict or typing.List default to None, this can lead to exceptions when iterating the list/dict from L552-588. We do need to add a test here.

@@ -307,9 +309,29 @@ def convert(

parsed_value = self._parse(value, param)

def has_nested_dataclass(t: typing.Type) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add some comment for this function. If I pass a dataclass (not flyte type) to this function, it will return True here as well. That seems like not correct.

mao3267 and others added 10 commits August 13, 2024 22:38
Signed-off-by: mao3267 <chenvincent610@gmail.com>
… None, and flyte type exceptions

Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: mao3267 <chenvincent610@gmail.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
Comment on lines +581 to +587
if python_val is None:
return None
return [self._make_dataclass_serializable(v, get_args(python_type)[0]) for v in cast(list, python_val)]

if hasattr(python_type, "__origin__") and get_origin(python_type) is dict:
if python_val is None:
return None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing, thank you for fixing this.

…_with_optional_fields

Signed-off-by: Future-Outlier <eric901201@gmail.com>
Comment on lines +343 to +344
if get_args(self._python_type) == ():
return parsed_value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this for?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment added by Vincent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for native list and dict types, which don't have covariant types. Therefore, we can not index the return value from get_args.

Future-Outlier and others added 2 commits August 26, 2024 11:27
Signed-off-by: Future-Outlier <eric901201@gmail.com>
…ass, add comments

Signed-off-by: mao3267 <chenvincent610@gmail.com>
@Future-Outlier Future-Outlier changed the title Support default values in list[dataclass] and dict[dataclass] Support default values in typing.List[dataclass] and typing.Dict[dataclass] Aug 26, 2024
Copy link
Member

@Future-Outlier Future-Outlier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you so much

@Future-Outlier Future-Outlier enabled auto-merge (squash) August 26, 2024 06:21
@Future-Outlier Future-Outlier changed the title Support default values in typing.List[dataclass] and typing.Dict[dataclass] Support default values in typing.List[dataclass] and typing.Dict[str, dataclass] Aug 26, 2024
@Future-Outlier Future-Outlier merged commit 83b90fa into flyteorg:master Aug 26, 2024
99 of 100 checks passed
@mao3267 mao3267 deleted the handle-dataclass-default-value branch November 11, 2024 06:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.