-
Notifications
You must be signed in to change notification settings - Fork 333
Bugfix: Pydantic deserialization for FlyteFile and FlyteDirectory #3339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Bugfix: Pydantic deserialization for FlyteFile and FlyteDirectory #3339
Conversation
Thank you for opening this pull request! 🙌 These tips will help get your PR across the finish line:
|
if info.context is None or info.context.get("deserialize") is not True: | ||
return self | ||
# Check if all private attributes are already set up (e.g., from __init__) | ||
if hasattr(self, "_downloader") and hasattr(self, "_remote_source"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
Do we need to check for the other private vars like _downloaded
too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or will they all together be set or not set anyways?
|
||
dict_obj = {"path": str(self.path)} | ||
metadata = getattr(self, "metadata", None) | ||
if metadata is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sanity check:
@gverkes did you test whether we need to do this for flyte directory too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested and can confirm that the fix works, thanks for fixing 🙏
My approval appears to be not enough to merge, I'm not a code owner for those dirs. Could you please take a look @davidmirror-ops or @pingsutw ? 🙏
Fixes #6669 Signed-off-by: Govert Verkes <gmverkes@gmail.com> Signed-off-by: Govert Verkes <govert@cusp.ai>
82990a6
to
628cf96
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #3339 +/- ##
===========================================
+ Coverage 48.48% 79.32% +30.83%
===========================================
Files 236 216 -20
Lines 23723 22687 -1036
Branches 2970 2973 +3
===========================================
+ Hits 11502 17996 +6494
+ Misses 11643 3854 -7789
- Partials 578 837 +259 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…y validators Signed-off-by: Govert Verkes <govert@cusp.ai>
e279566
to
f4a16e4
Compare
Fixes #6669
Tracking issue
Closes flyteorg/flyte#6669
Why are the changes needed?
When deserializing Pydantic models containing
FlyteFile
orFlyteDirectory
fields usingmodel_validate()
, the deserialized objects weremissing private attributes (
_remote_source
,_downloader
, etc.). This caused anAttributeError
when attempting to re-serialize theseobjects with
model_dump()
, breaking the serialize → deserialize → serialize cycle.This is a critical bug that prevents users from using FlyteFile/FlyteDirectory within Pydantic BaseModel classes in a normal way.
What changes were proposed in this pull request?
Added Pydantic model validators to both
FlyteFile
andFlyteDirectory
classes that ensure private attributes are properly initialized duringdeserialization:
FlyteFile (
flytekit/types/file/file.py
):deserialize_flyte_file
validator to check if private attributes existdict_to_flyte_file()
transformerFlyteDirectory (
flytekit/types/directory/types.py
):deserialize_flyte_dir
validatordict_to_flyte_directory()
to properly reconstruct the objectHow was this patch tested?
Added two new unit tests in
test_pydantic_basemodel_transformer.py
:test_flytefile_pydantic_model_dump_validate_cycle
- Verifies FlyteFile can be serialized, deserialized, and re-serialized without errorstest_flytedirectory_pydantic_model_dump_validate_cycle
- Same for FlyteDirectoryCheck all the applicable boxes
Summary by Bito
This pull request fixes a critical bug in deserializing Pydantic models with FlyteFile and FlyteDirectory fields by adding model validators for private attributes. It ensures proper initialization during deserialization and includes new unit tests to validate the functionality of these enhancements.