[Feature] Include _data in Event serialization by default. #16841

Merged
merged 2 commits into main from nerdai/include-event-data-in-serialization
Nov 6, 2024

Conversation

nerdai

@nerdai nerdai commented Nov 5, 2024

Description

With our Event abstraction, we can store data in Fields, in PrivateAttrs, and in _data. Specifically, a user can subclass Event to define their own custom Fields and PrivateAttrs. In addition, _data acts as an underlying dict:

  • Any parameters passed at init that are not a Field or PrivateAttr of the Event get added to _data.
  • We can set/get variables in _data via a dictionary-like interface as well as via dot notation:
ev = MyCustomEvent(field_1=..., not_a_field=...)
ev.get("not_a_field") == ev.not_a_field
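To make the routing rule concrete, here is a minimal plain-Python sketch of the behavior described above. It is not the actual Event class: `SketchEvent` and `declared_fields` are hypothetical names used only for illustration, and the real implementation is built on Pydantic rather than on a hand-written `__getattr__`.

```python
from typing import Any, Dict


class SketchEvent:
    """Toy stand-in for an Event (hypothetical, not the llama_index class)."""

    # Names treated as declared fields in this sketch (an assumption).
    declared_fields = ("field_1",)

    def __init__(self, **params: Any) -> None:
        data: Dict[str, Any] = {}
        for name, value in params.items():
            if name in self.declared_fields:
                # Declared fields become regular attributes.
                setattr(self, name, value)
            else:
                # Everything else is routed into the underlying dict.
                data[name] = value
        self._data = data

    def get(self, key: str, default: Any = None) -> Any:
        """Dictionary-like read access to _data."""
        return self._data.get(key, default)

    def __getattr__(self, name: str) -> Any:
        # Only called when normal attribute lookup fails, so declared
        # fields never reach this method.
        data = self.__dict__.get("_data", {})
        if name in data:
            return data[name]
        raise AttributeError(name)


ev = SketchEvent(field_1=1, not_a_field="extra")
```

In this sketch, `ev.get("not_a_field")` and `ev.not_a_field` read the same entry of `_data`, while `field_1` stays a regular attribute.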

This goes somewhat against Pydantic BaseModel convention, in that PrivateAttrs start with an underscore ("_") and are known not to be included in model serialization. So, should we include _data by default in model serialization?
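For reference, the default Pydantic behavior in question can be reproduced with a small sketch, assuming Pydantic v2. `PlainModel` is a hypothetical model, not the Event class from this PR.

```python
from typing import Any, Dict

from pydantic import BaseModel, PrivateAttr


class PlainModel(BaseModel):
    """Hypothetical model with one regular field and one private attribute."""

    param: str
    _data: Dict[str, Any] = PrivateAttr(default_factory=dict)


m = PlainModel(param="a")
m._data["extra"] = 1

# Private attributes are left out of the dump by default.
dumped = m.model_dump()
```

Here `dumped` contains only `param`, so anything stored in `_data` is lost on a dump/load round trip unless the serialization is customized.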

Type of Change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.

  • I added new unit tests to cover this change

@nerdai nerdai requested a review from masci November 5, 2024 22:14
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Nov 5, 2024
@@ -68,7 +74,8 @@ def __init__(self, **params: Any):
        super().__init__(**fields)
        for private_attr, value in private_attrs.items():
            super().__setattr__(private_attr, value)
        self._data = data
        if data:
Contributor Author

Note: the super().__init__(...) call already sets the PrivateAttr _data, since we defined its default_factory.
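A small sketch of that point, assuming Pydantic v2: once `super().__init__()` returns, a PrivateAttr with a `default_factory` already holds a fresh value, so it can be updated in place instead of reassigned. `Holder` is a hypothetical class and does not reproduce the field/private-attribute splitting done in the real `__init__`.

```python
from typing import Any, Dict

from pydantic import BaseModel, PrivateAttr


class Holder(BaseModel):
    """Hypothetical model; every init parameter is treated as extra data."""

    _data: Dict[str, Any] = PrivateAttr(default_factory=dict)

    def __init__(self, **params: Any) -> None:
        super().__init__()
        # _data was created by default_factory during super().__init__(),
        # so it only needs updating when there is something to add.
        if params:
            self._data.update(params)


empty = Holder()
filled = Holder(x=1)
```

Each instance gets its own dict from the factory, so `empty._data` and `filled._data` do not share state.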

@nerdai nerdai force-pushed the nerdai/include-event-data-in-serialization branch from 8f7fde9 to 45af7da Compare November 5, 2024 22:23
deseriazlied_ev,
)
assert ev.param == deseriazlied_ev.param
assert ev._data == deseriazlied_ev._data
Contributor Author

This last assertion fails without the custom additions to Event.model_dump() in this PR.
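One way such a customization could look is sketched below, assuming Pydantic v2 and its wrap-mode `model_serializer`. This is an illustration rather than the code merged in this PR: `DataEvent`, `dump_with_data`, and `from_dump` are hypothetical names, and the actual Event class may restore `_data` differently.

```python
from typing import Any, Dict

from pydantic import BaseModel, PrivateAttr, model_serializer


class DataEvent(BaseModel):
    """Hypothetical event-like model that serializes its private _data."""

    param: str
    _data: Dict[str, Any] = PrivateAttr(default_factory=dict)

    @model_serializer(mode="wrap")
    def dump_with_data(self, handler: Any) -> Dict[str, Any]:
        # Run the default serialization first, then attach _data.
        dumped = handler(self)
        dumped["_data"] = dict(self._data)
        return dumped

    @classmethod
    def from_dump(cls, dumped: Dict[str, Any]) -> "DataEvent":
        # Hypothetical helper: split _data back out before validating fields.
        payload = dict(dumped)
        data = payload.pop("_data", {})
        ev = cls(**payload)
        ev._data.update(data)
        return ev


ev = DataEvent(param="a")
ev._data["extra"] = 1

dumped = ev.model_dump()
restored = DataEvent.from_dump(dumped)
```

With the serializer in place, the dump carries a `_data` key, and the restored instance compares equal to the original on both `param` and `_data`.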

Member

@masci masci left a comment

LGTM, thanks for the thorough explanation!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Nov 6, 2024
@logan-markewich logan-markewich merged commit 91e5023 into main Nov 6, 2024
11 checks passed
@logan-markewich logan-markewich deleted the nerdai/include-event-data-in-serialization branch November 6, 2024 21:12
YouNeedCryDear pushed a commit to raveharpaz/llama_index that referenced this pull request Nov 8, 2024
Labels
lgtm This PR has been approved by a maintainer size:M This PR changes 30-99 lines, ignoring generated files.

3 participants