
feat: support union type for basic types #510


Open: wants to merge 94 commits into main

Conversation


@chardoncs commented May 18, 2025

Here is my solution for union type support. It includes basic conversions and representations for the union type.

If there are any problems with the change, feel free to request revisions or reject it.

Context

This PR is an attempt to resolve Issue #436, which contains the description and conceptual work on union type support.

Changes

  • Added a union variant to the basic value type, along with a new utility struct, UnionType.
    • [Breaking] Added the union option to the BasicValueType.
    • UnionType includes a BTreeSet to store the types, with extra features.
      • Deduplication: e.g., int | str | int becomes int | str.
      • Auto-sorting: the current strategy sorts members in the order in which the types are defined in BasicValueType. It might be changed to explicit ranks.
      • Auto-flattening: e.g., int | (int | str) | str | float becomes int | str | float.
      • If there is only one type, the parser returns that exact type instead of a union.
      • String guessing: if several of UUID, DateTime (LocalDateTime, OffsetDateTime, Date, and Time), JSON, and String appear in the same union, the parser tries to parse the value against each candidate type in reverse definition order (this might be changed to explicit ranks).
  • Union type conversion for Qdrant.
  • Union type conversion for PostgreSQL.
  • JSON Schema for union type, using oneOf.
  • Format display: Union[type1 | type2 | ...].
  • [Breaking] Added union support for the type encoder in the Python API.
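The normalization rules and the string-guessing order listed above can be modeled in a few lines. The following is a minimal Python sketch of the described behavior, not the PR's Rust implementation: `BasicType`, its member order, `make_union`, and `guess_from_string` are all hypothetical names chosen for illustration, and only a small subset of types is covered.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass
from datetime import date
from enum import IntEnum
from typing import Any, Iterable, Union


class BasicType(IntEnum):
    """Hypothetical subset of basic types; the values model declaration order."""

    INT = 0
    STR = 1
    FLOAT = 2
    UUID = 3
    DATE = 4


@dataclass(frozen=True)
class UnionType:
    members: tuple[BasicType, ...]  # unique, sorted by declaration order

    def __str__(self) -> str:
        return "Union[" + " | ".join(m.name.lower() for m in self.members) + "]"


TypeExpr = Union[BasicType, UnionType]


def make_union(parts: Iterable[TypeExpr]) -> TypeExpr:
    """Build a normalized union from basic types and nested unions."""
    flat: set[BasicType] = set()
    for part in parts:
        if isinstance(part, UnionType):
            flat.update(part.members)  # auto-flattening of nested unions
        else:
            flat.add(part)  # the set takes care of deduplication
    if not flat:
        raise ValueError("a union needs at least one member type")
    ordered = tuple(sorted(flat))  # IntEnum sorts by value, i.e. declaration order
    if len(ordered) == 1:
        return ordered[0]  # a single member collapses to the plain type
    return UnionType(ordered)


# Assumed parsers for the string-like members of this sketch.
_STRING_PARSERS = {
    BasicType.STR: str,
    BasicType.UUID: uuid.UUID,
    BasicType.DATE: date.fromisoformat,
}


def guess_from_string(text: str, union: UnionType) -> tuple[BasicType, Any]:
    """Try string-like members in reverse declaration order; first success wins."""
    for member in reversed(union.members):
        parser = _STRING_PARSERS.get(member)
        if parser is None:
            continue  # not a string-like type in this sketch
        try:
            return member, parser(text)
        except ValueError:
            continue  # fall through to the next, less specific type
    raise ValueError(f"{text!r} does not match any string-like member of {union}")
```

Because plain strings are declared before the more specific types in this sketch, the reverse order tries the specific parsers first and only falls back to `STR` when they all fail.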

Member

@badmonster0 left a comment


Thanks for your patience! Very close!

Member


Note that TypedValue in this file also needs to be updated. For UnionType, we should serialize only the value part, without the tag.
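To illustrate the difference between tagged and untagged output that this comment refers to, here is a small Python sketch. It is only one reading of the request: `UnionValue`, `serialize_tagged`, and `serialize_untagged` are hypothetical names, and the project's actual TypedValue (not shown in this conversation) may look quite different.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class UnionValue:
    """Hypothetical in-memory form: the resolved member type plus the payload."""

    tag: str
    value: Any


def serialize_tagged(v: UnionValue) -> str:
    # Tagged form: the member type travels with the value.
    return json.dumps({"tag": v.tag, "value": v.value})


def serialize_untagged(v: UnionValue) -> str:
    # Untagged form: only the payload is written; the tag is dropped.
    return json.dumps(v.value)
```

With the untagged form, a reader has to recover the member type from the schema or from the value itself, which is where the string-guessing order described in the PR summary would come in.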

@chardoncs
Author

I just spared some time to check the changes. Looks like everything has been addressed for now. But it needs thorough testing.

@badmonster0
Member

badmonster0 commented Jun 7, 2025

> I just spared some time to check the changes. Looks like everything has been addressed for now. But it needs thorough testing.

That's great, thanks!

I just added a testutil validate_full_roundtrip():

def validate_full_roundtrip(
    value: Any, output_type: Any, input_type: Any | None = None
) -> None:
    ...

It can be used in tests to cover multiple encode/decode/serialize/deserialize functions on both the Rust and Python sides. Would you try adding a few tests for union types with it and see if they pass?
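A test of the kind requested here might take roughly the following shape. This is a sketch under strong assumptions: the `validate_full_roundtrip` defined below is a local stand-in that only does a JSON round trip and is NOT the project's helper (only its signature comes from the comment above), and `test_union_roundtrip` is a hypothetical test name.

```python
from __future__ import annotations

import json
from typing import Any, Union, get_args


def validate_full_roundtrip(
    value: Any, output_type: Any, input_type: Any | None = None
) -> None:
    """Stand-in for illustration only; the real helper exercises Rust and Python paths."""
    decoded = json.loads(json.dumps(value))  # stand-in for encode/decode
    members = get_args(output_type) or (output_type,)  # union members, or the type itself
    assert decoded == value
    assert isinstance(decoded, members)  # input_type is unused in this stand-in


def test_union_roundtrip() -> None:
    # Each member of the union should survive the round trip unchanged.
    for value in (42, "hello"):
        validate_full_roundtrip(value, Union[int, str])
```

In the real test suite, the stand-in would be replaced by an import of the project's helper, leaving only the test function.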

@chardoncs
Author

Sorry for the late reply. I was sick these days and couldn't think or code.

I will check the change and add test cases, and try not to delay this PR any further.

@badmonster0
Member

badmonster0 commented Jun 11, 2025

> Sorry for the late reply. I was sick these days and couldn't think or code.
>
> I will check the change and add test cases, and try not to delay this PR any further.

@chardoncs I'm sorry to hear you're sick. No worries, take your time and have enough rest! Hope you'll be better soon.


2 participants