Skip to content

Conversation

@friendlymatthew
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

This PR introduces a custom PartialEq impl for UnionFields that uses set semantics when determining equality

Currently, UnionFields derives PartailEq, but its internal representation is a list of (id, data type) tuples. As a result, equality is order-dependent, even though UnionFields is conceptually a set of unique (id, data type) pairs

@github-actions github-actions bot added the arrow Changes to the arrow crate label Dec 1, 2025
@friendlymatthew
Copy link
Contributor Author

cc @alamb this should be fairly quick to review!

}
}

impl PartialEq for UnionFields {
Copy link
Contributor

@tobixdev tobixdev Dec 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a possible way of getting the desired behavior. In our project, we do have some mildly large unions (10-ish currently, but maybe more in the future) that occur very frequently and I am a bit anxious on how the quadratic equality check will perform in practice, as the equality of a data type is often used when comparing fields, schemas, and logical plans.

Another approach would be sorting the slice when constructing the instance. Maybe this would be an equally simple change with better performance as this already a private field.

Note that I have no idea how this will actually perform as I have no benchmarks. Just my 2 cents.

Otherwise, this looks like a good change!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect PartialEq behavior for UnionFields

2 participants