-
Notifications
You must be signed in to change notification settings - Fork 1k
[Variant] Allow VariantArray::value
to work with owned value bytes
#8430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self-review
VariantArrayValue::Borrowed(v) => v.metadata(), | ||
VariantArrayValue::Owned { metadata, .. } => metadata, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method can be implemented directly, without resorting to the cost of calling self.as_variant_cow()
/// Extracts the underlying [`VariantList`], if this is a variant array. | ||
/// | ||
/// See also [`Variant::as_list`]. | ||
pub fn as_list(&self) -> Option<VariantList<'m, '_>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two as_xxx methods are not as efficient as their Variant
counterparts, because they must return an owned object/list rather than a borrowed reference. But they're also used only in test code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a note in the docs clarifying that this is test-only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, if we think it's important we could mark them #[cfg(test)]
to force them test-only? But I'm not sure it really matters? They produce correct results, just not super-efficiently.
/// Extracts the value of the variant array element at `index`, if this is a variant object. | ||
/// | ||
/// See also [`Variant::get_list_element`]. | ||
pub fn get_list_element(&self, index: usize) -> Option<Variant<'m, '_>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Variant
these two methods return is limited to the lifetime of self
, because it could be owned.
This restriction didn't seem to impact any any code. Only unit tests call these methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were some weird uses of 'm
lifetime that should never have been there and caused lifetime problems in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for the change, just few nits
| ShreddingState::PartiallyShredded { typed_value, .. } | ||
if typed_value.is_valid(index) => | ||
{ | ||
return typed_value_to_variant(typed_value, index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
supernit: match the style with the other arms
return typed_value_to_variant(typed_value, index); | |
typed_value_to_variant(typed_value, index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the return
is needed due to differing return type...
/// Extracts the underlying [`VariantList`], if this is a variant array. | ||
/// | ||
/// See also [`Variant::as_list`]. | ||
pub fn as_list(&self) -> Option<VariantList<'m, '_>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a note in the docs clarifying that this is test-only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @scovich and @liamzwbao for this PR.
One thing I have been thinking about is "how much should we try and optimize unshred
of a single row"
Something feels off to me here about allocating individual Variant objects.
For example, what if the only way to get an unshredded value, was to unshred all the values of the whole array? This would result in a new array where all values were precomputed and could be references.
Would that allow for unshredded lists?
I am not opposed to this approach, but something just feels off to me
It's definitely a good question. Right now, |
Looking at Looking at
If we want to handle the first one, we would need to drill further into the shredded variant data, attempting to follow the paths requested by the user's input shredding schema. |
Once the Variant extension type merges properly, |
@alamb -- I think your intuition was spot-on, PTAL? |
Marking as draft as I don't think this one is waiting on review anymore and we are currently investigating other options |
Which issue does this PR close?
Struct
#8336Rationale for this change
Today,
Variant
must always refer to borrowed bytes, for objects and arrays (primitive values are parsed out and have no byte references at all). Unfortunately, whenVariantArray::value
unshreds a value, the resulting bytes are owned, not borrowed. Which makes it impossible for that method to meaningfully support arrays and objects.What changes are included in this PR?
Define a new
VariantArrayValue
wrapper, which allows returning variant values based on owned or borrowed bytes. It hides most of the complexity from call sites by defining many similar methods asVariant
and also by implementing a strategic selection of traits such asPartialEq
.The biggest user surface change is that
VariantArrayBuilder
code of the form:becomes
Are these changes tested?
Extensive coverage from existing unit tests (many of which were updated to use the new API)
Are there any user-facing changes?
Yes. The new
VariantArrayValue
class is public, andVariantArray::value
type signature changed to return it.