Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We currently have no way to convert normal (binary) variant data into shredded variant data, even though we have some support for reading shredded data.
Describe the solution you'd like
A new shred_variant function that takes a VariantArray and a DataType as input, and which shreds the variant values according to the data type.
For example, requesting DataType::Int64 would produce an output variant array with the schema:
{
  metadata: BINARY,
  value: BINARY,
  typed_value: LONG,
}
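To make the intended row-level behavior concrete, here is a minimal sketch using toy types rather than the real arrow-rs arrays. `ToyVariant`, `ShreddedInt64`, and `shred_as_int64` are hypothetical names invented for illustration, and the sketch assumes the usual shredding convention: a row whose value matches the requested type moves to `typed_value` (leaving `value` null), while any other row stays in `value`.

```rust
// Hypothetical stand-in for a decoded variant value (not the real arrow-rs type).
#[derive(Debug, Clone, PartialEq)]
enum ToyVariant {
    Int64(i64),
    Text(String),
}

// Hypothetical stand-in for the `value` and `typed_value` columns of the output.
#[derive(Debug, PartialEq)]
struct ShreddedInt64 {
    value: Vec<Option<ToyVariant>>,
    typed_value: Vec<Option<i64>>,
}

// Shred each row as Int64: matching rows populate `typed_value`,
// non-matching rows fall back to the untyped `value` column.
fn shred_as_int64(input: &[ToyVariant]) -> ShreddedInt64 {
    let mut value = Vec::with_capacity(input.len());
    let mut typed_value = Vec::with_capacity(input.len());
    for row in input {
        match row {
            ToyVariant::Int64(i) => {
                value.push(None);
                typed_value.push(Some(*i));
            }
            other => {
                value.push(Some(other.clone()));
                typed_value.push(None);
            }
        }
    }
    ShreddedInt64 { value, typed_value }
}
```

In this sketch every row populates exactly one of the two columns; the real function would additionally have to carry the `metadata` column through unchanged and handle nulls.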
And requesting DataType::Struct with two integer fields a and b would produce an output variant array with the schema:
{
  metadata: BINARY,
  value: BINARY,
  typed_value: {
    a: {
      value: BINARY,
      typed_value: INT,
    },
    b: {
      value: BINARY,
      typed_value: INT,
    },
  }
}
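The two schemas above follow a mechanical rule, which can be sketched as a small recursive mapping from a logical type to a physical shredding schema. This is only an illustration under stated assumptions: `LogicalType`, `PhysicalNode`, and the three functions are hypothetical names, not part of arrow-rs, and only the integer and struct cases shown above are covered.

```rust
// Hypothetical logical type a caller would request (stand-in for arrow's DataType).
#[derive(Debug, Clone, PartialEq)]
enum LogicalType {
    Int32,
    Int64,
    Struct(Vec<(String, LogicalType)>),
}

// Hypothetical node of the physical shredding schema.
#[derive(Debug, Clone, PartialEq)]
enum PhysicalNode {
    Binary,
    Int,
    Long,
    Group(Vec<(String, PhysicalNode)>),
}

// Physical type of a `typed_value` column for the given logical type.
fn typed_value_node(logical: &LogicalType) -> PhysicalNode {
    match logical {
        LogicalType::Int32 => PhysicalNode::Int,
        LogicalType::Int64 => PhysicalNode::Long,
        // Each struct field becomes its own { value, typed_value } group.
        LogicalType::Struct(fields) => PhysicalNode::Group(
            fields
                .iter()
                .map(|(name, ty)| (name.clone(), shredded_field(ty)))
                .collect(),
        ),
    }
}

// A nested shredded field: { value: BINARY, typed_value: <T> }.
fn shredded_field(logical: &LogicalType) -> PhysicalNode {
    PhysicalNode::Group(vec![
        ("value".to_string(), PhysicalNode::Binary),
        ("typed_value".to_string(), typed_value_node(logical)),
    ])
}

// The top level additionally carries the shared `metadata` column.
fn physical_schema(logical: &LogicalType) -> PhysicalNode {
    PhysicalNode::Group(vec![
        ("metadata".to_string(), PhysicalNode::Binary),
        ("value".to_string(), PhysicalNode::Binary),
        ("typed_value".to_string(), typed_value_node(logical)),
    ])
}
```

The point of the sketch is that the caller only states the logical type; the `value`/`typed_value` wrapping at every level is derived, which is the work a physical shredding schema would otherwise push onto the caller.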
It's an open question whether the function should support re-shredding operations, where the input is allowed to contain a typed_value column of the "wrong" type.
Describe alternatives you've considered
I originally thought variant_get would fill this role, but:
- We don't actually have any way to pass a specific variant shredding schema to variant_get.
- Even if we did have a way to pass a shredding schema, the caller would have to provide a physical shredding schema instead of the more intuitive logical shredding schema.
- Even if variant_get could express it, and users didn't mind creating physical shredding schemas, it's still nice to have a dedicated name for that subset of functionality.