-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new APIs to AddressableNonNullValueList to copy/append a stream of bytes #8653
Conversation
✅ Deploy Preview for meta-velox canceled.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aditi-pandit Aditi, can you remind me how are you planning to serialize distinct accumulators? Will you extract each accumulator into a single blob (VARBINARY values) or an array of blobs (ARRAY(VARBINARY)) or something else?
Would it be more efficient to serialize / deserialize all entries in bulk rather than one at a time?
@mbasmanova : Answers inline
The implementation extracts each accumulator into a single blob. #8660 has the SetAccumulator serialization code.
Do you mean serialize multiple ComplexTypes in one go ? Or do you mean something else ? One issue for serializing multiple ComplexTypes is that we need a hash of each ComplexType value at a time for ComplexTypeSetAccumulator (using BaseVector::hashValueAt() logic). If there is a way to reconstruct such a hash from the ComplexType value one at a time, then doing a bulk write of values and reconstructing hash on read is possible. Right now, I didn't see a way to do this in the code. But its possible to do some refactoring for it. wdyt ? |
e10b743
to
eae6913
Compare
eae6913
to
a13c28b
Compare
Confirmed with Masha about this limitation for bulk serialize/deserialize. |
@mbasmanova : PTAL. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aditi-pandit Overall looks good % some comments.
6705014
to
be6a221
Compare
@mbasmanova : PTAL. |
be6a221
to
a3b0469
Compare
@mbasmanova : Thanks for your review comments. PTAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@mbasmanova : There are failures in Facebook Linter for this. Do you need any help from me ? Would be great to get the error messages. |
@mbasmanova merged this pull request in 02a4f6c. |
Conbench analyzed the 1 benchmark run on commit There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
…f bytes (facebookincubator#8653) Summary: This is the first in a set of PRs to add support for spilling distinct aggregations (see full version in facebookincubator#7791). Spilling distinct aggregations needs support to spill SetAccumulators in which input values are cumulated. ComplexTypeSetAccumulators use AddressableNonNullValueList to serialize complex types. This PR adds new APIs to AddressableNonNullValueList so that it can copy/append a stream of bytes corresponding to a ComplexType value (array, map, struct). Pull Request resolved: facebookincubator#8653 Reviewed By: Yuhta Differential Revision: D53497000 Pulled By: mbasmanova fbshipit-source-id: 66d44d02a2c3bd5775725c8b8559feaed17c0813
…f bytes (facebookincubator#8653) Summary: This is the first in a set of PRs to add support for spilling distinct aggregations (see full version in facebookincubator#7791). Spilling distinct aggregations needs support to spill SetAccumulators in which input values are cumulated. ComplexTypeSetAccumulators use AddressableNonNullValueList to serialize complex types. This PR adds new APIs to AddressableNonNullValueList so that it can copy/append a stream of bytes corresponding to a ComplexType value (array, map, struct). Pull Request resolved: facebookincubator#8653 Reviewed By: Yuhta Differential Revision: D53497000 Pulled By: mbasmanova fbshipit-source-id: 66d44d02a2c3bd5775725c8b8559feaed17c0813
…f bytes (facebookincubator#8653) Summary: This is the first in a set of PRs to add support for spilling distinct aggregations (see full version in facebookincubator#7791). Spilling distinct aggregations needs support to spill SetAccumulators in which input values are cumulated. ComplexTypeSetAccumulators use AddressableNonNullValueList to serialize complex types. This PR adds new APIs to AddressableNonNullValueList so that it can copy/append a stream of bytes corresponding to a ComplexType value (array, map, struct). Pull Request resolved: facebookincubator#8653 Reviewed By: Yuhta Differential Revision: D53497000 Pulled By: mbasmanova fbshipit-source-id: 66d44d02a2c3bd5775725c8b8559feaed17c0813
This is the first in a set of PRs to add support for spilling distinct aggregations (see full version in #7791).
Spilling distinct aggregations needs support to spill SetAccumulators in which input values are cumulated. ComplexTypeSetAccumulators use AddressableNonNullValueList to serialize complex types. This PR adds new APIs to AddressableNonNullValueList so that it can copy/append a stream of bytes corresponding to a ComplexType value (array, map, struct).