-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Reduce rehashing cost for primitive grouping by also reusing hash value #15962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce rehashing cost for primitive grouping by also reusing hash value #15962
Conversation
I think the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Rachelint -- I'll run the benchmarks
@@ -85,7 +85,7 @@ pub struct GroupValuesPrimitive<T: ArrowPrimitiveType> { | |||
/// | |||
/// We don't store the hashes as hashing fixed width primitives | |||
/// is fast enough for this not to benefit performance | |||
map: HashTable<usize>, | |||
map: HashTable<(usize, u64)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems like the comment right above this line is no longer accurate and needs to be updated as this PR exactly saves the hashes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
🤖 |
🤖: Benchmark completed Details
|
d565b48
to
454e789
Compare
Addressed. |
454e789
to
f2b5349
Compare
The result is as expected, the 3 fasters are all group by primitive + high cardinality. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @Rachelint and @Dandandan
Which issue does this PR close?
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?