-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] add hash function for xx_hash3_64 #28910
Conversation
Did you test the performance based on this PR compared to murmur_hash3_32? |
Since the implementation is quite simple and straight forward, I'm referred to the official benchmarks among the popular algo in the code "be/src/util/xxhash.h" as following:
|
I have seen the benchmark result. It is better to do tests based on this PR to make sure better performance can even exist in starrocks. |
Got you |
35ad058
to
f6bb8c0
Compare
@goldenbean can you share the performance results of this new function and murmur_hash3_32? |
c8b401c
to
0a7c1c0
Compare
UPDATED Benchmark Time CPU IterationsBM_HashFunctions_Eval/10/1/iterations:10000 910 ns 907 ns 10000 Hardware: Run on (12 X 2100 MHz CPU s) |
Signed-off-by: beans <jing.gao@outlook.com>
Signed-off-by: beans <jing.gao@outlook.com>
0a7c1c0
to
7a0820d
Compare
Form the benchmark, we can see the improvement. But the speed seems not fast enough as Reference: https://github.com/Cyan4973/xxHash. do you know the reason? |
@imay please let me do more tests on commercial server on Monday. |
Signed-off-by: beans <jing.gao@outlook.com>
Benchmark Time CPU IterationsBM_HashFunctions_Eval/10/1/iterations:10000 722 ns 722 ns 10000 Hardware: Run on (64 X 2593.98 MHz CPU s) |
be/src/exprs/hash_functions.h
Outdated
|
||
size_t size = columns[0]->size(); | ||
ColumnBuilder<TYPE_BIGINT> builder(size); | ||
for (int row = 0; row < size; ++row) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest to transform this row-priority two-layer loops into column-priority loops, compiler shall try best to auto vectorize column-priority implement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch ! I will do it soon
@goldenbean better to rename it as |
Signed-off-by: beans <jing.gao@outlook.com>
ok |
Kudos, SonarCloud Quality Gate passed!
|
[FE PR Coverage Check]😍 pass : 0 / 0 (0%) |
xx_hash3_64 has much better performance than murmur_hash3_32 by using AVX2 instruction according their benchmark, and has state-of-art hash quality which is broadly integrated in many software. Reference: https://github.com/Cyan4973/xxHash --------- Signed-off-by: beans <jing.gao@outlook.com>
Hi, can this be backported to 3.1 branch, please? |
@mergify backport branch-3.1 |
✅ Backports have been created
|
xx_hash3_64 has much better performance than murmur_hash3_32 by using AVX2 instruction according their benchmark, and has state-of-art hash quality which is broadly integrated in many software. Reference: https://github.com/Cyan4973/xxHash --------- Signed-off-by: beans <jing.gao@outlook.com> (cherry picked from commit 058e08b)
xx_hash3_64 has much better performance than murmur_hash3_32 by using AVX2 instruction according their benchmark, and has state-of-art hash quality which is broadly integrated in many software. Reference: https://github.com/Cyan4973/xxHash --------- Signed-off-by: beans <jing.gao@outlook.com> (cherry picked from commit 058e08b)
xx_hash3_64 has much better performance than murmur_hash3_32 by using AVX2 instruction according their benchmark, and has state-of-art hash quality which is broadly integrated in many software.
Reference: https://github.com/Cyan4973/xxHash
What type of PR is this:
Checklist:
Bugfix cherry-pick branch check: