ISSUE-4264: simd selection #4271

platoneko · 2022-02-28T03:47:32Z

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

Summary about this PR

Changelog

New Feature
Performance Improvement

Related Issues

Fixes #4264

Test Plan

Unit Tests

Stateless Tests

vercel · 2022-02-28T03:47:37Z

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/databend/databend/D3GjDedyAJFZxaNkcigR8m9bUV8t
✅ Preview: https://databend-git-fork-platoneko-simd-select-databend.vercel.app

[Deployment for bd08a7c canceled]

mergify · 2022-02-28T03:48:05Z

Thanks for the contribution!
I have applied any labels matching special text in your PR Changelog.

Please review the labels and make any necessary changes.

sundy-li · 2022-02-28T04:54:24Z

common/datavalues/src/lib.rs

@@ -18,6 +18,7 @@

 #![feature(generic_associated_types)]
 #![feature(trusted_len)]
+#![feature(core_intrinsics)]


Use https://doc.rust-lang.org/std/primitive.u32.html#method.trailing_zeros instead of std::intrinsics::cttz; it is stable, so the core_intrinsics feature is not needed.
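
For reference, a minimal sketch of the stable alternative, assuming a u64 mask as in the diff below. The helper name `set_bit_positions` is hypothetical and is not code from this PR; it only shows `trailing_zeros` standing in for `cttz`.

```rust
/// Collects the positions of the set bits in `mask`, lowest bit first.
/// Hypothetical helper for illustration only.
fn set_bit_positions(mut mask: u64) -> Vec<usize> {
    let mut positions = Vec::with_capacity(mask.count_ones() as usize);
    while mask != 0 {
        // Stable replacement for `std::intrinsics::cttz(mask)`.
        let n = mask.trailing_zeros() as usize;
        positions.push(n);
        // Clear the lowest set bit so the loop moves on to the next one.
        mask &= mask - 1;
    }
    positions
}
```

Calling `set_bit_positions(0b1010_0001)` walks bits 0, 5 and 7 in that order; the same method exists on every primitive integer type, including u64.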

sundy-li · 2022-02-28T04:56:17Z

This almost looks good to me.

I just posted a PR in arrow2. You can have a look, but you don't have to follow that PR's behavior.

sundy-li · 2022-02-28T04:57:18Z

common/datavalues/src/columns/primitive/mod.rs

+            } else {
+                while mask != 0 {
+                    let n = std::intrinsics::cttz(mask) as usize;
+                    res.push(values[offset + n]);


It may be better to use ptr.write instead of push for better performance.

I'd like to give it a try.
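
A rough sketch of what the ptr.write suggestion could look like, assuming one u64 mask per 64-value chunk as in the diff above. The function name `push_selected` and its signature are hypothetical and are not the code that was merged.

```rust
/// Appends `values[n]` to `out` for every set bit `n` of `mask`.
/// Hypothetical helper for illustration only.
fn push_selected<T: Copy>(values: &[T], mut mask: u64, out: &mut Vec<T>) {
    // Every set bit must index into `values`, otherwise the unchecked read is unsound.
    assert!(mask == 0 || (63 - mask.leading_zeros() as usize) < values.len());
    let selected = mask.count_ones() as usize;
    // Reserve once up front instead of checking capacity on every push.
    out.reserve(selected);
    unsafe {
        // SAFETY: `reserve` guarantees room for `selected` more elements,
        // and the assert above keeps every read inside `values`.
        let mut dst = out.as_mut_ptr().add(out.len());
        while mask != 0 {
            let n = mask.trailing_zeros() as usize;
            dst.write(*values.get_unchecked(n));
            dst = dst.add(1);
            // Clear the lowest set bit.
            mask &= mask - 1;
        }
        out.set_len(out.len() + selected);
    }
}
```

The point of the design is that the capacity check and length update happen once per mask rather than once per selected value; whether that pays off in practice should be measured.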

sundy-li · 2022-02-28T04:58:09Z

common/datavalues/src/columns/primitive/mod.rs

+        let values = self.values();
+
+        const MASK_BITS: usize = 64;
+        for mut mask in filter.values().chunks::<u64>() {


We should take care of the remainder left over after the full chunks.

I have checked that Bitmap::chunks::<u64>() won't yield a chunk shorter than 64 bits.

In my unit test, I set N = 1000 so that the column length is not a multiple of 64.
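
To make the remainder concern concrete, here is a small sketch using only std slices. It assumes the filter is a plain `&[bool]` rather than an arrow2 Bitmap, and `filter_values` is a hypothetical name. With N = 1000 there are 15 full chunks of 64 and a remainder of 40 values that the main loop never sees.

```rust
/// Keeps `values[i]` where `filter[i]` is true.
/// Hypothetical stand-in for the column filter, for illustration only.
fn filter_values(values: &[i64], filter: &[bool]) -> Vec<i64> {
    assert_eq!(values.len(), filter.len());
    const CHUNK_SIZE: usize = 64;
    let mut res = Vec::new();
    let mut value_chunks = values.chunks_exact(CHUNK_SIZE);
    let mut filter_chunks = filter.chunks_exact(CHUNK_SIZE);

    // Main loop: only full 64-element chunks are yielded here.
    for (vs, fs) in value_chunks.by_ref().zip(filter_chunks.by_ref()) {
        // Pack the 64 flags into one word, standing in for the bitmap's u64 chunk.
        let mut mask = fs
            .iter()
            .enumerate()
            .fold(0u64, |m, (i, &f)| m | ((f as u64) << i));
        while mask != 0 {
            let n = mask.trailing_zeros() as usize;
            res.push(vs[n]);
            mask &= mask - 1;
        }
    }

    // Tail loop: the fewer-than-64 leftover elements must be handled separately.
    for (v, f) in value_chunks.remainder().iter().zip(filter_chunks.remainder()) {
        if *f {
            res.push(*v);
        }
    }
    res
}
```

Dropping the tail loop would silently lose the last 40 rows for N = 1000, which is why a test length that is not a multiple of 64 is useful.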

platoneko · 2022-02-28T05:20:49Z

common/datavalues/src/columns/primitive/mod.rs

            .filter(|(_, f)| *f)
-            .map(|(v, _)| *v);
+        {
+            res.push(v);


The remainder of the chunks is processed here; maybe not elegant :(

It's not elegant: skip may cause extra iteration.

sundy-li · 2022-02-28T22:35:31Z

common/datavalues/src/columns/primitive/mod.rs

+
+        const CHUNK_SIZE: usize = 64;
+        let mut chunks = self.values().chunks_exact(CHUNK_SIZE);
+        let mut mask_chunks = filter.values().chunks::<u64>();


There is a faster path for mask_chunks: if the filter has no offset, it can use chunks_exact for better performance. See: https://github.com/jorgecarleitao/arrow2/blob/main/src/compute/filter.rs#L142-L147

There are two approaches:

Use generic dispatch, like BitChunkIterExact.

If offset > 0, use an extra loop to consume the offset (the header loop), then continue with the main loop and the tail loop.

The second one may perform better, because chunks.iter() must merge two u64 values to produce each merged u64; see merge_reversed in arrow2.
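
A sketch of the two ideas, under stated assumptions: the bitmap is modelled as a `&[u64]` with least-significant-bit-first layout, and `merged_word`, `get_bit` and `selected_indices` are hypothetical names. This is not arrow2's merge_reversed nor the code merged in this PR; it only illustrates why an offset forces a merge per word and how a header loop avoids it.

```rust
/// Approach 1 cost: with a bit offset, each logical word is stitched
/// together from two adjacent stored words (LSB-first layout assumed).
fn merged_word(current: u64, next: u64, offset: u32) -> u64 {
    assert!(offset < 64);
    if offset == 0 {
        current
    } else {
        (current >> offset) | (next << (64 - offset))
    }
}

/// Reads bit `i` of the stored words.
fn get_bit(words: &[u64], i: usize) -> bool {
    (words[i / 64] >> (i % 64)) & 1 == 1
}

/// Approach 2: header, main and tail loops over bits `offset..offset + len`.
/// Returns the selected positions relative to `offset`.
fn selected_indices(words: &[u64], offset: usize, len: usize) -> Vec<usize> {
    assert!(offset + len <= words.len() * 64);
    let end = offset + len;
    let mut res = Vec::new();
    let mut pos = offset;

    // Header loop: single bits until `pos` reaches a word boundary.
    while pos < end && pos % 64 != 0 {
        if get_bit(words, pos) {
            res.push(pos - offset);
        }
        pos += 1;
    }
    // Main loop: aligned whole words, so no merging is needed.
    while pos + 64 <= end {
        let mut mask = words[pos / 64];
        while mask != 0 {
            let n = mask.trailing_zeros() as usize;
            res.push(pos + n - offset);
            mask &= mask - 1;
        }
        pos += 64;
    }
    // Tail loop: the bits left after the last full word.
    while pos < end {
        if get_bit(words, pos) {
            res.push(pos - offset);
        }
        pos += 1;
    }
    res
}
```

Note that in a real column filter the values slice is indexed by the logical position, so after the header loop the value chunks have to be re-aligned to the same boundary.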

sundy-li · 2022-03-07T00:37:27Z

This seems to be roughly 25% faster (1.942 sec down to 1.486 sec on the query below):

Main:


MySQL [(none)]> select count() from numbers_mt(10000000000) where rand()  > 0.90;
+------------+
| count()    |
+------------+
| 1000011915 |
+------------+
1 row in set (1.942 sec)

This branch, after merging main (#4263):

MySQL [(none)]> select count() from numbers_mt(10000000000) where rand()  > 0.90;
+------------+
| count()    |
+------------+
| 1000013059 |
+------------+
1 row in set (1.486 sec)

sundy-li · 2022-03-07T00:42:51Z

@mergify update

mergify · 2022-03-07T00:43:27Z

update

✅ Branch has been successfully updated

Hey, I reacted but my real name is @Mergifyio

sundy-li · 2022-03-07T00:46:16Z

@platoneko LGTM now. Do you want this PR to be merged as is? Or would you rather continue and add this to BooleanColumn and StringColumn in this PR?

sundy-li · 2022-03-07T08:11:14Z

@mergify update

mergify · 2022-03-07T08:11:22Z

update

✅ Branch has been successfully updated

Hey, I reacted but my real name is @Mergifyio

platoneko · 2022-03-07T15:37:30Z

I want this pr to be merged :)

databend-bot added the need-review label Feb 28, 2022

vercel bot temporarily deployed to Preview February 28, 2022 03:47 Inactive

mergify bot added the pr-feature (this PR introduces a new feature to the codebase) and pr-performance labels Feb 28, 2022

simd selection

e599767

platoneko force-pushed the simd-select branch from 6aed4f4 to e599767 Compare February 28, 2022 03:53

vercel bot temporarily deployed to Preview February 28, 2022 03:53 Inactive

platoneko changed the title from "simd selection" to "ISSUE-4264: simd selection" Feb 28, 2022

sundy-li reviewed Feb 28, 2022

make clippy happy

9071bc3

sundy-li reviewed Feb 28, 2022

vercel bot temporarily deployed to Preview February 28, 2022 04:58 Inactive

platoneko commented Feb 28, 2022

directly copy via ptr

9b4ccf1

vercel bot temporarily deployed to Preview February 28, 2022 15:46 Inactive

sundy-li reviewed Feb 28, 2022

use extra loop to consume offset

43d5d5e

vercel bot temporarily deployed to Preview March 6, 2022 17:59 Inactive

Merge branch 'main' into simd-select

ad8060c

vercel bot deployed to Preview March 7, 2022 00:43 View deployment

Merge branch 'main' into simd-select

bd08a7c

vercel bot temporarily deployed to Preview March 7, 2022 08:11 Inactive

platoneko marked this pull request as ready for review March 7, 2022 17:04

sundy-li merged commit 3a66496 into databendlabs:main Mar 7, 2022
