Specialize single column primitive group values by tustvold · Pull Request #7043 · apache/datafusion

tustvold · 2023-07-20T18:25:23Z

Which issue does this PR close?

Part of #6969

Rationale for this change

--------------------
Benchmark tpch_mem.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃     main ┃ specialize-primitive-group-values ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 385.21ms │                          384.94ms │     no change │
│ QQuery 2     │ 102.04ms │                           91.22ms │ +1.12x faster │
│ QQuery 3     │ 104.50ms │                          108.36ms │     no change │
│ QQuery 4     │  69.37ms │                           72.72ms │     no change │
│ QQuery 5     │ 234.38ms │                          248.42ms │  1.06x slower │
│ QQuery 6     │  28.62ms │                           28.20ms │     no change │
│ QQuery 7     │ 550.32ms │                          577.73ms │     no change │
│ QQuery 8     │ 160.42ms │                          153.16ms │     no change │
│ QQuery 9     │ 348.10ms │                          350.41ms │     no change │
│ QQuery 10    │ 201.04ms │                          205.85ms │     no change │
│ QQuery 11    │  95.55ms │                          100.06ms │     no change │
│ QQuery 12    │ 112.20ms │                          112.50ms │     no change │
│ QQuery 13    │ 199.70ms │                          163.78ms │ +1.22x faster │
│ QQuery 14    │  30.56ms │                           31.35ms │     no change │
│ QQuery 15    │  34.63ms │                           30.74ms │ +1.13x faster │
│ QQuery 16    │ 104.27ms │                          107.38ms │     no change │
│ QQuery 17    │ 582.36ms │                          487.69ms │ +1.19x faster │
│ QQuery 18    │ 994.82ms │                          900.53ms │ +1.10x faster │
│ QQuery 19    │ 112.26ms │                          109.53ms │     no change │
│ QQuery 20    │ 204.93ms │                          212.06ms │     no change │
│ QQuery 21    │ 695.60ms │                          690.43ms │     no change │
│ QQuery 22    │  55.44ms │                           55.59ms │     no change │
└──────────────┴──────────┴───────────────────────────────────┴───────────────┘

The slower results appear to be noise; at least, they don't reproduce consistently.
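For reading the "Change" column, here is a minimal sketch of how such a label could be derived from the two timings. The function name `change_label` and the 5% "no change" band are assumptions inferred from the rows above (for example, Query 7 at roughly 1.05x is reported as "no change" while Query 5 at 1.06x is reported as slower); this is not the benchmark script's actual code.

```rust
/// Hypothetical helper: label a timing comparison the way the table above does.
/// `baseline_ms` is the time on main, `comparison_ms` the time on the branch.
/// The 5% tolerance band is an assumption inferred from the table.
pub fn change_label(baseline_ms: f64, comparison_ms: f64) -> String {
    let ratio = comparison_ms / baseline_ms;
    if ratio > 0.95 && ratio <= 1.05 {
        // Within the assumed noise band
        "no change".to_string()
    } else if ratio < 1.0 {
        // The branch is faster: report the speedup factor
        format!("+{:.2}x faster", 1.0 / ratio)
    } else {
        // The branch is slower: report the slowdown factor
        format!("{:.2}x slower", ratio)
    }
}
```

Under these assumptions, `change_label(102.04, 91.22)` yields the Query 2 label and `change_label(234.38, 248.42)` yields the Query 5 label.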

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

tustvold · 2023-07-20T18:26:09Z

datafusion/core/src/physical_plan/aggregates/row_hash.rs

-}
-
-/// A [`GroupValues`] making use of [`Rows`]
-struct GroupValuesRows {


This is moved into a new module

Dandandan · 2023-07-20T19:22:41Z

datafusion/core/src/physical_plan/aggregates/group_values/primitive.rs

+                    let hash = key.hash(state);
+                    let insert = self.map.find_or_find_insert_slot(
+                        hash,
+                        |g| unsafe { self.values.get_unchecked(*g).is_eq(key) },


this is awesome 🚀
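To illustrate the lookup-or-insert pattern in the snippet above, here is a simplified sketch. It uses the standard library's `HashMap<i64, usize>` rather than hashbrown's `RawTable<usize>`, fixes the key type to `i64`, and the names `PrimitiveGroups` and `intern` are hypothetical; it only shows how each distinct value (and the null value) is assigned a stable group index.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for single-column primitive group values.
/// Assumption: std's HashMap replaces the raw hash table used in the PR.
#[derive(Debug, Default)]
pub struct PrimitiveGroups {
    /// Maps a non-null value to its group index
    map: HashMap<i64, usize>,
    /// The group index of the null value, if one has been seen
    null_group: Option<usize>,
    /// Group values in group-index order; `None` marks the null group
    values: Vec<Option<i64>>,
}

impl PrimitiveGroups {
    /// Returns one group index per input value, creating new groups as needed.
    pub fn intern(&mut self, input: &[Option<i64>]) -> Vec<usize> {
        let mut groups = Vec::with_capacity(input.len());
        for v in input {
            let group = match v {
                // Nulls share a single group, created on first sight
                None => match self.null_group {
                    Some(g) => g,
                    None => {
                        let g = self.values.len();
                        self.values.push(None);
                        self.null_group = Some(g);
                        g
                    }
                },
                // Non-null values: find the existing group or append a new one
                Some(key) => {
                    let values = &mut self.values;
                    *self.map.entry(*key).or_insert_with(|| {
                        let g = values.len();
                        values.push(Some(*key));
                        g
                    })
                }
            };
            groups.push(group);
        }
        groups
    }
}
```

The real implementation stores only group indices in the table and compares against the values vector, which avoids keeping a second copy of each key; the sketch trades that away for brevity.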

Dandandan · 2023-07-20T19:29:10Z

datafusion/core/src/physical_plan/aggregates/group_values/primitive.rs

+                            Ok(v) => *v.as_ref(),
+                            Err(slot) => {
+                                let g = self.values.len();
+                                self.map.insert_in_slot(hash, slot, g);


I think this should still track the allocated memory (like insert_accounted did?)

It is accounted in the size method
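As a rough illustration of accounting memory in a size method instead of on every insert, the sketch below derives the estimate from the containers' capacities on demand. The `Groups` struct and its field types are hypothetical, and the per-entry cost used for the map is an approximation, not the formula used in the PR.

```rust
use std::collections::HashMap;
use std::mem::size_of;

/// Hypothetical container pairing a group-index map with the group values.
pub struct Groups {
    pub map: HashMap<i64, usize>,
    pub values: Vec<i64>,
}

impl Groups {
    /// Estimated heap bytes, computed from capacities when asked for
    /// rather than tracked incrementally at each insertion.
    pub fn size(&self) -> usize {
        // Approximation: one (key, group index) pair per map slot
        self.map.capacity() * size_of::<(i64, usize)>()
            + self.values.capacity() * size_of::<i64>()
    }
}
```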

alamb · 2023-07-20T22:00:45Z

I just pushed a fix for the CI failure. I am now running benchmarks on this branch to confirm.

I expect this change may make a substantial difference on some of the ClickBench results as well, but I don't have them automated quite yet

alamb · 2023-07-20T22:29:26Z

My benchmark results are similar. I plan to review this PR carefully tomorrow morning

FYI @yahoNanJing

--------------------
Benchmark tpch_mem.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃ main_base ┃ specialize-primitive-group-values ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │  533.98ms │                          539.33ms │     no change │
│ QQuery 2     │  144.79ms │                          128.73ms │ +1.12x faster │
│ QQuery 3     │  154.18ms │                          151.19ms │     no change │
│ QQuery 4     │  108.38ms │                          113.34ms │     no change │
│ QQuery 5     │  339.26ms │                          345.13ms │     no change │
│ QQuery 6     │   37.56ms │                           38.62ms │     no change │
│ QQuery 7     │  810.72ms │                          762.61ms │ +1.06x faster │
│ QQuery 8     │  222.49ms │                          233.41ms │     no change │
│ QQuery 9     │  505.28ms │                          511.36ms │     no change │
│ QQuery 10    │  306.53ms │                          291.01ms │ +1.05x faster │
│ QQuery 11    │  149.93ms │                          146.22ms │     no change │
│ QQuery 12    │  165.08ms │                          160.02ms │     no change │
│ QQuery 13    │  284.97ms │                          241.23ms │ +1.18x faster │
│ QQuery 14    │   44.42ms │                           43.16ms │     no change │
│ QQuery 15    │   49.69ms │                           41.66ms │ +1.19x faster │
│ QQuery 16    │  155.59ms │                          150.90ms │     no change │
│ QQuery 17    │  752.18ms │                          723.37ms │     no change │
│ QQuery 18    │ 1420.88ms │                         1390.54ms │     no change │
│ QQuery 19    │  161.97ms │                          163.11ms │     no change │
│ QQuery 20    │  299.46ms │                          289.71ms │     no change │
│ QQuery 21    │ 1033.28ms │                          952.11ms │ +1.09x faster │
│ QQuery 22    │   82.75ms │                           83.04ms │     no change │
└──────────────┴───────────┴───────────────────────────────────┴───────────────┘

Dandandan · 2023-07-21T09:16:29Z

datafusion/core/src/physical_plan/aggregates/group_values/mod.rs

+            };
+        }
+
+        // TODO: More primitives


Is this still relevant?

alamb

Thank you @tustvold -- this PR is awesome ❤️ I will rerun the clickbench numbers with this PR once it is merged.

I left a bunch of "improve the comment" type suggestions which I think would add value but are optional and can be done as a follow on PR (I will do them if you choose not to).

I do think we should add hash collision testing prior to merge, which I left a comment about

alamb · 2023-07-21T10:20:42Z

datafusion/core/src/physical_plan/aggregates/group_values/primitive.rs

+    /// Stores the group index based on the hash of its value
+    map: RawTable<usize>,
+    /// The group index of the null value if any
+    null_group: Option<usize>,


alamb · 2023-07-21T10:24:38Z

datafusion/core/src/physical_plan/aggregates/group_values/primitive.rs

+            }
+            EmitTo::First(n) => {
+                // SAFETY: self.map outlives iterator and is not modified concurrently
+                unsafe {


I think this code is largely replicated from the row version. I wonder if it could be refactored into a (templated) common function (with appropriate documentation)?

There isn't an easy way to make this generic, as one stores tuples and the other doesn't... I at least can't see a way that doesn't just obfuscate the code
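For context on what emitting the first `n` groups involves, here is a minimal sketch under simplifying assumptions: a std `HashMap<i64, usize>` stands in for the raw table, and `emit_first` is a hypothetical free function. Entries for emitted groups are dropped, and the remaining group indices are shifted down by `n` so they stay aligned with the values that are kept.

```rust
use std::collections::HashMap;

/// Hypothetical sketch: emit the first `n` groups and renumber the rest.
/// Returns the emitted group values in group-index order.
pub fn emit_first(
    map: &mut HashMap<i64, usize>,
    values: &mut Vec<i64>,
    n: usize,
) -> Vec<i64> {
    let n = n.min(values.len());
    // Remove map entries for emitted groups; shift the others down by n
    map.retain(|_, group| {
        if *group < n {
            false
        } else {
            *group -= n;
            true
        }
    });
    // Keep the tail as the new group values and hand back the prefix
    let tail = values.split_off(n);
    std::mem::replace(values, tail)
}
```

No `unsafe` is needed here because `retain` hands out mutable access to each entry; the code under review works on raw buckets instead, which is why it carries a SAFETY comment.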

alamb · 2023-07-21T10:29:52Z

datafusion/core/src/physical_plan/aggregates/group_values/primitive.rs

+use hashbrown::raw::RawTable;
+use std::sync::Arc;
+
+/// A trait to allow hashing of floating point numbers


Can you explain why this doesn't use create_hashes? And perhaps add comments in the code about the rationale?

If it is important not to use create_hashes, I recommend:

1. Move this code to hash_utils.rs so it is easier to find
2. Implement the force_hash_collisions version (that always hashes things to 0) to ensure appropriate coverage

Here is an example of force_hash_collisions
https://github.com/apache/arrow-datafusion/blob/368f6e606a3cfca8e04638b8d5ff0ff116a20b57/datafusion/physical-expr/src/hash_utils.rs#L214-L224

We can't use create_hashes as we are generating the hashes from the native values, not an array
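A minimal sketch of hashing native values directly, including floats, might look like the following. The trait name `HashValue`, the helper `hash_one`, and the `FORCE_HASH_COLLISIONS` constant (standing in for a cargo feature) are all hypothetical, and std's `DefaultHasher` is an assumption in place of whatever hasher the PR uses. Floats do not implement `Hash` in the standard library, so the sketch hashes their bit patterns.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical switch standing in for a `force_hash_collisions` feature:
/// when true, every value hashes to 0 so collision handling gets exercised.
pub const FORCE_HASH_COLLISIONS: bool = false;

/// Hypothetical trait: hash a single native value without building an array.
pub trait HashValue {
    fn hash_value(&self) -> u64;
}

fn hash_one<T: Hash>(v: T) -> u64 {
    if FORCE_HASH_COLLISIONS {
        return 0;
    }
    let mut hasher = DefaultHasher::new();
    v.hash(&mut hasher);
    hasher.finish()
}

impl HashValue for i64 {
    fn hash_value(&self) -> u64 {
        hash_one(*self)
    }
}

// Floats are hashed by bit pattern: identical bits give identical hashes,
// while 0.0 and -0.0 are treated as distinct inputs.
impl HashValue for f32 {
    fn hash_value(&self) -> u64 {
        hash_one(self.to_bits())
    }
}

impl HashValue for f64 {
    fn hash_value(&self) -> u64 {
        hash_one(self.to_bits())
    }
}
```

Whatever hash is used, equality in the table must be defined consistently with it; hashing by bits pairs naturally with comparing by bits.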

alamb · 2023-07-21T10:33:23Z

datafusion/core/src/physical_plan/aggregates/group_values/row.rs

+// specific language governing permissions and limitations
+// under the License.
+
+use crate::physical_plan::aggregates::group_values::GroupValues;


this code is just moved, right?

alamb · 2023-07-21T10:34:28Z

datafusion/core/src/physical_plan/aggregates/group_values/primitive.rs

+
+hash_float!(f16, f32, f64);
+
+/// A [`GroupValues`] storing raw primitive values


Suggested change

-/// A [`GroupValues`] storing raw primitive values
+/// A [`GroupValues`] storing a single column of raw primitive values
+///
+/// This specialization is significantly faster than using the more general
+/// purpose `Row`s format

alamb · 2023-07-21T12:10:51Z

FYI @yahoNanJing -- I think this PR will make it even easier to evaluate the improvements that a fixed-width Arrow row format would provide (we can add a specialized GroupValues under the right circumstances)

alamb · 2023-07-21T13:38:44Z

I saw some strange results when running with this branch on my clickbench testing. Please stand by

alamb · 2023-07-21T15:19:06Z

TLDR is I think this PR is good to go from a performance perspective. I don't see massive gains but I do see small improvements

tustvold added 5 commits July 20, 2023 11:35

Specialize primitive group values

056fd7b

Split module

a7f6a33

RawTable

98384dc

Support all primitives

b5e8391

Add docs

d015363

github-actions bot added the core Core DataFusion crate label Jul 20, 2023

Update datafusion-cli cargo lock

3750df6

alamb changed the title ~~Specialize primitive group values~~ Specialize single column primitive group values Jul 20, 2023

Make Cargo.toml order 'just so'

6654876

Dandandan approved these changes Jul 21, 2023

alamb approved these changes Jul 21, 2023

alamb mentioned this pull request Jul 21, 2023

Optimize hash_aggregate when there are no null group keys #850

Closed

Review feedback

daf408c

tustvold merged commit 77fafb9 into apache:main Jul 21, 2023

alamb mentioned this pull request Jul 24, 2023

Improve aggregate performance with specialized groups accumulator for single string group by #7064

Closed

tustvold mentioned this pull request Sep 10, 2023

[Website] Aggregating Millions of Groups Fast in Apache Arrow DataFusion 28.0.0 apache/arrow-site#386

Merged

