Improve aggregate performance with specialized groups accumulator for single string group by

### Is your feature request related to a problem or challenge?

DataFusion could be made faster for queries that `GROUP BY` a single string column.

For example, ClickBench Q34:

```sql
-- Q34
SELECT "URL", COUNT(*) AS c FROM hits GROUP BY "URL" ORDER BY c DESC LIMIT 10;
```

You can run this query from a DataFusion checkout like this (using the code in https://github.com/apache/arrow-datafusion/pull/7060, which hopefully will be merged shortly):

```shell
# get data
./benchmarks/bench.sh data clickbench_1
# run benchmark
cargo run --release  --bin dfbench -- clickbench --query 34
```

Here is the profile from running on 16 cores:

```shell
cargo run --release  --bin dfbench -- clickbench --query 34 --iterations 10 --partitions 16
```

<img width="1790" alt="Screenshot 2023-07-24 at 7 23 09 AM" src="https://github.com/apache/arrow-datafusion/assets/490673/94d71ae7-afc1-4ec9-b9a0-19168c12dc9b">

### Describe the solution you'd like

I would like a special-cased `GroupValues` implementation for the case of a single string column (hopefully Utf8, LargeUtf8, Binary, and LargeBinary) that:
1. Does no allocations per group (aka stores all strings in some single contiguous location)
2. Avoids the Row format / copy of values
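
To make the two requirements above concrete, here is a minimal sketch of one possible layout. It is not DataFusion code: `StringGroupValues`, `intern`, and `count_by_group` are hypothetical names, and the hasher and linear-probing table are placeholders chosen only to keep the example self-contained. All group values live in one contiguous byte buffer addressed by offsets (the same idea as an Arrow string array), so a new group appends bytes rather than allocating its own `String`, and values are compared in place without converting to the Row format.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Marker for an unused hash table slot.
const EMPTY: usize = usize::MAX;

/// Hypothetical sketch: maps each distinct byte string to a dense group index.
/// Group `i` is stored at `buffer[offsets[i]..offsets[i + 1]]`.
pub struct StringGroupValues {
    buffer: Vec<u8>,     // all group values, back to back
    offsets: Vec<usize>, // always holds `len() + 1` entries, starting with 0
    hashes: Vec<u64>,    // hash of each group, reused when the table grows
    slots: Vec<usize>,   // open-addressing table of group indices
}

fn hash_bytes(value: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(value);
    hasher.finish()
}

impl StringGroupValues {
    pub fn new() -> Self {
        Self {
            buffer: Vec::new(),
            offsets: vec![0],
            hashes: Vec::new(),
            slots: vec![EMPTY; 16], // table size stays a power of two
        }
    }

    /// Number of distinct groups seen so far.
    pub fn len(&self) -> usize {
        self.hashes.len()
    }

    /// Bytes of the given group, borrowed from the shared buffer.
    pub fn value(&self, group: usize) -> &[u8] {
        &self.buffer[self.offsets[group]..self.offsets[group + 1]]
    }

    /// Returns the group index for `value`, appending a new group if needed.
    pub fn intern(&mut self, value: &[u8]) -> usize {
        // Keep the load factor at or below 1/2 so probing always terminates.
        if (self.len() + 1) * 2 > self.slots.len() {
            self.grow();
        }
        let hash = hash_bytes(value);
        let mask = self.slots.len() - 1;
        let mut pos = (hash as usize) & mask;
        loop {
            let group = self.slots[pos];
            if group == EMPTY {
                // New group: copy the bytes once into the shared buffer.
                let group = self.len();
                self.buffer.extend_from_slice(value);
                self.offsets.push(self.buffer.len());
                self.hashes.push(hash);
                self.slots[pos] = group;
                return group;
            }
            if self.hashes[group] == hash && self.value(group) == value {
                return group;
            }
            pos = (pos + 1) & mask; // linear probing
        }
    }

    /// Doubles the table and reinserts group indices using the stored hashes.
    fn grow(&mut self) {
        let new_len = self.slots.len() * 2;
        let mask = new_len - 1;
        let mut slots = vec![EMPTY; new_len];
        for (group, &hash) in self.hashes.iter().enumerate() {
            let mut pos = (hash as usize) & mask;
            while slots[pos] != EMPTY {
                pos = (pos + 1) & mask;
            }
            slots[pos] = group;
        }
        self.slots = slots;
    }
}

/// `COUNT(*) ... GROUP BY` over a slice of strings, in first-seen group order.
pub fn count_by_group(values: &[&str]) -> Vec<(String, u64)> {
    let mut groups = StringGroupValues::new();
    let mut counts: Vec<u64> = Vec::new();
    for v in values {
        let g = groups.intern(v.as_bytes());
        if g == counts.len() {
            counts.push(0);
        }
        counts[g] += 1;
    }
    (0..groups.len())
        .map(|g| (String::from_utf8_lossy(groups.value(g)).into_owned(), counts[g]))
        .collect()
}
```

The buffers still reallocate as they grow, but that cost is amortized over many groups instead of being paid once per group. A real implementation would presumably emit `buffer` and `offsets` directly as the output array rather than copying into `String`s as `count_by_group` does.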

Other ideas that could make this faster:
1. Small String optimization
2. Special-case ASCII (to avoid UTF-8 checks for data that, like TPCH, contains only ASCII characters)

"Small String optimization" refers to the format described in the [umbra paper](https://db.in.tum.de/~freitag/papers/p29-neumann-cidr20.pdf):

<img width="546" alt="Screenshot 2023-07-24 at 6 38 01 AM" src="https://github.com/apache/arrow-datafusion/assets/490673/967c1956-85e4-46b7-ac75-75620aaa99f5">

This would have to be adapted for Rust / safety, but the same general idea applies: inline the first few bytes of the string into the hash table for quick "is it equal" comparisons, and keep an offset to an external area for larger strings.
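
As a rough illustration of that idea in safe Rust, the sketch below uses a 16-byte entry: a 4-byte length, a 4-byte prefix, and 8 more bytes that hold either the rest of a short string (up to 12 bytes in total) or an offset into an external buffer. The names `StringView`, `INLINE_LEN`, and `matches` are hypothetical, and the exact field layout is my reading of the paper's format with the pointer swapped for an offset, not an existing DataFusion or Arrow type.

```rust
/// Strings up to this many bytes are stored entirely inside the entry.
const INLINE_LEN: usize = 12;

/// Hypothetical 16-byte hash table entry for one string value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct StringView {
    /// Length of the string in bytes.
    len: u32,
    /// First 4 bytes of the string (zero padded if shorter).
    prefix: [u8; 4],
    /// If `len <= 12`: bytes 4..len of the string (zero padded).
    /// Otherwise: little-endian offset of the full string in the external buffer.
    rest: [u8; 8],
}

impl StringView {
    /// Builds an entry for `value`, appending to `external` only for long strings.
    pub fn new(value: &[u8], external: &mut Vec<u8>) -> Self {
        let len = u32::try_from(value.len()).expect("value longer than u32::MAX bytes");
        let mut prefix = [0u8; 4];
        let n = value.len().min(4);
        prefix[..n].copy_from_slice(&value[..n]);

        let mut rest = [0u8; 8];
        if value.len() <= INLINE_LEN {
            if value.len() > 4 {
                rest[..value.len() - 4].copy_from_slice(&value[4..]);
            }
        } else {
            let offset = external.len() as u64;
            external.extend_from_slice(value);
            rest = offset.to_le_bytes();
        }
        Self { len, prefix, rest }
    }

    /// Returns true if this entry holds exactly `value`.
    pub fn matches(&self, value: &[u8], external: &[u8]) -> bool {
        // Cheap rejections first: length and prefix live in the entry itself,
        // so most mismatches never touch the external buffer.
        if self.len as usize != value.len() {
            return false;
        }
        let n = value.len().min(4);
        if self.prefix[..n] != value[..n] {
            return false;
        }
        if value.len() <= INLINE_LEN {
            value.len() <= 4 || self.rest[..value.len() - 4] == value[4..]
        } else {
            let start = u64::from_le_bytes(self.rest) as usize;
            external[start..start + value.len()] == *value
        }
    }
}
```

Using an offset instead of a raw pointer keeps the code free of `unsafe` and stays valid when the external buffer reallocates, at the cost of one extra index computation for long strings.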

### Describe alternatives you've considered

_No response_

### Additional context

@tustvold's changes in https://github.com/apache/arrow-datafusion/issues/6969 and https://github.com/apache/arrow-datafusion/pull/7043 should make it very easy to code this up as a different `GroupValues` implementation.

