Lazily compute the null count of `Array` to reduce cpu cost in `high cardinality aggr`

**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

When `group by` columns are of very high cardinality, like clickbench q32, the computation of null count in `slice` function has the non-trivial cpu cost actually(see flamegraph in additional context).
Actually, the null count in `arrow-cpp` will be computed lazily in the scenario that we need to scan the null buffer to get it (such as slice case).
https://github.com/apache/arrow/blob/187197c369058f7d1377c1b161c469a9e4542caf/cpp/src/arrow/array/data.cc#L206-L218

Should `arrow-rust` make `null count` a lazy computation to reduce the cost of `slice`, too?


**Describe the solution you'd like**

Make the `null_count` in `NullBuffer` an `AtomicI64` like `arrow-cpp`.
But I worry other related function calling `null_count` will suffer the performance regression due to introducing the atomic...

**Describe alternatives you've considered**

Maybe we can optimize the `slice()` function in `NullBuffer` only, we check the inputs and found low cost to calculate the new `null_count`. 
For example,
- original offset:0, original len: 10
- new offset: 0, new len: 9
We calculate the `null_count in [9, 10)` and the needed `null_count in [0, 9)` = `original null count` - `null_count in [9, 10)`.

**Additional context**

![flame](https://raw.githubusercontent.com/Rachelint/drawio-store/main/dperf.072801.svg)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Lazily compute the null count of `Array` to reduce cpu cost in `high cardinality aggr` #6146

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Lazily compute the null count of Array to reduce cpu cost in high cardinality aggr #6146

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Lazily compute the null count of `Array` to reduce cpu cost in `high cardinality aggr` #6146