Add support for `COUNT(DISTINCT expr, expr1, ...)`

### What is the problem the feature request solves?

The expression `COUNT(DISTINCT expr)` is relatively common and is used in TPC-H, so it would be good to accelerate it in Comet.

Spark supports multiple expressions, e.g. `COUNT(DISTINCT a, b, c)`, but DataFusion does not, so we should only attempt to accelerate this when there is a single input expression.
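The single-expression rule above could be sketched as a small eligibility check. This is only an illustration: `AggFunc`, `AggCall`, and `can_accelerate` are hypothetical names invented for the sketch and are not Comet or DataFusion APIs.

```rust
/// Hypothetical aggregate function kinds (illustrative only).
#[derive(Debug, Clone, PartialEq)]
pub enum AggFunc {
    Count,
    Sum,
}

/// Hypothetical description of an aggregate call as a planner might see it.
#[derive(Debug, Clone)]
pub struct AggCall {
    pub func: AggFunc,
    pub distinct: bool,
    /// Input expressions, kept as plain strings for the sketch.
    pub args: Vec<String>,
}

/// Returns true when the call is eligible for native acceleration under
/// the rule proposed here: COUNT(DISTINCT ...) needs exactly one input.
pub fn can_accelerate(call: &AggCall) -> bool {
    match (&call.func, call.distinct) {
        // COUNT(DISTINCT a) is eligible; COUNT(DISTINCT a, b, c) is not.
        (AggFunc::Count, true) => call.args.len() == 1,
        // Assumption: other aggregates are handled elsewhere, so this
        // sketch does not reject them.
        _ => true,
    }
}
```

With this shape, `COUNT(DISTINCT a, b, c)` would be left to Spark, while `COUNT(DISTINCT a)` would be a candidate for the native path.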

Implementing this feature is not trivial because there are some design issues with how we currently support partial aggregates. Specifically, we do not report the correct output schema from the partial aggregate. For the aggregate expressions that we currently support this doesn't matter, because the partial and final aggregates have the same output type. For example, `SUM(int_column)` has the output type `long` for both partial and final. For `COUNT(DISTINCT int_column)`, the output of the partial is a **list** of int and the output of the final is a long.
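The schema mismatch described above can be illustrated with a minimal, self-contained sketch. It uses plain Rust collections rather than Arrow arrays, ignores NULL handling and overflow, and assumes a 64-bit running total for the sum. All function names are hypothetical and chosen only for this example.

```rust
use std::collections::HashSet;

/// Partial SUM over ints: a single running total.
pub fn sum_partial(batch: &[i32]) -> i64 {
    batch.iter().map(|&v| v as i64).sum()
}

/// Final SUM: combining partial totals yields a value of the same type
/// as each partial, so partial and final schemas match.
pub fn sum_final(partials: &[i64]) -> i64 {
    partials.iter().sum()
}

/// Partial COUNT(DISTINCT int): the distinct values seen in this batch,
/// emitted as a list (sorted here only to make the output deterministic).
pub fn count_distinct_partial(batch: &[i32]) -> Vec<i32> {
    let set: HashSet<i32> = batch.iter().copied().collect();
    let mut values: Vec<i32> = set.into_iter().collect();
    values.sort_unstable();
    values
}

/// Final COUNT(DISTINCT int): merge the partial lists, then count.
/// The output type (a long) differs from the partial's list type.
pub fn count_distinct_final(partials: &[Vec<i32>]) -> i64 {
    let merged: HashSet<i32> = partials.iter().flatten().copied().collect();
    merged.len() as i64
}
```

The partial has to carry the values themselves because per-partition counts cannot simply be added: two partitions holding `[1, 2, 3]` and `[3, 4]` contain 3 and 2 distinct values respectively, but the merged result is 4, not 5.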

### Describe the potential solution

_No response_

### Additional context

_No response_

Add support for `COUNT(DISTINCT expr, expr1, ...)` #2292
