Closed
Description
Describe the bug
When translating aggregate expressions to DataFusion, we ignore whether the aggregate is distinct, which produces incorrect results.
The existing tests seem to pass because the input data does not contain duplicates.
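To illustrate why duplicate-free input hides the problem, here is a minimal standalone sketch in plain Python. It is not the project's translation code; `count_per_group` and the sample rows are hypothetical names made up for this example. It only shows that a distinct count and a plain count agree when a group has no repeated values, and diverge as soon as it does.

```python
from collections import defaultdict


def count_per_group(rows, distinct):
    """Count values per group key (hypothetical helper, for illustration only).

    When `distinct` is True, duplicate values within a group are counted once,
    mirroring COUNT(DISTINCT col); otherwise every row is counted, like COUNT(col).
    """
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {k: len(set(v)) if distinct else len(v) for k, v in groups.items()}


# No repeated values within a group: both variants return the same counts,
# so dropping the distinct flag would go unnoticed.
unique_rows = [(1, 10), (1, 11), (2, 20)]

# A repeated value within group 1: the two variants now disagree.
dup_rows = [(1, 10), (1, 10), (2, 20)]

print(count_per_group(unique_rows, distinct=True), count_per_group(unique_rows, distinct=False))
print(count_per_group(dup_rows, distinct=True), count_per_group(dup_rows, distinct=False))
```

Under that assumption, a test only exercises the distinct path if its aggregated column contains repeated values within at least one group.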
Steps to reproduce
Modify the test "single group-by column + aggregate column, multiple batches, no null" to use COUNT(DISTINCT _2) instead of COUNT(DISTINCT _1). The test then fails because the results do not match Spark.
Expected behavior
No response
Additional context
No response
Metadata
Labels
bug (Something isn't working)