### Describe the bug
Running a query like this:
select sum(distinct x), max(distinct x) from t group by x;
generates an internal error:
caused by
single_distinct_aggregation_to_group_by
caused by
Internal error: Failed due to generate a different schema
### To Reproduce
❯ create table t(x int) as values (1), (2), (1);
❯ select sum(distinct x), max(distinct x) from t group by x;
Optimizer rule 'single_distinct_aggregation_to_group_by' failed
caused by
single_distinct_aggregation_to_group_by
caused by
Internal error: Failed due to generate a different schema, original schema: DFSchema { fields: [DFField { qualifier: None, field: Field { name: "SUM(DISTINCT t.x)", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} } }, DFField { qualifier: None, field: Field { name: "MAX(DISTINCT t.x)", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} } }], metadata: {}, functional_dependencies: FunctionalDependencies { deps: [] } }, new schema: DFSchema { fields: [DFField { qualifier: None, field: Field { name: "SUM(DISTINCT t.x)", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} } }, DFField { qualifier: None, field: Field { name: "MAX(DISTINCT t.x)", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} } }], metadata: {}, functional_dependencies: FunctionalDependencies { deps: [] } }.
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
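The two schemas in the message are long and differ in only one place. As a reading aid, here is a minimal Python sketch that compares them. The field names and data types are copied from the error above; `diff_schemas` is a hypothetical helper written for this illustration and is not part of DataFusion.

```python
# (name, data_type) pairs copied from the error message above.
original = [("SUM(DISTINCT t.x)", "Int64"), ("MAX(DISTINCT t.x)", "Int32")]
rewritten = [("SUM(DISTINCT t.x)", "Int64"), ("MAX(DISTINCT t.x)", "Int64")]


def diff_schemas(before, after):
    """Return (name, before_type, after_type) for each field whose type changed.

    Hypothetical helper for illustration only; assumes both schemas list
    the same fields in the same order, as they do in the message above.
    """
    return [
        (name, old, new)
        for (name, old), (_, new) in zip(before, after)
        if old != new
    ]


mismatches = diff_schemas(original, rewritten)
for name, old, new in mismatches:
    print(f"{name}: {old} -> {new}")
```

The only difference is `MAX(DISTINCT t.x)`, which is `Int32` in the original schema and `Int64` after `single_distinct_aggregation_to_group_by` has rewritten the plan.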
### Expected behavior
The query should run successfully and return a result instead of an internal error.
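For reference, the values such a query should produce can be checked against another engine. The sketch below uses Python's built-in `sqlite3` module with the same table and data as the reproducer. SQLite is only a stand-in here: its type system differs from DataFusion's, and the `order by x` clause is an addition of this sketch to make the row order deterministic.

```python
import sqlite3

# Same table and rows as the reproducer, loaded into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("create table t(x int)")
conn.executemany("insert into t values (?)", [(1,), (2,), (1,)])

# Same query as in the issue; `order by x` is added only for a stable row order.
rows = conn.execute(
    "select sum(distinct x), max(distinct x) from t group by x order by x"
).fetchall()
conn.close()

print(rows)  # one row per group: [(1, 1), (2, 2)]
```

Because the query groups by `x`, each group holds a single distinct value, so both aggregates return that value.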
### Additional context
Found while looking at https://github.com/apache/arrow-datafusion/issues/7938