Panic with queries with multiple COUNT DISTINCT aggregates on dictionary values, and a group by #7938

@alamb

Describe the bug

DataFusion panics when running a query like

select count(distinct column1), count(distinct column2) from test group by column1;

where column1 and column2 are Dictionary columns.

To Reproduce

❯ create table test as values (1, arrow_cast('foo', 'Dictionary(Int32, Utf8)')), (2, arrow_cast('bar', 'Dictionary(Int32, Utf8)'));
0 rows in set. Query took 0.002 seconds.

❯ select * from test;
+---------+---------+
| column1 | column2 |
+---------+---------+
| 1       | foo     |
| 2       | bar     |
+---------+---------+
2 rows in set. Query took 0.001 seconds.

❯ select count(distinct column1), count(distinct column2) from test group by column1;
thread 'tokio-runtime-worker' panicked at /Users/alamb/Software/arrow-datafusion/datafusion/common/src/scalar.rs:1846:18:
Unsupported data type Dictionary(Int32, Utf8) for ScalarValue::list_to_array
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
External error: Join Error
caused by
External error: task 66 panicked
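
To confirm which columns of the table are actually dictionary-encoded, the Arrow type of each column can be inspected. This is a minimal sketch, assuming the `arrow_typeof` function is available in the DataFusion build being tested and that the `test` table from the steps above already exists:

```sql
-- Report the Arrow data type of each column in the `test` table.
-- `limit 1` is enough because the type is the same for every row.
select arrow_typeof(column1), arrow_typeof(column2) from test limit 1;
```

A column created with `arrow_cast(..., 'Dictionary(Int32, Utf8)')` should be reported with that dictionary type, which is the type named in the panic message.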

Expected behavior

The query should produce the same result as when the values are cast to strings:

❯ select count(distinct column1::varchar), count(distinct column2::varchar) from test group by column1;
+------------------------------+------------------------------+
| COUNT(DISTINCT test.column1) | COUNT(DISTINCT test.column2) |
+------------------------------+------------------------------+
| 1                            | 1                            |
| 1                            | 1                            |
+------------------------------+------------------------------+

Additional context

I believe this is a regression introduced in #7629 (not yet released)

We found this downstream in IOx

Labels: bug