COUNT(DISTINCT) on StringView panics: unreachable code: Utf8/Binary should use ArrowBytesSet #11767

Closed
@alamb

Describe the bug
I found that one of the ClickBench extended queries panics when using StringView (see #11723).

To Reproduce

> create table foo (id int, x varchar, y varchar) as values (1, 'foo', 'bar'), (2, 'foo', 'baz');
0 row(s) fetched.
Elapsed 0.004 seconds.

> create view foov as select id, arrow_cast(x, 'Utf8View') as x, arrow_cast(y, 'Utf8View') as y from foo;
0 row(s) fetched.
Elapsed 0.002 seconds.

> select count(distinct x), count(distinct y) from foov group by id;
thread 'tokio-runtime-worker' panicked at /Users/andrewlamb/Software/datafusion2/datafusion/physical-expr-common/src/binary_view_map.rs:220:18:
internal error: entered unreachable code: Utf8/Binary should use `ArrowBytesSet`
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
External error: Join Error
caused by
External error: task 65 panicked

Expected behavior
The query should run without panicking.

Labels: bug
