Support Binary --> String coercion for StringView/BinaryView in LIKE #12500

Closed
@alamb

Is your feature request related to a problem or challenge?

Part of #11752

While working on enabling StringView by default in #12092, I found another feature gap that shows up in the ClickBench benchmarks.

ClickBench hits_partitioned has a column that resolves to Binary (BinaryView after #12092) but is then treated as a String (compared to a string, etc.).

For this to work, DataFusion needs to know that it is OK to cast to a string type. It knows how to do this for Binary --> Utf8, but not for BinaryView --> Utf8View, etc. Without this, running ClickBench on hits_partitioned does not work.

A small example:

> create table foo as values (arrow_cast('one', 'BinaryView'), arrow_cast('two', 'BinaryView'));
0 row(s) fetched.
Elapsed 0.006 seconds.

> select column1 like 'o%' from foo;
type_coercion
caused by
Error during planning: There isn't a common type to coerce BinaryView and Utf8 in LIKE expression

Describe the solution you'd like

  1. Add coercion rules for BinaryView --> Utf8/Utf8View
  2. Add a test

Describe alternatives you've considered

Fix the relevant code here:

(Binary, Utf8) => Some(Utf8),
(Binary, LargeUtf8) => Some(LargeUtf8),
(LargeBinary, Utf8) => Some(LargeUtf8),
(LargeBinary, LargeUtf8) => Some(LargeUtf8),
(Utf8, Binary) => Some(Utf8),
(Utf8, LargeBinary) => Some(LargeUtf8),
(LargeUtf8, Binary) => Some(LargeUtf8),
(LargeUtf8, LargeBinary) => Some(LargeUtf8),
_ => None,

(you can see what I had to do on #12092)
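As a rough sketch of what the extra rules could look like, the snippet below repeats the match arms quoted above and adds arms for BinaryView. It is not the DataFusion implementation: the `DataType` enum is a local stand-in for the arrow type so the example compiles on its own, the function name `binary_to_string_coercion_sketch` is hypothetical, and resolving every BinaryView pairing to Utf8View is an assumption (the issue only asks for "BinaryView --> Utf8/Utf8View").

```rust
// Local stand-in for the few arrow `DataType` variants that matter here
// (an assumption for illustration; the real code uses arrow's `DataType`).
#[derive(Debug, Clone, PartialEq, Eq)]
enum DataType {
    Binary,
    LargeBinary,
    BinaryView,
    Utf8,
    LargeUtf8,
    Utf8View,
}

/// Hypothetical helper: picks the string type that a binary/string pair
/// should be coerced to, or `None` if there is no rule for the pair.
fn binary_to_string_coercion_sketch(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    use DataType::*;
    match (lhs, rhs) {
        // Existing rules, copied from the arms quoted in this issue.
        (Binary, Utf8) => Some(Utf8),
        (Binary, LargeUtf8) => Some(LargeUtf8),
        (LargeBinary, Utf8) => Some(LargeUtf8),
        (LargeBinary, LargeUtf8) => Some(LargeUtf8),
        (Utf8, Binary) => Some(Utf8),
        (Utf8, LargeBinary) => Some(LargeUtf8),
        (LargeUtf8, Binary) => Some(LargeUtf8),
        (LargeUtf8, LargeBinary) => Some(LargeUtf8),
        // Assumed new rules: any BinaryView pairing resolves to Utf8View.
        (BinaryView, Utf8) | (Utf8, BinaryView) => Some(Utf8View),
        (BinaryView, Utf8View) | (Utf8View, BinaryView) => Some(Utf8View),
        _ => None,
    }
}
```

With rules along these lines, the `BinaryView` / `Utf8` pair from the failing `LIKE` example above would have a common type instead of falling through to `None`.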

Add a test in sqllogictest

Perhaps in this file:

# LIKE
query ?
SELECT binary FROM t where binary LIKE '%F%';
----
466f6f
466f6f426172

Additional context

No response
