Support pruning on starts_with #14027

@alamb

Is your feature request related to a problem or challenge?

@adriangb implemented PruningPredicate support for prefix matching LIKE / NOT LIKE in

However, it isn't currently supported for the starts_with function

Describe the solution you'd like

I would like predicate pruning to happen for the starts_with function as well

So queries like

select * from my_file where starts_with(col, 'http://')

Could also use starts_with to prune parquet files
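To illustrate why a prefix predicate can prune files at all, here is a minimal, self-contained sketch. It is not DataFusion's PruningPredicate code: the function name `may_contain_prefix` is hypothetical, and it assumes the file's min/max statistics are compared byte-wise. The idea is that truncating min and max to the prefix length keeps their ordering, so a file can be skipped when the prefix falls outside the truncated range.

```rust
/// Hypothetical helper (not part of DataFusion): decides whether a container
/// whose string column has statistics `min` and `max` could hold any value
/// that starts with `prefix`.
///
/// Returns `false` only when no value in [min, max] can start with `prefix`,
/// in which case the container can be pruned.
fn may_contain_prefix(min: &str, max: &str, prefix: &str) -> bool {
    let p = prefix.as_bytes();
    let n = p.len();

    // Truncate both bounds to at most the prefix length. Truncation to a
    // fixed byte length is monotone under lexicographic byte ordering.
    let lo = &min.as_bytes()[..min.len().min(n)];
    let hi = &max.as_bytes()[..max.len().min(n)];

    // A matching value may exist only if the prefix lies within the
    // truncated bounds (byte slices compare lexicographically).
    lo <= p && p <= hi
}
```

With this sketch, a file whose column ranges from `ftp://a` to `ftp://z` would be skipped for the prefix `http://`, while a file ranging from `ftp://a` to `https://z` would still have to be read.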

Describe alternatives you've considered

The challenge at the moment is that PruningPredicate can't refer directly to the function implementations

Given how well optimized LIKE is, one possible solution would be to change starts_with so that it is rewritten in terms of LIKE rather than just calling an arrow kernel

https://github.com/apache/datafusion/blob/main/datafusion/functions/src/string/starts_with.rs

So, for example, it could be rewritten into Expr::Like by implementing simplify:

https://docs.rs/datafusion/latest/datafusion/logical_expr/trait.ScalarUDFImpl.html#method.simplify

We could do something similar with ends_with as well
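The rewrite could look roughly like the sketch below. This is a toy model under stated assumptions, not the actual ScalarUDFImpl::simplify implementation: `ToyExpr`, `escape_like_literal`, and `simplify_starts_with` are hypothetical names, and it assumes backslash is the LIKE escape character. The point it shows is that the literal prefix has to be escaped before `%` is appended, otherwise a prefix containing `%` or `_` would change meaning.

```rust
/// Toy expression type for illustration only; this is NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum ToyExpr {
    Column(String),
    Literal(String),
    /// starts_with(input, prefix)
    StartsWith(Box<ToyExpr>, Box<ToyExpr>),
    /// input LIKE pattern
    Like { expr: Box<ToyExpr>, pattern: String },
}

/// Escapes LIKE wildcards so the literal only matches itself.
/// Assumption: backslash is the escape character.
fn escape_like_literal(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        if matches!(c, '%' | '_' | '\\') {
            out.push('\\');
        }
        out.push(c);
    }
    out
}

/// Rewrites starts_with(input, 'literal') into input LIKE 'literal%'.
/// Any other expression, including a non-literal prefix, is returned unchanged.
fn simplify_starts_with(expr: ToyExpr) -> ToyExpr {
    match expr {
        ToyExpr::StartsWith(input, prefix) => match *prefix {
            ToyExpr::Literal(p) => ToyExpr::Like {
                expr: input,
                pattern: format!("{}%", escape_like_literal(&p)),
            },
            other => ToyExpr::StartsWith(input, Box::new(other)),
        },
        other => other,
    }
}
```

In this model, starts_with(col, 'http://') becomes col LIKE 'http://%'. An ends_with rewrite would presumably place the `%` before the escaped literal instead of after it.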

Additional context

No response
