
Optimize like/ilike kernels for StringView #5951

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

@XiangpengHao added support for StringView to the like/ilike kernels in #5931


That PR does not leverage the 4-byte prefix that is inlined in the view of large strings. Using it might optimize certain cases, specifically quickly testing that the starts_with variant of like doesn't match without having to consult the actual strings, as @wjones127 discusses in #5931 (review). From that discussion:

does not leverage the inlined 4 bytes to shortcut some comparison. I plan to leave this as a future work as it can significantly complicate the code. (I also doubt that we can gain any benefit)

I agree we should only start adding specialized paths if we determine it is worth it. starts_with might be, especially if the parameter is short enough to always be able to use the prefix. We can decide based on benchmarks which are appropriate.
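To make the idea concrete, here is a minimal sketch of such a prefix shortcut. It is not the arrow-rs implementation. It assumes the view layout from the Arrow format specification: a 16-byte little-endian view whose first 4 bytes hold the length, followed either by up to 12 inlined bytes (length <= 12) or by a 4-byte prefix, a buffer index, and an offset. The names `make_view` and `starts_with_from_view` are hypothetical helpers written for this illustration, and only the case-sensitive `like 'needle%'` case is covered.

```rust
/// Hypothetical helper: encode `data` as a 16-byte view, assuming the
/// Arrow layout (length, then inlined bytes or prefix + buffer index + offset).
fn make_view(data: &[u8], buffer_index: u32, offset: u32) -> u128 {
    let len = data.len() as u32;
    let mut bytes = [0u8; 16];
    bytes[0..4].copy_from_slice(&len.to_le_bytes());
    if len <= 12 {
        // Short string: the whole value lives inside the view.
        bytes[4..4 + data.len()].copy_from_slice(data);
    } else {
        // Long string: only the first 4 bytes are inlined as a prefix.
        bytes[4..8].copy_from_slice(&data[..4]);
        bytes[8..12].copy_from_slice(&buffer_index.to_le_bytes());
        bytes[12..16].copy_from_slice(&offset.to_le_bytes());
    }
    u128::from_le_bytes(bytes)
}

/// Hypothetical helper: try to answer `starts_with(needle)` from the view alone.
/// Returns `Some(answer)` when the view is enough, or `None` when the
/// data buffer still has to be consulted.
fn starts_with_from_view(view: u128, needle: &[u8]) -> Option<bool> {
    let bytes = view.to_le_bytes();
    let len = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]) as usize;

    // A needle longer than the string can never be a prefix of it.
    if needle.len() > len {
        return Some(false);
    }

    // Number of string bytes that are available inside the view itself.
    let inlined = if len <= 12 { len } else { 4 };
    let available = &bytes[4..4 + inlined];

    // Compare as many needle bytes as the view can cover.
    let n = needle.len().min(inlined);
    if available[..n] != needle[..n] {
        return Some(false); // fast rejection, no buffer access needed
    }

    if needle.len() <= inlined {
        Some(true) // the whole needle was checked against inlined bytes
    } else {
        None // prefix matches, the remainder lives in the data buffer
    }
}
```

With this shape, a needle of at most 4 bytes is always decided from the view, which is the case @wjones127 singles out. Longer needles still get an early rejection whenever the first 4 bytes differ. Whether this pays off in practice is exactly what the benchmark is meant to determine.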

I added a benchmark in #5931

Describe the solution you'd like
Investigate various ways to make the benchmark faster. Run the benchmark like this:

cargo bench --bench comparison_kernels -- utf8view

Describe alternatives you've considered
Maybe it is fast enough as is.

Additional context


Labels: enhancement (any new improvement worthy of an entry in the changelog)
