Description
Is your feature request related to a problem or challenge?
Part of #11752
StringView is a new Arrow array type that allows for more efficient string processing -- specifically, it allows string data to be adjusted (for example, sliced) without copying the underlying data.
See this blog post for more details: https://www.influxdata.com/blog/faster-queries-with-stringview-part-one-influxdb/
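To illustrate why a view-based layout avoids copies, here is a minimal sketch using only the Rust standard library. The `View` type and its `substr` method are hypothetical stand-ins, not the arrow-rs API: the real Arrow view is a packed 16-byte value, while this model only keeps a shared buffer plus a 32-bit offset and length. It also assumes ASCII input so that byte positions are valid character boundaries.

```rust
use std::sync::Arc;

/// Hypothetical, simplified model of one view entry: a window into a shared buffer.
#[derive(Clone, Debug)]
struct View {
    buffer: Arc<str>, // shared string data, never copied by `substr`
    offset: u32,      // byte offset of the window within `buffer`
    len: u32,         // byte length of the window
}

impl View {
    /// Create a view covering the whole buffer.
    fn new(buffer: Arc<str>) -> Self {
        let len = buffer.len() as u32;
        View { buffer, offset: 0, len }
    }

    /// Borrow the bytes this view points at.
    fn as_str(&self) -> &str {
        let start = self.offset as usize;
        &self.buffer[start..start + self.len as usize]
    }

    /// Byte-based substring (0-based `start`): only `offset` and `len` change.
    /// Out-of-range requests are clamped to the end of the view.
    fn substr(&self, start: u32, count: u32) -> View {
        let start = start.min(self.len);
        let count = count.min(self.len - start);
        View {
            buffer: Arc::clone(&self.buffer), // bumps a refcount, copies no string data
            offset: self.offset + start,
            len: count,
        }
    }
}

/// Small usage example: returns the substring and whether the buffer is shared.
fn demo() -> (String, bool) {
    let full = View::new(Arc::from("hello stringview"));
    let sub = full.substr(6, 6);
    (sub.as_str().to_string(), Arc::ptr_eq(&full.buffer, &sub.buffer))
}
```

In this model a substring costs one reference-count increment and two integer updates, regardless of how long the string is. A layout that stores values contiguously behind an offsets buffer would instead have to write the selected bytes into a new values buffer.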
@Kev1n8 added support for `StringView` to the `substr` function in #12044
At the moment `substr` produces a `StringArray` output when the input is `StringArray`, but we could actually generate a `StringViewArray` as output, which would be more efficient in most cases (it avoids copying the string values)
However, in order to avoid errors when `substr` is used in an expression, we need to make sure that all the rest of the String functions support StringView as input as well. In other words, we should wait for the "Required for enabling StringView by default" list on #11752 to be completed
Describe the solution you'd like
- Change the output type of `substr` to be `StringViewArray` when the input is `StringArray` (note: for `LargeStringArray` we will still need to copy the data, I think, as `StringView` is limited to 2^32 bytes)
- Change the implementation of `substr` to use `StringView` internally
- Add tests
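The `LargeStringArray` caveat in the list above can be sketched as a simple range check. This is an illustration under stated assumptions, not the DataFusion implementation: `view_fields` is a hypothetical helper, and it assumes that a view addresses its data with a 32-bit offset and a 32-bit length, while a large string array uses 64-bit offsets.

```rust
/// Hypothetical helper: given a substring located at a 64-bit `offset` with a
/// 64-bit `len` inside an existing values buffer, return the 32-bit fields a
/// view could use, or `None` if the location cannot be expressed in 32 bits
/// (in which case the bytes would have to be copied into a new, smaller buffer).
fn view_fields(offset: u64, len: u64) -> Option<(u32, u32)> {
    let max = u32::MAX as u64;
    if offset > max || len > max {
        return None; // does not fit: fall back to copying
    }
    Some((offset as u32, len as u32))
}

/// Count how many of the given (offset, len) substrings could reuse the
/// original buffer without a copy under the assumption above.
fn count_zero_copy(substrings: &[(u64, u64)]) -> usize {
    substrings
        .iter()
        .filter(|&&(offset, len)| view_fields(offset, len).is_some())
        .count()
}
```

For input whose values buffer is always addressable with 32-bit offsets, every substring passes this check, which is why the zero-copy path is proposed for `StringArray` input only.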
Describe alternatives you've considered
No response
Additional context
Note that @Kev1n8 has already added support for `StringView` to the `substr` function in #12044
They also suggested that this same optimization could be applied in #12044 (comment)