-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL: Add support for single parameter text manipulating functions #31874
SQL: Add support for single parameter text manipulating functions #31874
Conversation
Add the first String functions, namely ASCII and CAST
…less scripting, though.
Add the first String functions, namely ASCII and CAST
…efan/elasticsearch into 31604_unary_string_functions
Pinging @elastic/es-search-aggs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some minor comments - otherwise LGTM.
//; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file shouldn't be updated.
@@ -0,0 +1,30 @@ | |||
bitLengthGroupByAndOrderBy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nitpick - put all functions into a functions
file as oppose to string-functions
.
@@ -0,0 +1,80 @@ | |||
stringAscii | |||
// tag::ascii |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the tag/end
is used in the docs - not sure whether this is the case here so they should be removed (since it's just noise).
SELECT SPACE("languages") s, COUNT(*) count FROM "test_emp" GROUP BY SPACE("languages") ORDER BY SPACE(languages); | ||
|
||
spaceGroupByAndOrderByWithCharLength | ||
SELECT SPACE("languages") s, COUNT(*) count, CAST(CHAR_LENGTH(SPACE("languages")) AS INT) cls FROM "test_emp" WHERE "languages" IS NOT NULL GROUP BY SPACE("languages") ORDER BY SPACE("languages"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
return new String(spaces); | ||
}), | ||
BIT_LENGTH((String s) -> s.getBytes(StandardCharsets.UTF_8).length * 8), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the getBytes
should be replaced by UnicodeUtils#calcUTF16toUTF8Length
instead.
|
||
@Override | ||
protected ScriptTemplate asScriptFrom(FieldAttribute field) { | ||
//TODO change this to use _source instead of the exact form (aka field.keyword for text fields) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an issue raised for this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we should do that. I mean, we should file an issue, but I think we should be very careful about using _source
in script queries.
@@ -57,6 +57,33 @@ public static String camelCaseToUnderscore(String string) { | |||
} | |||
return sb.toString().toUpperCase(Locale.ROOT); | |||
} | |||
|
|||
//CAMEL_CASE to camelCase | |||
public static String underscoreToLowerCamelCase(String string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm kinda of the fence with this method - maybe it's worth using a different convention but if we stick with it, have an updated comment (the one above is wrong) and some tests.
assertTrue(p instanceof Filter); | ||
Expression condition = ((Filter) p).condition(); | ||
SqlIllegalArgumentException ex = expectThrows(SqlIllegalArgumentException.class, () -> QueryTranslator.toQuery(condition, false)); | ||
assertEquals("Scalar functions (LTRIM(keyword)) not allowed (yet) as arguments for LIKE", ex.getMessage()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nitpick, the plural in functions
should be optional (it's easy to check the number of arguments and add it only if needed). Also the function is LTRIM
not LTRIM(keyword)
- namely functionName() not function().
Lastly I would pick a different function to check whether the proper name is returned (say CHAR_LENGTH
) without the casing.
@Override | ||
public DataType dataType() { | ||
//TODO investigate if a data type Long (BIGINT) wouldn't be more appropriate here | ||
return DataType.INTEGER; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PostgreSQL returns an int
so we're probably safe doing it. They aren't always right, but they usually are.
|
||
@Override | ||
protected ScriptTemplate asScriptFrom(FieldAttribute field) { | ||
//TODO change this to use _source instead of the exact form (aka field.keyword for text fields) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we should do that. I mean, we should file an issue, but I think we should be very careful about using _source
in script queries.
} | ||
|
||
@Override | ||
protected TypeResolution resolveType() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I've made changes to there I've been trying to make sure methods in these were either abstract
or final
. That sort of forces things to be the same as one another and not "almost the same". Like Char
doesn't feel like a real UnaryStringFunction
to me because it overrides this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nik9000 agree with you that it doesn't feel like a true unary string function, judging by the type of the parameter. And it would be an awkward move to override the method checking the parameter type for this reason. But in SQL, CHAR
and SPACE
(the only two functions that do not receive at least one string as parameter) are considered to be string functions probably because they create text in one form or another, irrespective of the parameters count and type. Do you think it's worth taking these two out of the UnaryStringFunction base class and handle them differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, yeah, I'd stop extending that class. Or at least see what that looks like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nik9000 I've made the changes and introduced a new UnaryStringIntFunction
class to handle the Char and Space functions. Let me know what you think.
Changed BIT_LENGTH function way of calculating the bits length.
…rameter a non-text (numeric): classes structure change, functionality remains the same.
@costin I've addressed your concerns. |
Thanks |
…_string_functions
…1874) Added support for ASCII, BIT_LENGTH, CHAR, CHAR_LENGTH, LCASE, LENGTH, LTRIM, RTRIM, SPACE, UCASE functions. Wherever Painless scripting is necessary (WHERE conditions, ORDER BY etc), those scripts are being used.
Pushed to 6.x, as well. |
* master: [TEST] Mute SlackMessageTests.testTemplateRender Docs: Explain closing the high level client [ML] Re-enable memory limit integration tests (#31328) [test] disable packaging tests for suse boxes Add nio transport to security plugin (#31942) XContentTests : Insert random fields at random positions (#30867) Force execution of fetch tasks (#31974) Fix unreachable error condition in AmazonS3Fixture (#32005) Tests: Fix SearchFieldsIT.testDocValueFields (#31995) Add Expected Reciprocal Rank metric (#31891) [ML] Get ForecastRequestStats doc in RestoreModelSnapshotIT (#31973) SQL: Add support for single parameter text manipulating functions (#31874) [ML] Ensure immutability of MlMetadata (#31957) Tests: Mute SearchFieldsIT.testDocValueFields() muted tests due to #31940 Work around reported problem in eclipse (#31960) Move build integration tests out of :buildSrc project (#31961) Tests: Remove use of joda time in some tests (#31922) [Test] Reactive 3rd party tests on CI (#31919) SQL: Support for escape sequences (#31884) SQL: HAVING clause should accept only aggregates (#31872) Docs: fix typo in datehistogram (#31972) Switch url repository rest tests to new style requests (#31944) Switch reindex tests to new style requests (#31941) Docs: Added note about cloud service to installation and getting started [DOCS] Removes alternative docker pull example (#31934) Add Snapshots Status API to High Level Rest Client (#31515) ingest: date_index_name processor template resolution (#31841) Test: fix null failure in watcher test (#31968) Switch test framework to new style requests (#31939) Switch low level rest tests to new style Requests (#31938) Switch high level rest tests to new style requests (#31937) [ML] Mute test failing due to Java 11 date time format parsing bug (#31899) [TEST] Mute SlackMessageTests.testTemplateRender Fix assertIngestDocument wrongfully passing (#31913) Remove unused reference to filePermissionsCache (#31923) rolling upgrade should use a replica to prevent relocations while running a scroll HLREST: Bundle the x-pack protocol project (#31904) Increase logging level for testStressMaybeFlush Added lenient flag for synonym token filter (#31484) [X-Pack] Beats centralized management: security role + licensing (#30520) HLRest: Move xPackInfo() to xPack().info() (#31905) Docs: add security delete role to api call table (#31907) [test] port archive distribution packaging tests (#31314) Watcher: Slack message empty text (#31596) [ML] Mute failing DetectionRulesIT.testCondition() test Fix broken NaN check in MovingFunctions#stdDev() (#31888) Date: Add DateFormatters class that uses java.time (#31856) [ML] Switch native QA tests to a 3 node cluster (#31757) Change trappy float comparison (#31889) Fix building AD URL from domain name (#31849) Add opaque_id to audit logging (#31878) re-enable backcompat tests add support for is_write_index in put-alias body parsing (#31674) Improve release notes script (#31833) [DOCS] Fix broken link in painless example Handle missing values in painless (#30975) Remove the ability to index or query context suggestions without context (#31007) Ingest: Enable Templated Fieldnames in Rename (#31690) [Docs] Fix typo in the Rollup API Quick Reference (#31855) Ingest: Add ignore_missing option to RemoveProc (#31693) Add template config for Beat state to X-Pack Monitoring (#31809) Watcher: Add ssl.trust email account setting (#31684) Remove link to oss-MSI (#31844) Painless: Restructure Definition/Whitelist (#31879) HLREST: Add x-pack-info API (#31870)
* 6.x: Force execution of fetch tasks (#31974) [TEST] Mute SlackMessageTests.testTemplateRender Docs: Explain closing the high level client [test] disable java packaging tests for suse XContentTests : Insert random fields at random positions (#30867) Add Get Snapshots High Level REST API (#31980) Fix unreachable error condition in AmazonS3Fixture (#32005) [6.x][ML] Ensure immutability of MlMetadata (#31994) Add Expected Reciprocal Rank metric (#31891) SQL: Add support for single parameter text manipulating functions (#31874) muted tests due to #31940 Work around reported problem in eclipse (#31960) Move build integration tests out of :buildSrc project (#31961) [Test] Reactive 3rd party tests on CI (#31919) Fix assertIngestDocument wrongfully passing (#31913) (#31951) SQL: Support for escape sequences (#31884) SQL: HAVING clause should accept only aggregates (#31872) Docs: fix typo in datehistogram (#31972) Switch url repository rest tests to new style requests (#31944) Switch reindex tests to new style requests (#31941) Switch test framework to new style requests (#31939) Docs: Added note about cloud service to installation and getting started [DOCS] Removes alternative docker pull example (#31934) ingest: date_index_name processor template resolution (#31841) Test: fix null failure in watcher test (#31968) Watcher: Slack message empty text (#31596) Switch low level rest tests to new style Requests (#31938) Switch high level rest tests to new style requests (#31937) HLREST: Bundle the x-pack protocol project (#31904) [ML] Mute test failing due to Java 11 date time format parsing bug (#31899) Increase logging level for testStressMaybeFlush rolling upgrade should use a replica to prevent relocations while running a scroll [test] port archive distribution packaging tests (#31314) HLRest: Move xPackInfo() to xPack().info() (#31905) Increase logging level for :qa:rolling-upgrade Backport: Add template config for Beat state to X-Pack Monitoring (#31809) (#31893) Fix building AD URL from domain name (#31849) Fix broken NaN check in MovingFunctions#stdDev() (#31888) Change trappy float comparison (#31889) Add opaque_id to audit logging (#31878) add support for is_write_index in put-alias body parsing (#31674) Ingest: Enable Templated Fieldnames in Rename (#31690) (#31896) Ingest: Add ignore_missing option to RemoveProc (#31693) (#31892) [Docs] Fix typo in the Rollup API Quick Reference (#31855) Watcher: Add ssl.trust email account setting (#31684) [PkiRealm] Invalidate cache on role mappings change (#31510) [Security] Check auth scheme case insensitively (#31490) HLREST: Add x-pack-info API (#31870) Remove link to oss-MSI (#31844) Painless: Restructure Definition/Whitelist (#31879)
Adds support for unary string manipulating functions:
Part of #31604.