-
Notifications
You must be signed in to change notification settings - Fork 181
Support expand command with Calcite
#3745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
|
TODOs:
|
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExpandCommandIT.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
| ============ | ||
| (From 3.1.0) | ||
|
|
||
| Use the ``expand`` command on a nested array field to transform a single |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nested array -> array? why emphasized nested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because the boundary between array and object is blurry in OpenSearch. For example, a string field can store a string or an array of string. But there isn't an array type of string. Array only exists for nested type, where it stores an array of structs.
The current implementation does not support expanding an array of string stored in a string field since it will only read the string field as string, it doesn't know whether it is an array of string or a single string at the time of generating logical plans. That's why I specifically mentioned nested array.
|
|
||
| expand <field> [as alias] | ||
|
|
||
| * field: The field to be expanded (exploded). Currently only nested arrays are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently only nested arrays are supported?
It is not limitation of Expand command. Expand command does not aware of OpenSearch nested data type, right?
Is it because PPL engine map OpenSerach nested data type to PPL Array data type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, expand command does not array of the nested type. This limitation originates from the fact that only nested fields are read as arrays in visitExpand in CalciteRelNodeVisitor. Primitive fields storing arrays are read as primitive types.
| schema("age", "bigint"), | ||
| schema("id", "bigint"), | ||
| schema("address", "struct")); | ||
| verifyDataRows( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: test should be specific, only focus on empty array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| // The element type of struct and array is currently set to ANY. | ||
| // We set them using the runtime type as a workaround. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The element type of struct and array is currently set to ANY.
It is short-term solution, right? if so, add TODO, and issue link?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Added. Thanks for pointing this out.
| // be used by the right side to correlate with the left side. | ||
| // Using left join to keep the records where the array field is empty. The corresponding | ||
| // field in the result will be null. | ||
| relBuilder.correlate(JoinRelType.LEFT, correlVariable.get().id, List.of(arrayFieldRex)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create an issue to track enforce limitation on Expand command.
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
…orkaround Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
|
Just to confirm: when the field to expand is an empty array, should I keep it in the result? @LantaoJin @qianheng-aws @penghuo The current implementation keeps it, with the corresponding expanded value set to E.g. for expanding employees results in
|
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
* Create scalffolds for implementing expand command Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * WIP: Implementing expand command Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Add array json and index mapping Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * WIP: Implementing expand command Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Implement a minimal viable version of expand Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Support expand with alias Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Add doc for expand Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * WIP: add unit tests for expand command Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Remove unused logical expand & add test cases Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Use left join in expand to keep documents where their expanded array field is empty Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Add unit test for expand command Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Update expand doc format Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Fix: delete test doc with refresh Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Improve expand documentation Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Fix typos Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Add a version section to expand documentation Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Make expand empty array IT more specific Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Add a limitation section in expand doc & link a issue tracker for a workaround Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Tweak expand command doc Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> * Test expand null field Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> --------- Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> Signed-off-by: xinyual <xinyual@amazon.com>
Description
Support
expandcommand with Calcite.Note:
expandonly explodes nested arrays, not maps.Related Issues
Resolves #3711
Check List
--signoff.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.