-
Notifications
You must be signed in to change notification settings - Fork 176
Support Streamstats
command with calcite
#4297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thanks for your contribution. I left some comments for minor suggestions.
Also, please resolve the conflicts
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
Outdated
Show resolved
Hide resolved
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
merge to check CI test
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
Signed-off-by: Xinyu Hao <haoxinyu@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haven't review the code details, just need to refactor the user doc and confirm the behaviour
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some testin
CalciteExplainIT.java`
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteStreamstatsCommandIT.java
Show resolved
Hide resolved
|
||
- `stats command <cmd/stats.rst>`_ | ||
|
||
- `streamstats command <cmd/streamstats.rst>`_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add streamstats.rst to category.json as well
GLOBAL: 'GLOBAL'; | ||
RESET_BEFORE: 'RESET_BEFORE'; | ||
RESET_AFTER: 'RESET_AFTER'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these three seems missing in keywordsCanBeId
import org.opensearch.client.Request; | ||
import org.opensearch.sql.ppl.PPLIntegTestCase; | ||
|
||
public class CalciteStreamstatsCommandIT extends PPLIntegTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add some tests with other command
eventstats + streamstats:
source = t | eventstats ... | streamstats ...
sort + streamstats:
source = t | sort ... | streamstats ...
left join with streamstats:
source = t | left join left=l right=r on l.key=r.key [ source = tt | streamstats ... ]
streamstats in in-subsearch:
source = t | where a in [ source = tt | streamstats ...]
Description
Support
Streamstats
command with arguments below:Implementation Details
The implementation handles three distinct execution paths, depending on the combination of
window
,global
,group
, andreset
arguments:Why This Design
Default path can rely on native SQL OVER because there is no global/window-with-reset complexity.
Specific SQL limitations:
Native SQL OVER clauses cannot implement per-group sliding windows over the entire stream . However, we want to combine a global sequence with group-level partitioning. In SQL, a window is either global without a BY clause or partitioned by a group with a BY clause; you cannot have a “global sequence plus per-group sliding frame” in one OVER.
ROWS BETWEEN ... PRECEDING
cannot take a variable (it only supports constants like1 PRECEDING
,1+1 PRECEDING
).Global + window + group
want "per-group sliding windows over entire stream," but SQL window functions do not allow fully flexible frame boundaries combined with lateral joins. Hence, we simulate it viaROW_NUMBER() + correlated join + aggregate
.Reset path introduces segment semantics (
seg_id
) that cannot be represented natively in SQL OVER clauses. Each reset creates a new frame partition. By default, reset behaves like a global window, but when grouping exists, it applies per-group aggregation within each reset segment. So I use helper columns (before_flag, after_flag, seg_id) and a correlated join ensures correctness.1. Default Path (No global in use / no reset)
2. global=true + window > 0 + group exists
To support sliding windows over the entire stream with optional grouping:
3. Reset Path (reset_before / reset_after defined)
When
reset_before
orreset_after
exist:Related Issues
Resolves #4207
Check List
--signoff
or-s
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.