-
Notifications
You must be signed in to change notification settings - Fork 181
Description
Query Information
PPL Command/Query:
source=events
| timechart span=1m avg(cpu_usage)
Expected Result:
Timechart query returns all composite buckets up to the configured aggregation limit (default 1000), so full data is retrieved across time intervals. There are 20 timestamps in the events.json dataset, so it should return all 20 timestamp buckets.
Actual Result:
curl -X POST "localhost:9200/_plugins/_ppl/_explain" \
-H "Content-Type: application/json" \
-d '{
"query": "source=events | timechart span=1m avg(response_time)"
}'
{
"schema": [
{
"name": "$f2",
"type": "timestamp"
},
{
"name": "$f1",
"type": "double"
}
],
"datarows": [
[
"2024-07-01 00:00:00",
120.0
],
[
"2024-07-01 00:01:00",
85.0
],
[
"2024-07-01 00:02:00",
200.0
],
[
"2024-07-01 00:03:00",
150.0
],
[
"2024-07-01 00:04:00",
95.0
],
[
"2024-07-01 00:05:00",
75.0
],
[
"2024-07-01 00:06:00",
180.0
],
[
"2024-07-01 00:07:00",
110.0
],
[
"2024-07-01 00:08:00",
160.0
],
[
"2024-07-01 00:09:00",
90.0
]
],
"total": 10,
"size": 10
}%
{
"calcite": {
"logical": "LogicalSystemLimit(sort0=[$0], dir0=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])\n LogicalSort(sort0=[$0], dir0=[ASC])\n LogicalAggregate(group=[{1}], agg#0=[AVG($0)])\n LogicalProject(response_time=[$6], $f2=[SPAN($1, 1, 'm')])\n CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n",
"physical": "EnumerableLimit(fetch=[10000])\n CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[AGGREGATION->rel#203:LogicalAggregate.NONE.[](input=RelSubset#202,group={1},agg#0=AVG($0)), SORT->[0]],
OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":10,\"sources\":[{\"$f2\":{\"date_histogram\":{\"field\":\"@timestamp\",\"missing_bucket\":true,\"missing_order\":\"last\",\"order\":\"asc\",\"fixed_interval\":\"1m\"}}}]},\"aggregations\":{\"$f1\":{\"avg\":{\"field\":\"response_time\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n"
}
}%
The query returns the first 10 timestamp buckets in the events.json dataset.
Without explicitly setting a composite aggregation size, OpenSearch defaults to size: 10 for buckets, which truncates results.
Dataset Information
Dataset/Schema Type
- OpenTelemetry (OTEL)
- Simple Schema for Observability (SS4O)
- Open Cybersecurity Schema Framework (OCSF)
- Custom (details below)
Index Mapping
{
"mappings": {
"properties": {
"@timestamp": { "type": "date" },
"host": { "type": "keyword" },
"service": { "type": "keyword" },
"response_time": { "type": "integer" },
"status_code": { "type": "integer" },
"bytes_sent": { "type": "integer" },
"cpu_usage": { "type": "float" },
"memory_usage": { "type": "float" },
"region": { "type": "keyword" },
"environment": { "type": "keyword" }
}
}
}Sample Data
There are only 3 lines provided here as an example but the number of timestamps in the dataset covers 20 seconds.
{ "index": {} }
{ "@timestamp": "2024-07-01T00:00:00Z", "host": "web-01", "service": "nginx", "response_time": 120, "status_code": 200, "bytes_sent": 1024, "cpu_usage": 45.2, "memory_usage": 67.8, "region": "us-east", "environment": "prod" }
{ "index": {} }
{ "@timestamp": "2024-07-01T00:01:00Z", "host": "web-02", "service": "nginx", "response_time": 85, "status_code": 200, "bytes_sent": 2048, "cpu_usage": 38.7, "memory_usage": 72.1, "region": "us-west", "environment": "prod" }
{ "index": {} }
{ "@timestamp": "2024-07-01T00:02:00Z", "host": "web-01", "service": "nginx", "response_time": 200, "status_code": 500, "bytes_sent": 512, "cpu_usage": 55.3, "memory_usage": 81.4, "region": "us-east", "environment": "prod" }
{ "index": {} }
{ "@timestamp": "2024-07-01T00:03:00Z", "host": "web-03", "service": "apache", "response_time": 150, "status_code": 200, "bytes_sent": 4096, "cpu_usage": 42.1, "memory_usage": 58.9, "region": "eu-west", "environment": "prod" }
...Bug Description
Issue Summary:
When executing an aggregate query without a by clause, the composite aggregation size defaults to 10 in the pushdown phase, even though AGGREGATION_BUCKET_SIZE is set to 1000.
This happens because OpenSearchAggregateIndexScanRule pushes down an aggregation without overriding the composite.size in AbstractCalciteIndexScan.pushDownSortIntoAggBucket.
After adding:
.size(org.opensearch.sql.opensearch.storage.script.aggregation.AggregationQueryBuilder.AGGREGATION_BUCKET_SIZE)
in pushDownSortIntoAggBucket inside AbstractCalciteIndexScan.java, size now defaults to 1000 and all 20 timestamps in my query are returned correctly.
However, this change causes CalciteExplainIT.testExplain to fail because the expected EXPLAIN JSON still contains "size": 10.
Steps to Reproduce:
- Use feat/timechart_cmd branch from the linked PR.
- Index the provided dataset into an OpenSearch cluster.
- Run the query.
Impact:
In my timechart command, this causes missing time buckets for certain queries, producing incomplete visualizations.
Timechart Draft PR #3993
Timechart RFC #3965
Environment Information
OpenSearch Version:
Opensearch 3.2.0
Additional Details:
Change affects planner pushdown rules, specifically OpenSearchAggregateIndexScanRule and AbstractCalciteIndexScan.
Question: Should the integration test be updated to expect 1000 instead of 10, or should we keep 10 and handle the timestamps returned for timechart differently?
Screenshot:
This is the line of code I changed to increase composite bucket size to 1000. To reproduce the limit of 10, revert the code change and the query results will be cut off at 10.

Metadata
Metadata
Assignees
Labels
Type
Projects
Status