[FEATURE] Span() / bin should support auto_date_histogram aggregation

**Is your feature request related to a problem?**

Currently, PPL's span() function with fixed intervals doesn't provide the same flexibility as OpenSearch's native auto_date_histogram aggregation. This leads to:

1. Suboptimal bucket distribution for time series analysis (as 1000 fixed)
2. Need for manual tuning of time intervals
3. Potential for too many or too few buckets
4. Inconsistent results between PPL and DSL queries

For example, this PPL query uses a fixed 1-hour interval:
```sql
source = custom-big5 
| eval range_bucket = case(
    `metrics.size` < -10, 'range_1', 
    `metrics.size` >= -10 and `metrics.size` < 10, 'range_2',
    `metrics.size` >= 10 and `metrics.size` < 100, 'range_3',
    `metrics.size` >= 100 and `metrics.size` < 1000, 'range_4',
    `metrics.size` >= 1000 and `metrics.size` < 2000, 'range_5',
    `metrics.size` >= 2000, 'range_6'
) 
| stats min(`metrics.tmin`) as tmin, avg(`metrics.size`) as tavg, max(`metrics.size`) as tmax 
    by range_bucket, span(`@timestamp`, 1h) as auto_span
```

While the equivalent DSL query uses auto_date_histogram for optimal bucket distribution:

```
{
  "aggs": {
    "tmax": {
      "range": {
        "field": "metrics.size",
        "ranges": [...]
      },
      "aggs": {
        "date": {
          "auto_date_histogram": {
            "field": "@timestamp",
            "buckets": 20
          }
        }
      }
    }
  }
}
```


**What solution would you like?**

Implement auto_date_histogram support in PPL with:

New function syntax for auto-bucketing:

```
source = custom-big5 
| stats count() by auto_span(`@timestamp`, buckets=20)
```

Key features:

* Dynamic interval selection based on data range
* Target bucket count specification
* Automatic interval rounding to human-readable values
* Consistent bucket count regardless of time range

**What alternatives have you considered?**

Enhanced span() function with auto mode:
```
span(`@timestamp`, 'auto', target_buckets=20)
```

Hybrid approach with min/max intervals:
```
adaptive_span(`@timestamp`, buckets=20, min_interval='1m', max_interval='1d')
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FEATURE] Span() / bin should support auto_date_histogram aggregation #4210

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[FEATURE] Span() / bin should support auto_date_histogram aggregation #4210

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions