[Rollup] Ordering by sub-agg fails due to name rewriting #30467

Closed
@polyfractal

If an aggregation orders by a sub-aggregation, RollupSearch fails because we rewrite the aggregation name internally (foo might be rewritten to foo.max.value), so the order path no longer matches any aggregation. For example:

GET metricbeat_rollup/_rollup_search
{
  "size": 0,
  "aggs": {
    "2": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "1d"
      },
      "aggs": {
        "3": {
          "terms": {
            "field": "host",
            "size": 5,
            "order": {
              "1": "desc"
            }
          },
          "aggs": {
            "1": {
              "avg": {
                "field": "system.memory.actual.used.pct"
              }
            }
          }
        }
      }
    }
  }
}

This will throw:

{
  "error": {
    "root_cause": [
      {
        "type": "runtime_exception",
        "reason": "all shards failed"
      }
    ],
    "type": "runtime_exception",
    "reason": "all shards failed",
    "caused_by": {
      "type": "search_phase_execution_exception",
      "reason": "all shards failed",
      "phase": "query",
      "grouped": true,
      "failed_shards": [
        {
          "shard": 0,
          "index": "metricbeat_rollup",
          "node": "5cEuSOkeSKeI8R4oCTvPug",
          "reason": {
            "type": "aggregation_execution_exception",
            "reason": "Invalid aggregator order path [1]. Unknown aggregation [1]"
          }
        }
      ]
    }
  },
  "status": 500
}

This is trivially fixable for most of the metrics: we just need to apply the same rewriting to the order path. I'm not quite sure how this will work for averages though, since an avg is stored as two metrics (a sum and a count) which we recombine. We can't order by either agg individually since each contains only half the information, and if we order after the recombination we may be missing the "best" buckets.
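To make the avg problem concrete, here is a minimal Python sketch with made-up numbers (the host names, sums, and counts are hypothetical, and this is not how RollupSearch is implemented). It assumes each bucket holds the avg as a separate sum and count, and shows that ordering by either half alone can pick a different top bucket than ordering by the recombined average:

```python
# Hypothetical rolled-up buckets: each avg is stored as a sum and a count.
buckets = {
    "host-a": {"sum": 60.0, "count": 100},  # avg 0.6
    "host-b": {"sum": 45.0, "count": 50},   # avg 0.9
    "host-c": {"sum": 8.0, "count": 10},    # avg 0.8
}

def top(key_fn, size=1):
    """Return the `size` bucket keys with the largest key_fn value."""
    ranked = sorted(buckets, key=lambda k: key_fn(buckets[k]), reverse=True)
    return ranked[:size]

by_sum = top(lambda b: b["sum"])                 # order by the sum half only
by_count = top(lambda b: b["count"])             # order by the count half only
by_avg = top(lambda b: b["sum"] / b["count"])    # order by the true average

print(by_sum, by_count, by_avg)
```

With these numbers, ordering by sum or by count keeps host-a, while the true top bucket by average is host-b. If the bucket list has already been truncated by one of the halves, recombining afterwards cannot bring host-b back.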

The only saving grace is that ordering by sub-agg is discouraged anyway as count errors are unbounded, so this isn't really any worse. Ordering by sub-agg is as good as rolling dice :)

I've been wanting to look into modifying avg to accept "arbitrary" counts from the doc (instead of just incrementing the counter), so perhaps this is another motivation to do so.
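As a rough illustration of that idea, here is a sketch of an average accumulator that takes the count from the document instead of always incrementing by one. The class name `WeightedAvg` and its methods are hypothetical and do not correspond to any existing Elasticsearch API:

```python
from dataclasses import dataclass

@dataclass
class WeightedAvg:
    """Hypothetical avg accumulator that accepts an arbitrary count per doc."""
    total: float = 0.0
    count: int = 0

    def collect(self, value_sum, doc_count=1):
        # A raw doc contributes one value with count 1; a rolled-up doc
        # contributes a pre-summed value together with its stored count.
        self.total += value_sum
        self.count += doc_count

    def result(self):
        # None when nothing was collected, otherwise the recombined average.
        return self.total / self.count if self.count else None

# Two raw documents...
raw = WeightedAvg()
raw.collect(0.5)
raw.collect(0.25)

# ...and one rolled-up document holding their sum and count.
rolled = WeightedAvg()
rolled.collect(0.75, doc_count=2)

print(raw.result(), rolled.result())
```

Under this assumption a single aggregator yields the same average for raw and rolled-up data, so there would be one sub-agg to order by rather than two halves.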

/cc @cdahlqvist

Labels: :StorageEngine/Rollup, >bug, Team:Analytics
