Bucket Aggregation size setting should never throw too_many_buckets_exception if size is less than or equal to search.max_buckets #51559

Closed
@niemyjski

Description

Elasticsearch version (bin/elasticsearch --version): 7.5.2

Plugins installed: []

JVM version (java -version): 7.5.2

OS version (uname -a if on a Unix-like system): docker

Description of the problem including expected versus actual behavior:
The bucket aggregation size setting should never throw too_many_buckets_exception when size is less than or equal to search.max_buckets. With a simple terms aggregation (no nesting), I would expect to always get back at most the number of buckets set by the size property. I understand that, for accuracy, the shards may internally produce more buckets than the 10k limit, but the final result returned to me should be <= 10k as defined by the size property.

TL;DR: I don't care what queries happen behind the scenes to get me my 10k buckets. All I care about is that I get my 10k buckets, which is a valid size since it's <= search.max_buckets :-)
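
As a stopgap (not a fix for the behavior in question), search.max_buckets is a dynamic cluster setting and can be raised. A minimal sketch, assuming a 20000 cap is acceptable for the cluster:

PUT /_cluster/settings
{
  "persistent": {
    "search.max_buckets": 20000
  }
}

This only moves the ceiling, though; the same aggregation can still trip the new limit for the same underlying reason.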

Steps to reproduce:
Assuming the index contains more than 10k unique document ids:

POST /events/_search
{
  "aggs": {
    "terms_id": {
      "meta": {
        "@field_type": "keyword"
      },
      "terms": {
        "field": "id",
        "size": 10000
      }
    }
  }
}

This should return 10k unique buckets, one id per bucket.

What happens instead is:

{
  "error": {
    "root_cause": [
      {
        "type": "too_many_buckets_exception",
        "reason": "Trying to create too many buckets. Must be less than or equal to: [10000] but was [10001]. This limit can be set by changing the [search.max_buckets] cluster level setting.",
        "max_buckets": 10000
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "events",
        "node": "8v-46gnFQWGp2EsalBzwYw",
        "reason": {
          "type": "too_many_buckets_exception",
          "reason": "Trying to create too many buckets. Must be less than or equal to: [10000] but was [10001]. This limit can be set by changing the [search.max_buckets] cluster level setting.",
          "max_buckets": 10000
        }
      }
    ]
  },
  "status": 503
}
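
To confirm which limit is actually in effect, the cluster settings API can be queried. A sketch (the filter_path parameter is only there to trim the response down to the relevant key):

GET /_cluster/settings?include_defaults=true&filter_path=*.search.max_buckets

On a stock 7.5 cluster this reports search.max_buckets as 10000 under the defaults section.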

Reasoning:

I'd love to learn more about why this happens; a detailed explanation of this design choice would be greatly appreciated. I know I'm not the only one hitting it, as it was also discussed here: https://discuss.elastic.co/t/large-aggregate-too-many-buckets-exception/189091/15

If I'm understanding this issue correctly, wouldn't the following scenario also throw this error? Say I have two shards, where shard 1 contains 10k+ unique ids and shard 2 contains 10k+ different unique ids. Querying both would produce 20k candidate buckets that need to be merged down to the requested bucket size of 10k, yet creating even one bucket over the max behind the scenes would throw this error.
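
My best guess at the mechanics, based on the docs rather than the source: for accuracy, the coordinating node asks each shard for more than size terms. The terms aggregation's shard_size defaults to size * 1.5 + 10, so size: 10000 requests up to 15010 candidate buckets per shard, and the limit appears to be enforced during shard-level collection as well as during the final reduce, which would explain both the [10001] count and the "all shards failed" phase. If that's right, explicitly capping shard_size should avoid the per-shard overflow, at the cost of term-count accuracy. A sketch mirroring the repro above (shard_size must be >= size, so both are pinned to the limit; the top-level size: 0 just suppresses hits and does not affect bucket counting):

POST /events/_search
{
  "size": 0,
  "aggs": {
    "terms_id": {
      "terms": {
        "field": "id",
        "size": 10000,
        "shard_size": 10000
      }
    }
  }
}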
