Request-level circuit breaker support on coordinating nodes #37182

Closed
@markharwood

Description

Currently we do not have circuit breaker support for search requests executed on the coordinating node. We have multi-phase reduction, which should help avoid OOMs, but it is still possible for abusive queries to take a node down.
A recent example OOM was caused by date histograms with 5-minute intervals executed across many time-based indices. Each of the data nodes failed to trip a circuit breaker because it saw only a small part of the final result. The multi-phase reduction did nothing to reduce the final number of buckets required, and the final OOM occurred while rendering results in toXContent. This scenario was exacerbated by the fact that there was a top-level terms agg for hostname, under which the date histograms were nested.
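
To make the arithmetic concrete, here is a rough sketch of why a per-node check can pass while the coordinating node still ends up holding everything. All numbers (500 hosts, 90 daily indices, a limit of 1,000,000 buckets) and the `BucketBreaker` class are hypothetical illustrations, not Elasticsearch's API or the figures from the actual incident. The toy breaker counts buckets, whereas a real breaker would more likely account for estimated memory.

```python
from datetime import timedelta


class CircuitBreakingError(Exception):
    """Raised when adding more buckets would exceed the breaker limit."""


class BucketBreaker:
    """Toy breaker that counts buckets (hypothetical, not Elasticsearch's API)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def add(self, buckets: int) -> None:
        # Check before accounting, so a rejected request leaves `used` unchanged.
        if self.used + buckets > self.limit:
            raise CircuitBreakingError(
                f"would hold {self.used + buckets} buckets, limit is {self.limit}"
            )
        self.used += buckets


def histogram_buckets(span: timedelta, interval: timedelta) -> int:
    """Number of fixed-interval buckets needed to cover `span`, rounded up."""
    return -(-span // interval)


# Assumed workload: every value below is made up for illustration.
HOSTS = 500                      # distinct hostnames in the top-level terms agg
DAILY_INDICES = 90               # one time-based index per day
INTERVAL = timedelta(minutes=5)
LIMIT = 1_000_000                # bucket limit applied by each breaker

# Buckets contributed by one daily index: hosts x (one day / 5 minutes).
per_index = HOSTS * histogram_buckets(timedelta(days=1), INTERVAL)

# A data node holding a single daily index stays well under the limit.
data_node = BucketBreaker(LIMIT)
data_node.add(per_index)

# Histogram keys from different days never coincide, so reduction cannot
# merge them: the coordinating node accumulates every index's buckets.
coordinator = BucketBreaker(LIMIT)
tripped_after = None
for n in range(1, DAILY_INDICES + 1):
    try:
        coordinator.add(per_index)
    except CircuitBreakingError:
        tripped_after = n
        break

print(f"buckets per daily index: {per_index}")
print(f"coordinator breaker tripped on index number: {tripped_after}")
print(f"buckets without a coordinator breaker: {per_index * DAILY_INDICES}")
```

Under these assumptions each data node sees 144,000 buckets and never trips, while a coordinator-side check would reject the request on the seventh index's results instead of attempting to hold roughly 13 million buckets for rendering.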
