Stats actions should discard intermediate state on cancellation #82337

Closed
@DaveCTurner

Description

Most stats actions fan out to various nodes in the cluster and collect per-node responses which are then aggregated into the final result. The per-node responses may sometimes be many MBs in size. If the client cancels the request by closing its connection then we broadcast the cancellation to all the target nodes and wait for them to respond with a TaskCancelledException before discarding the intermediate results. It's possible for one of the target nodes to take many minutes to respond to the cancellation if, for instance, it is overwhelmed by GC activity. In that case we retain many MBs of unnecessary intermediate state for many minutes.

We should instead react to the cancellation by immediately discarding the intermediate results and dropping any further results that arrive, so that this memory is freed straight away rather than held until the slowest node responds. One possible way to do this would be to allow a CancellableTask to accumulate listeners which are completed by CancellableTask#onCancelled().
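
To make the listener idea concrete, here is a minimal sketch of how it might look. It is not the actual Elasticsearch code: `SketchCancellableTask`, `SketchNodeResponseCollector`, `addCancellationListener`, `cancel`, `onNodeResponse` and `retainedCount` are all hypothetical names invented for illustration, and the real `CancellableTask` may well need a different shape.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a cancellable task; NOT the real Elasticsearch class.
class SketchCancellableTask {
    private final List<Runnable> listeners = new ArrayList<>();
    private boolean cancelled;

    // Registers a listener; runs it immediately if the task is already cancelled.
    void addCancellationListener(Runnable listener) {
        synchronized (this) {
            if (!cancelled) {
                listeners.add(listener);
                return;
            }
        }
        listener.run();
    }

    // Marks the task as cancelled and notifies each registered listener once.
    void cancel() {
        final List<Runnable> toNotify;
        synchronized (this) {
            if (cancelled) {
                return;
            }
            cancelled = true;
            toNotify = new ArrayList<>(listeners);
            listeners.clear();
        }
        toNotify.forEach(Runnable::run);
    }
}

// Hypothetical collector of per-node responses for a fan-out stats action.
class SketchNodeResponseCollector<T> {
    // Set to null once the results have been discarded.
    private List<T> responses = new ArrayList<>();

    SketchNodeResponseCollector(SketchCancellableTask task) {
        // Discard on cancellation instead of waiting for every node to reply.
        task.addCancellationListener(this::discard);
    }

    // Returns false if the response was dropped because the task was cancelled.
    synchronized boolean onNodeResponse(T response) {
        if (responses == null) {
            return false;
        }
        responses.add(response);
        return true;
    }

    // Drops the reference to the accumulated responses so they can be collected.
    synchronized void discard() {
        responses = null;
    }

    synchronized int retainedCount() {
        return responses == null ? 0 : responses.size();
    }
}
```

In this sketch the coordinating node no longer depends on the slowest target node: cancelling the task releases the accumulated responses at once, and any response that arrives afterwards is dropped rather than retained.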

Relates #55550 (comment) which contains a list of some of the more important cases of this to address.
