Improve robustness of monitoring APIs #55550

Closed
@imotov

We have observed cases (#50241, for example) where a slowly responding data node causes ResponseContexts to accumulate for indices:monitor/recovery[n], indices:monitor/stats[n], cluster:monitor/stats[n] and cluster:monitor/xpack/ml/job/stats/get[n], which correspond to _xpack/usage and _nodes/stats calls.

We would like to improve the robustness of the stats and usage calls when a data node responds slowly by

  1. introducing a timeout on the stats and usage APIs, and/or
  2. making the stats and usage API tasks cancellable, and cancelling them if the REST client disconnects
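
To make the two options concrete, here is a minimal, language-agnostic sketch in Python's asyncio. It is not Elasticsearch code: `node_stats`, `collect_stats`, the `delays` map and the `cancelled` list are hypothetical names invented for illustration. The coordinator fans out one task per node, keeps whatever arrives within the timeout, and cancels outstanding per-node work both on timeout and when the coordinating request itself is cancelled (standing in for a REST client disconnect).

```python
import asyncio


async def node_stats(node, delay, cancelled):
    """Hypothetical per-node stats call; `delay` simulates a slow data node."""
    try:
        await asyncio.sleep(delay)
    except asyncio.CancelledError:
        cancelled.append(node)  # record that this node's work was abandoned
        raise
    return {"node": node}


async def collect_stats(delays, timeout, cancelled):
    """Fan out to all nodes and keep only responses that arrive within `timeout`."""
    tasks = {
        node: asyncio.create_task(node_stats(node, delay, cancelled))
        for node, delay in delays.items()
    }
    try:
        done, pending = await asyncio.wait(tasks.values(), timeout=timeout)
    finally:
        # Runs after a timeout and also when this coroutine is cancelled
        # (the stand-in for a client disconnect). Cancelling a task that has
        # already finished is a no-op.
        for task in tasks.values():
            task.cancel()
    # Let the cancelled tasks finish unwinding so nothing is left dangling.
    await asyncio.gather(*pending, return_exceptions=True)
    return {
        "responses": [t.result() for t in tasks.values() if t in done],
        "timed_out": [node for node, t in tasks.items() if t in pending],
    }


if __name__ == "__main__":
    abandoned = []
    print(asyncio.run(collect_stats({"fast": 0.0, "slow": 5.0}, 0.2, abandoned)))
    print("abandoned:", abandoned)
```

In this sketch a slow node costs the caller at most `timeout` seconds, and no per-node work outlives the request that started it. That second property is what would stop response contexts from piling up behind a node that never answers.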
