[Stack Monitoring] /api/v1/monitoring/clusters performance issues #191886

Open

Description

I'm looking at the overview cluster, and /api/v1/monitoring/clusters is a huge bottleneck. It times out even on shorter time ranges. We send a ton of requests to ES, but the bottleneck is mostly CPU in Kibana. The response size is ~30 MB. Some issues I see:

  • For the request that gets the clusters, we use filter_path on source fields instead of _source. This reduces the response size, but ES still has to load the full source for every hit. We should use _source filtering instead.
  • For every cluster, we end up doing three requests. For the overview cluster this means 206 * 3 = 618 search requests in parallel. There is no throttling.
  • We don't use filter_path for any of the other requests. Because _clusters is pretty big, both ES and Kibana end up spending a lot of CPU cycles on JSON serialization and deserialization.
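
One way to address the missing throttling is to cap how many per-cluster requests are in flight at once. The following is a minimal sketch, not the existing Kibana code: `mapWithConcurrency` is a hypothetical helper, and the `task` callback stands in for whatever performs the three searches for a cluster.

```typescript
// Hypothetical helper (not part of Kibana): runs `task` over `items`
// with at most `limit` promises in flight at any time.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next item that has not been started yet

  // Each worker pulls the next unstarted item until none remain.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await task(items[i], i);
    }
  }

  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // same order as `items`
}
```

With a limit of, say, 10, the 618 searches would be issued in a rolling window instead of all at once. The right limit is an assumption to be tuned; if any task rejects, the returned promise rejects as well.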

There are presumably many other improvements we can make. The biggest one, I suspect, is to stop loading all the data up front. E.g., if we just list the clusters and some basic metrics, that should be pretty fast. We also send three requests to this endpoint on page load (one for elasticsearch, two for all).
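
A lightweight cluster-list request along these lines could combine `_source` filtering (so ES only returns the needed fields) with `filter_path` (so the response envelope, including `_clusters`, is dropped). This is a rough sketch under assumptions: `buildClusterListParams` is a hypothetical helper, and the index pattern and field names in the usage example are placeholders, not the ones the endpoint actually uses.

```typescript
// Shape of the search parameters built below. `_source` and `filter_path`
// are standard Elasticsearch search options; everything else is illustrative.
interface ClusterListParams {
  index: string;
  size: number;
  _source: string[];
  filter_path: string[];
}

// Hypothetical helper: builds parameters for a slim "list clusters" search.
function buildClusterListParams(
  indexPattern: string,
  sourceFields: string[],
  size: number
): ClusterListParams {
  return {
    index: indexPattern,
    size,
    // Ask ES to return only these fields from each document's source.
    _source: [...sourceFields],
    // Keep only the hits' source in the response; metadata such as
    // `_clusters` and `_shards` is omitted from the JSON sent to Kibana.
    filter_path: ["hits.hits._source"],
  };
}

// Usage with placeholder values (assumed, not taken from the codebase):
const params = buildClusterListParams(
  "example-monitoring-*",
  ["cluster_uuid", "cluster_name"],
  500
);
```

Detailed per-cluster metrics could then be fetched on demand, for example when a cluster row is expanded, rather than on the initial page load.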
