Task Manager health API - `workload.value.average_interval_ms`

## Problem

Currently, the [Task Manager health API](https://www.elastic.co/guide/en/kibana/master/task-manager-health-monitoring.html) returns statistics about Task Manager's configuration, workload, and runtime performance. The `workload.value.schedule` field returns only the 10 most frequent intervals among the scheduled tasks. It does not cover the intervals of all scheduled tasks, because returning a "bucket" for every single interval would be infeasible:

<img width="587" alt="Screen Shot 2021-04-12 at 1 56 20 PM" src="https://user-images.githubusercontent.com/627123/114461736-f69fce80-9b96-11eb-980b-5fc675d04046.png">

As part of the Kibana autoscaling project, we would like to scale Kibana based on task capacity versus scheduled task load. One of the missing data points for this calculation is the average interval across all scheduled tasks, which can't be inferred from the `workload.value.schedule` field.

## Solution

The Task Manager health API should be updated to return `workload.value.average_interval_ms` to support this autoscaling calculation.

Currently, each task document has a `task.schedule.interval` field; however, this is a `keyword` field and stores the intervals using Elasticsearch's date interval syntax: `10m` for 10 minutes, `100ms` for 100 milliseconds. As a result, it's not possible to use the [Elasticsearch avg aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-avg-aggregation.html) on the `task.schedule.interval` field. Instead, a numeric `task.schedule.interval_ms` field should be added so that the Elasticsearch avg aggregation can run efficiently.
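To illustrate the idea, here is a minimal sketch of how an interval string could be converted to milliseconds when populating `task.schedule.interval_ms`, along with the avg aggregation that could then run on that field. The helper `intervalToMs`, the `UNIT_TO_MS` table, the set of supported units, and the aggregation name `average_interval_ms` are all assumptions made for illustration, not existing Task Manager code.

```typescript
// Hypothetical unit table; the units Task Manager actually accepts may differ.
const UNIT_TO_MS: Record<string, number> = {
  ms: 1,
  s: 1000,
  m: 60 * 1000,
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
};

// Hypothetical helper: converts an interval such as "10m" or "100ms"
// into a number of milliseconds, throwing on anything it cannot parse.
function intervalToMs(interval: string): number {
  const match = /^(\d+)(ms|s|m|h|d)$/.exec(interval);
  if (match === null) {
    throw new Error(`Unsupported interval: ${interval}`);
  }
  return Number(match[1]) * UNIT_TO_MS[match[2]];
}

// Search body using the Elasticsearch avg aggregation on the proposed
// numeric field. `size: 0` skips returning the matching documents.
const averageIntervalSearchBody = {
  size: 0,
  aggs: {
    average_interval_ms: {
      avg: { field: 'task.schedule.interval_ms' },
    },
  },
};
```

With the numeric field stored at write time, the average is computed by Elasticsearch in a single aggregation rather than by parsing every `keyword` value in Kibana.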

Task Manager health API - `workload.value.average_interval_ms` #96893

kobelb opened on Apr 12, 2021
