Improve scalability of Kubernetes module in metricbeat

Context
During my investigation for PR [32539](https://github.com/elastic/beats/pull/32539), I noticed that there might be room for performance improvements in the Kubernetes module of metricbeat.

Each metricbeat instance stores metrics about all the nodes in the Kubernetes cluster, but it only stores metrics about the pods and containers on the node where that instance is running. This replicates how the previous expiring cache worked, but it is now more evident and can have a detrimental effect in clusters with lots of nodes, where we might end up wasting a lot of memory on unused metrics from other nodes. This behaviour is due to how the watcher notifies events from Kubernetes, and it wasn't modified by the aforementioned PR.

A possible solution is for each metricbeat instance to filter out events generated by nodes other than the one where it is running. This should simplify the MetricRepo API, since we would only need to handle events from Pods and Containers and not the deletion of nodes.
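
To make the idea concrete, here is a minimal sketch of node-local filtering. It does not use the real watcher or MetricRepo types: `Event` and `FilterLocal` are hypothetical names introduced only for illustration, and the sketch assumes each event carries the name of the node it belongs to.

```go
package main

// Event is a hypothetical, simplified stand-in for a watcher notification.
// It is not the type used by the Kubernetes module.
type Event struct {
	Kind     string // e.g. "node", "pod" or "container"
	Name     string // name of the resource the event refers to
	NodeName string // node the resource belongs to (assumed to be available)
}

// FilterLocal returns only the events whose NodeName matches localNode,
// preserving their order. Events from every other node are dropped, so
// their metrics never need to be stored.
func FilterLocal(events []Event, localNode string) []Event {
	local := make([]Event, 0, len(events))
	for _, ev := range events {
		if ev.NodeName == localNode {
			local = append(local, ev)
		}
	}
	return local
}
```

With this kind of filter in front of the store, an instance running on `node-a` would only ever see events for `node-a` and the pods and containers scheduled on it.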

During the same investigation, I noticed that when a Pod is deleted, the `update` function is called first (adding its metrics again), and the Pod is only deleted a few seconds later. I am not sure if this is intended, since the status of the pod is already `Terminating`. I also noticed that `deletePod` is executed twice. This might be because there is more than one watcher or because the code is shared between multiple metricsets.
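
If the extra update and the duplicate delete turn out to be unintended, one way to make them harmless is sketched below. This is only an illustration under assumptions: `Pod`, `PodStore`, `Update` and `Delete` are hypothetical names, not the module's actual `update` and `deletePod` functions, and the `Terminating` flag is assumed to be derived from the pod having a deletion timestamp set.

```go
package main

import "sync"

// Pod is a hypothetical, simplified view of a pod as seen by a watcher.
type Pod struct {
	UID         string
	Terminating bool // assumed to reflect a set deletion timestamp
}

// PodStore is a hypothetical metrics store keyed by pod UID.
type PodStore struct {
	mu   sync.Mutex
	pods map[string]Pod
}

// NewPodStore returns an empty store.
func NewPodStore() *PodStore {
	return &PodStore{pods: make(map[string]Pod)}
}

// Update stores the pod unless it is already terminating.
// It reports whether the pod was stored.
func (s *PodStore) Update(p Pod) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if p.Terminating {
		return false // do not re-add metrics for a pod that is going away
	}
	s.pods[p.UID] = p
	return true
}

// Delete removes the pod if present. It reports whether anything was
// removed, so a second call for the same UID is a no-op returning false.
func (s *PodStore) Delete(uid string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.pods[uid]; !ok {
		return false
	}
	delete(s.pods, uid)
	return true
}
```

The mutex is there because the sketch assumes more than one watcher or metricset may call into the same store concurrently, which is one of the possible explanations given above for the double delete.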


Improve scalability of Kubernetes module in metricbeat #32662

gsantoro
opened on Aug 11, 2022
