[metricbeat/kubernetes] Remove fallback to the Node's limit for the pod 'usage.limit.pct' metrics #40029

Merged
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
@@ -52,6 +52,7 @@ https://github.com/elastic/beats/compare/v8.8.1\...main[Check the HEAD diff]

- Setting period for counter cache for Prometheus remote_write at least to 60sec {pull}38553[38553]
- Add support of Graphite series 1.1.0+ tagging extension for statsd module. {pull}39619[39619]
- Remove fallback to the node limit for the `kubernetes.pod.cpu.usage.limit.pct` and `kubernetes.pod.memory.usage.limit.pct` metrics calculation

Contributor

I would say that this should be moved under the Breaking changes section.

Contributor

I see this, so maybe it is a rendering issue of the page. I thought it was only under Metricbeat:
[Screenshot 2024-06-27 at 4:38:35 PM]


*Osquerybeat*

52 changes: 18 additions & 34 deletions metricbeat/module/kubernetes/pod/data.go
@@ -58,6 +58,9 @@ func eventMapping(content []byte, metricsRepo *util.MetricsRepo, logger *logp.Lo
podId := util.NewPodId(pod.PodRef.Namespace, pod.PodRef.Name)
podStore := nodeStore.GetPodStore(podId)

+ allContainersCPULimitsDefined := true
+ allContainersMemoryLimitsDefined := true

for _, container := range pod.Containers {
usageNanoCores += container.CPU.UsageNanoCores
usageMem += container.Memory.UsageBytes
@@ -70,17 +73,22 @@ func eventMapping(content []byte, metricsRepo *util.MetricsRepo, logger *logp.Lo
containerStore := podStore.GetContainerStore(container.Name)
containerMetrics := containerStore.GetContainerMetrics()

- containerCoresLimit := nodeCores
- if containerMetrics.CoresLimit != nil {
- containerCoresLimit = containerMetrics.CoresLimit.Value
+ // podCoreLimit and podMemLimit are defined only if all of Pod containers have a limit defined, otherwise the limit will be set to 0
+ if allContainersCPULimitsDefined && containerMetrics.CoresLimit == nil {
+ allContainersCPULimitsDefined = false
+ podCoreLimit = 0.0
}
+ if allContainersCPULimitsDefined {
+ podCoreLimit += containerMetrics.CoresLimit.Value
+ }

- containerMemLimit := nodeMem
- if containerMetrics.MemoryLimit != nil {
- containerMemLimit = containerMetrics.MemoryLimit.Value
+ if allContainersMemoryLimitsDefined && containerMetrics.MemoryLimit == nil {
+ allContainersMemoryLimitsDefined = false
+ podMemLimit = 0.0
}
+ if allContainersMemoryLimitsDefined {
+ podMemLimit += containerMetrics.MemoryLimit.Value
+ }
- podCoreLimit += containerCoresLimit
- podMemLimit += containerMemLimit
}

podEvent := mapstr.M{
@@ -132,32 +140,7 @@ func eventMapping(content []byte, metricsRepo *util.MetricsRepo, logger *logp.Lo
kubernetes2.ShouldPut(podEvent, "start_time", pod.StartTime, logger)
}

- // NOTE:
- // - `podCoreLimit > `nodeCores` is possible if a pod has more than one container
- // and at least one of them doesn't have a limit set. The container without limits
- // inherit a limit = `nodeCores` and the sum of all limits for all the
- // containers will be > `nodeCores`. In this case we want to cap the
- // value of `podCoreLimit` to `nodeCores`.
- // - `nodeCores` can be 0 if `state_node` and/or `node` metricsets are disabled.
- // - if `nodeCores` == 0 and podCoreLimit > 0` we need to avoid that `podCoreLimit` is
- // incorrectly overridden to 0. That's why we check for `nodeCores > 0`.
- if nodeCores > 0 && podCoreLimit > nodeCores {
- podCoreLimit = nodeCores
- }
-
- // NOTE:
- // - `podMemLimit > `nodeMem` is possible if a pod has more than one container
- // and at least one of them doesn't have a limit set. The container without limits
- // inherit a limit = `nodeMem` and the sum of all limits for all the
- // containers will be > `nodeMem`. In this case we want to cap the
- // value of `podMemLimit` to `nodeMem`.
- // - `nodeMem` can be 0 if `state_node` and/or `node` metricsets are disabled.
- // - if `nodeMem` == 0 and podMemLimit > 0` we need to avoid that `podMemLimit` is
- // incorrectly overridden to 0. That's why we check for `nodeMem > 0`.
- if nodeMem > 0 && podMemLimit > nodeMem {
- podMemLimit = nodeMem
- }

// `nodeCores` can be 0 if `state_node` and/or `node` metricsets are disabled
if nodeCores > 0 {
kubernetes2.ShouldPut(podEvent, "cpu.usage.node.pct", float64(usageNanoCores)/1e9/nodeCores, logger)
}
@@ -167,6 +150,7 @@ func eventMapping(content []byte, metricsRepo *util.MetricsRepo, logger *logp.Lo
}

if usageMem > 0 {

Contributor

Here we don't check if podMemLimit > 0, so this means that memory.usage.node.pct will always be reported for a pod, right?

But what we need is to report it only in the case that podMemLimit == 0.

Otherwise this Kibana UI will always use the node limits.

Contributor Author

> Here we don't check if podMemLimit > 0, so this means that memory.usage.node.pct will always be reported for a pod, right?

No, memory.usage.node.pct will be reported only when nodeMem > 0 (nodeMem can be 0 if the state_node and/or node metricsets are disabled).

> But what we need is to report it only in the case that podMemLimit == 0.

Why? The point is to split memory.usage.node.pct and memory.usage.limit.pct: if usageMem and nodeMem are available, memory.usage.node.pct can be calculated.
Similarly, if all containers have a memory limit defined, memory.usage.limit.pct can be calculated as usageMem/sum(containers_mem_limit).

> Otherwise this Kibana UI will always use the node limits.

I think the Kibana issue should be covered separately in elastic/kibana#184984.

Contributor
@gizas, Jun 28, 2024

Ok, I had a second look here. This change won't break the existing UI, because Kibana checks kubernetes.pod.memory.usage.limit.pct > 0.0. As long as this metric is populated, we don't care about kubernetes.pod.memory.usage.node.pct.

Contributor Author

Hm, it does not seem to work this way:
[Screenshot 2024-07-02 at 00:09:54]

usage.node.pct is > 1% for the selected pod:
[Screenshot 2024-07-02 at 00:11:07]

It seems this script doesn't work as expected.

+ // `nodeMem` can be 0 if `state_node` and/or `node` metricsets are disabled
if nodeMem > 0 {
kubernetes2.ShouldPut(podEvent, "memory.usage.node.pct", float64(usageMem)/nodeMem, logger)
}
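
To make the behavior discussed above easier to follow, here is a small, self-contained Go sketch (illustrative only — the types and function names are invented for this example and are not the actual metricbeat code): the node-based percentage only needs the node capacity, while the limit-based percentage is reported only when every container in the pod defines a limit, using the sum of those limits as the denominator.

```go
package main

import "fmt"

// container holds per-container usage and an optional limit (nil = no limit defined).
type container struct {
	usage float64
	limit *float64
}

// podLimitPct returns usage/sum(limits), and false when any container has no
// limit defined (in that case the limit-based percentage is not reported).
func podLimitPct(containers []container) (float64, bool) {
	var usage, limit float64
	for _, c := range containers {
		usage += c.usage
		if c.limit == nil {
			return 0, false // a single missing limit disables the metric for the whole pod
		}
		limit += *c.limit
	}
	if limit == 0 {
		return 0, false
	}
	return usage / limit, true
}

// podNodePct returns usage/nodeCapacity, and false when the node capacity is
// unknown (it can be 0 if the state_node and/or node metricsets are disabled).
func podNodePct(containers []container, nodeCapacity float64) (float64, bool) {
	if nodeCapacity == 0 {
		return 0, false
	}
	var usage float64
	for _, c := range containers {
		usage += c.usage
	}
	return usage / nodeCapacity, true
}

func main() {
	limit := 14622720.0
	pod := []container{
		{usage: 1462272, limit: &limit}, // container with a memory limit
		{usage: 1462272, limit: nil},    // container without a memory limit
	}

	if pct, ok := podLimitPct(pod); ok {
		fmt.Printf("memory.usage.limit.pct = %v\n", pct)
	} else {
		fmt.Println("memory.usage.limit.pct not reported: not all containers define a limit")
	}

	if pct, ok := podNodePct(pod, 146227200); ok {
		fmt.Printf("memory.usage.node.pct = %v\n", pct) // 2924544 / 146227200 = 0.02
	}
}
```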
31 changes: 20 additions & 11 deletions metricbeat/module/kubernetes/pod/pod_test.go
@@ -20,7 +20,7 @@
package pod

import (
"io/ioutil"
"io"
"os"
"testing"

@@ -67,6 +67,7 @@ func (s *PodTestSuite) SetupTest() {
s.NodeMetrics.MemoryAllocatable = util.NewFloat64Metric(146227200)

s.ContainerMetrics = util.NewContainerMetrics()
+ s.ContainerMetrics.CoresLimit = util.NewFloat64Metric(0.5)
s.ContainerMetrics.MemoryLimit = util.NewFloat64Metric(14622720)

s.AnotherContainerMetrics = util.NewContainerMetrics()
@@ -77,7 +78,7 @@ func (s *PodTestSuite) ReadTestFile(testFile string) []byte {
f, err := os.Open(testFile)
s.NoError(err, "cannot open test file "+testFile)

- body, err := ioutil.ReadAll(f)
+ body, err := io.ReadAll(f)
s.NoError(err, "cannot read test file "+testFile)

return body
@@ -98,7 +99,7 @@ func (s *PodTestSuite) TestEventMapping() {
// calculated pct fields:
"cpu.usage.nanocores": 11263994,
"cpu.usage.node.pct": 0.005631997,
"cpu.usage.limit.pct": 0.005631997,
"cpu.usage.limit.pct": 0.022527988,

"memory.usage.bytes": 1462272,
"memory.usage.node.pct": 0.01,
@@ -124,8 +125,10 @@ func (s *PodTestSuite) TestEventMappingWithZeroNodeMetrics() {

cpuMemoryTestCases := map[string]interface{}{
"cpu.usage.nanocores": 11263994,
"cpu.usage.limit.pct": 0.022527988,

"memory.usage.bytes": 1462272,
"memory.usage.limit.pct": 0.1,
"memory.working_set.limit.pct": 0.09943977591036414,
}

@@ -144,6 +147,7 @@ func (s *PodTestSuite) TestEventMappingWithNoNodeMetrics() {

cpuMemoryTestCases := map[string]interface{}{
"cpu.usage.nanocores": 11263994,
"cpu.usage.limit.pct": 0.022527988,

"memory.usage.bytes": 1462272,
"memory.usage.limit.pct": 0.1,
@@ -153,7 +157,7 @@
s.RunMetricsTests(events[0], cpuMemoryTestCases)
}

- func (s *PodTestSuite) TestEventMappingWithMultipleContainers() {
+ func (s *PodTestSuite) TestEventMappingWithMultipleContainers_NodeAndOneContainerLimits() {
s.MetricsRepo.DeleteAllNodeStore()

s.addNodeMetric(s.NodeMetrics)
@@ -168,18 +172,23 @@ func (s *PodTestSuite) TestEventMappingWithMultipleContainers_NodeAndOneContainerLimits() {
// Following comments explain what is the difference with the test `TestEventMapping`
"cpu.usage.nanocores": 22527988, // 2x usage since 2 container
"cpu.usage.node.pct": 0.011263994, // 2x usage since 2 container
"cpu.usage.limit.pct": 0.011263994, // same value as `cpu.usage.node.pct` since `podCoreLimit` = 2x nodeCores = `nodeCores` (capped value)
// "cpu.usage.limit.pct" is not reported, since AnotherCntainer does not contain CoresLimit

"memory.usage.bytes": 2924544, // 2x since 2 containers
"memory.usage.node.pct": 0.02, // 2x usage since 2 containers
"memory.usage.limit.pct": 0.02, // same value as `cpu.usage.node.pct` since 2 containers but only 1 with limit, podMemLimit = containerMemLimit + nodeLimit > nodeLimit = nodeLimit (capped value)
"memory.working_set.limit.pct": 0.019887955182072828, // similar concept to `memory.usage.limit.pct`. 2x usage but denominator 10x since nodeLimit = 10x containerMemLimit
"memory.usage.bytes": 2924544, // 2x since 2 containers
"memory.usage.node.pct": 0.02, // 2x usage since 2 containers
// "memory.usage.limit.pct" is not reported, since AnotherContainer metrics were not added
// "memory.working_set.limit.pct" is not reported, since AnotherContainer metrics were not added
}

s.RunMetricsTests(events[0], cpuMemoryTestCases)
}

- func (s *PodTestSuite) TestEventMappingWithMultipleContainersWithAllMemLimits() {
+ // Scenario:
+ // Node metrics are defined,
+ // Pod contains 2 containers:
+ // - nginx with both cpu and memory limits defined
+ // - sidecar with memory limit defined
+ func (s *PodTestSuite) TestEventMappingWithMultipleContainers_AllMemLimits() {
s.MetricsRepo.DeleteAllNodeStore()

s.addNodeMetric(s.NodeMetrics)
@@ -195,7 +204,7 @@ func (s *PodTestSuite) TestEventMappingWithMultipleContainersWithAllMemLimits()
// Following comments explain what is the difference with the test `TestEventMapping
"cpu.usage.nanocores": 22527988, // 2x usage since 2 container
"cpu.usage.node.pct": 0.011263994, // 2x usage since 2 container
"cpu.usage.limit.pct": 0.011263994, // same value as `cpu.usage.node.pct` since `podCoreLimit` = 2x nodeCores = `nodeCores` (capped value)
// "cpu.usage.limit.pct" is not reported, since AnotherCntainer does not contain CoresLimit

"memory.usage.bytes": 2924544, // 2x since 2 containers
"memory.usage.node.pct": 0.02, // 2x usage since 2 containers
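
For reference, the updated expectations in the single-container tests above follow directly from the fixture values set in SetupTest (CoresLimit = 0.5, MemoryLimit = 14622720) and the usage values asserted alongside them (11263994 nanocores, 1462272 bytes). A quick sanity check of the arithmetic:

```go
package main

import "fmt"

func main() {
	// cpu.usage.limit.pct = usage nanocores / 1e9 / sum of container core limits
	fmt.Println(11263994.0 / 1e9 / 0.5) // 0.022527988

	// memory.usage.limit.pct = usage bytes / sum of container memory limits
	fmt.Println(1462272.0 / 14622720.0) // 0.1
}
```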