Description
The container_memory_usage_bytes metric is already exported. For Linux containers, this ultimately comes from the cgroupfs memory.usage_in_bytes file. This metric is traditionally documented as being the sum of RSS, cache, and swap memory (memory.stat:rss, memory.stat:cache, and memory.stat:swap respectively), and these individual components are also already exported to Prometheus as container_memory_rss, container_memory_cache, and container_memory_swap.
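As a rough illustration of the traditionally documented breakdown, the sketch below parses text in the `key value` layout of a cgroup v1 memory.stat file and sums the rss, cache, and swap entries. The function names and the sample values are hypothetical and chosen only for illustration; this is not how cadvisor itself reads these files.

```python
# Minimal sketch: sum the three documented components from memory.stat text.
# The helper names and SAMPLE_MEMORY_STAT values are illustrative assumptions.

def parse_memory_stat(text: str) -> dict[str, int]:
    """Parse 'key value' lines (cgroup v1 memory.stat layout) into a dict."""
    stats: dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            stats[parts[0]] = int(parts[1])
    return stats


def documented_usage(stats: dict[str, int]) -> int:
    """Return rss + cache + swap, the traditionally documented sum."""
    return stats.get("rss", 0) + stats.get("cache", 0) + stats.get("swap", 0)


# Made-up sample content, not taken from a real container.
SAMPLE_MEMORY_STAT = """cache 4096
rss 8192
swap 0
mapped_file 1024
"""

if __name__ == "__main__":
    print(documented_usage(parse_memory_stat(SAMPLE_MEMORY_STAT)))
```

On a real host the text would come from the container's memory.stat file under the cgroup v1 memory hierarchy; the exact path depends on the container runtime.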
However, on recent kernels (e.g. 4.4+ and probably earlier) when Memory CGroup Kernel Accounting is enabled (the default), the memory.kmem.usage_in_bytes value is also added to the memory.usage_in_bytes aggregate value.
Kernel memory accounting is enabled for all memory cgroups by default. But
it can be disabled system-wide by passing cgroup.memory=nokmem to the kernel
at boot time. In this case, kernel memory will not be accounted at all.
https://github.com/torvalds/linux/blob/v4.15/Documentation/cgroup-v1/memory.txt#L283
The main "kmem" counter is fed into the main counter, so kmem charges will
also be visible from the user counter.
https://github.com/torvalds/linux/blob/v4.15/Documentation/cgroup-v1/memory.txt#L291-L292
At the moment, querying the sum of the container_memory_rss, container_memory_cache, and container_memory_swap metrics does not give the same result as querying container_memory_usage_bytes alone. The difference is most often attributable almost exactly to the missing kernel memory value (with a small percentage of variation for a small subset of containers).
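The comparison described above can be sketched as follows: subtract the three exported components from the usage total, then check whether the remainder is close to the kernel memory counter. The function names, the 5% tolerance, and the sample numbers are all assumptions made for illustration, not measurements from a real container.

```python
# Minimal sketch: how much of usage_bytes is not covered by rss + cache + swap,
# and whether memory.kmem.usage_in_bytes accounts for that remainder.
# Names, tolerance, and sample values are hypothetical.

def unexplained_bytes(usage: int, rss: int, cache: int, swap: int) -> int:
    """Bytes in the aggregate usage not covered by the three components."""
    return usage - (rss + cache + swap)


def explained_by_kmem(usage: int, rss: int, cache: int, swap: int,
                      kmem: int, tolerance: float = 0.05) -> bool:
    """True if the remainder is within `tolerance` (relative) of kmem."""
    residual = unexplained_bytes(usage, rss, cache, swap)
    if kmem == 0:
        return residual == 0
    return abs(residual - kmem) <= tolerance * kmem


if __name__ == "__main__":
    # Illustrative values only.
    usage, rss, cache, swap, kmem = 20000, 8192, 4096, 0, 7700
    print(unexplained_bytes(usage, rss, cache, swap))
    print(explained_by_kmem(usage, rss, cache, swap, kmem))
```

Without an exported kernel memory metric, only the first function can be evaluated from Prometheus data today; the second needs the memory.kmem.usage_in_bytes value that this issue asks to expose.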
It would be helpful if cAdvisor exposed this kernel memory subcomponent to Prometheus, just as it already does with the rss, cache, and swap subcomponents, so that the breakdown of a container's overall memory usage over time can be better understood.