
Kubernetes pod-level information missing? #225

Closed
autolyticus opened this issue Oct 17, 2022 · 7 comments
Labels
bug Something isn't working

Comments

@autolyticus

autolyticus commented Oct 17, 2022

Bug description

I have deployed Scaphandre using its Helm chart (v0.1.0), and looking at the documentation for the Prometheus exporter, I can see that I am supposed to get additional information and metadata about the pods.

However, I noticed that the scaph_process_power_consumption_microwatts metrics (as exposed by the scaphandre:8080 service on the /metrics path) do not contain this information. Specifically, the container_scheduler label is detected and set correctly to kubernetes, but the kubernetes_pod_name and kubernetes_pod_namespace labels are missing!

scaph_process_power_consumption_microwatts{container_id="cri-containerd-3c5e5e3ab20f1aaa9cdb2f7b1c9d910aeb776f07a03fd22b3fce470ac7191739",exe="tini",cmdline="/usr/bin/tini-w-e143--/opt/kafka/bin/kafka-server-start.sh/tmp/strimzi.properties",container_scheduler="kubernetes",pid="15523"} 0

As you can see above, all it contains is the cmdline for the process and the corresponding containerd container ID, which is quite obscure and difficult to use.

This makes it very difficult to cross-correlate and understand which pod and which namespace is consuming more energy.
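
With those labels present, I would expect to be able to aggregate energy per pod and namespace with a query along these lines (a PromQL sketch, using the label names from the exporter documentation):

    sum by (kubernetes_pod_namespace, kubernetes_pod_name) (
      scaph_process_power_consumption_microwatts
    )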

Am I doing something wrong in my deployment, or is there a gap in my expectation? Is this a bug on Scaphandre's side?

To Reproduce

  1. Deploy the latest Helm chart for Scaphandre into the monitoring namespace (the commands are sketched after this list)
  2. kubectl port-forward -n monitoring svc/scaphandre 3000:8080
  3. Open http://localhost:3000/metrics in a web browser
  4. Notice that the scaph_process_power_consumption_microwatts metrics have the container_scheduler label set to kubernetes but are missing the kubernetes_pod_name label (like the example given above)
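
For reference, steps 1 to 3 correspond roughly to the following commands (a sketch; the release name scaphandre and a local checkout of the chart under helm/scaphandre are assumptions):

    # deploy the chart into the monitoring namespace
    helm install scaphandre helm/scaphandre --namespace monitoring --create-namespace
    # forward the service port locally
    kubectl port-forward -n monitoring svc/scaphandre 3000:8080
    # fetch the metrics (equivalent to opening the URL in a browser)
    curl -s http://localhost:3000/metrics | grep scaph_process_power_consumption_microwatts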

Expected behavior

Expected to see the pod metadata (name, namespace) along with the other information, like cmdline, pid, etc.

Environment

  • Linux distribution: Debian Bullseye
  • Kubernetes: K3S 1.24.6
  • Kernel version (output of uname -r): 5.10.0-18-amd64

Additional context

There are absolutely no error or warning logs on the Scaphandre DaemonSet pods.

@autolyticus added the bug label Oct 17, 2022
@mmadoo
Contributor

mmadoo commented Oct 26, 2022

I do not have this issue on k8s v1.21.8, using the latest Helm chart from the main branch.

@autolyticus
Author

@mmadoo Thanks for getting back regarding this issue. Could you please show me an example of the full metric line for scaph_process_power_consumption_microwatts that should be available on Scaphandre's /metrics endpoint?

@mmadoo
Contributor

mmadoo commented Oct 26, 2022

For instance, I have a line with:
scaph_process_power_consumption_microwatts{container_scheduler="kubernetes",container_id="2579a8513029f0fb26891985e49cf61802e26833d6b04ebaa2ca6191c6fba18a",kubernetes_node_name="workerdcbraindev04",kubernetes_pod_name="kubecost-grafana-6744d99888-4zhmd",pid="2767836",kubernetes_pod_namespace="kubecost",exe="grafana-server",cmdline="grafana-server--homepath=/usr/share/grafana--config=/etc/grafana/grafana.ini--packaging=dockercfg:default.log.mode=consolecfg:default.paths.data=/var/lib/grafanacfg:default.paths.logs=/var/log/grafanacfg:default.paths.plugins=/var/lib/grafana/pluginscfg:default.paths.provisioning=/etc/grafana/provisioning"} 92542

There are also some metrics without kubernetes_pod_namespace, but this is expected, as they do not correspond to Kubernetes pods:
scaph_process_power_consumption_microwatts{container_runtime="containerd",pid="2767337",exe="containerd-shim",cmdline="/usr/bin/containerd-shim-runc-v2-namespacemoby-idddf2449e37f15f388eb68045a0851c0f49d0b554b2683113e951c6f32c9ac4e9-address/run/containerd/containerd.sock"} 0

@autolyticus
Author

@mmadoo Thank you! I have an idea of what the issue might be (it may be related to how K3S works). I am going to try re-deploying K3S with the distro-provided containerd instead of its default embedded containerd.

I will try this and get back.

@autolyticus
Author

@mmadoo It seems to be working when launching K3S with the --docker flag. Thanks a lot for your help!

It looks like Scaphandre (obviously) has some expectations with respect to the underlying Kubernetes distribution. I'd imagine there's no way for Scaphandre to detect K3S's embedded containerd socket to retrieve pod metadata.

So this is definitely not a bug on Scaphandre's side, and is a result of my Kubernetes distribution.

@mickours

I just faced the same issue.

With k3s's defaults, the kubernetes_pod_name, kubernetes_node_name, and kubernetes_pod_namespace labels are not exported.

Also, the Helm chart does not work on the current default k3s version (v1.25), since PodSecurityPolicy has been removed there. See #246

So, for others who want to set up Scaphandre on k3s (an example install command is sketched below):

  • use version 1.24 (or lower)
  • use the --docker flag (thanks @reisub0!)
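
A minimal sketch of such an install, assuming the standard k3s install script (the exact version string is only an example):

    # install k3s v1.24 using Docker instead of the embedded containerd
    curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.24.6+k3s1" sh -s - --docker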

Maybe this should be mentioned in the docs somewhere, but I'm not sure where...

@rossf7
Contributor

rossf7 commented Jan 3, 2023

@mickours @reisub0 If the metrics are present but the pod name and namespace labels are missing, this is likely a problem mapping the PID to its container ID.

This is done using the process's cgroup file, e.g. /proc/1234/cgroup. Unfortunately, the format varies depending on the container runtime and host OS, which will be why the --docker flag helps.

In my testing, I found there were problems with cgroups v2, as the paths in the file are now relative. I noticed this because Ubuntu 20.04 works but 22.04 does not.
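
For illustration, the per-process cgroup file looks roughly like this in each case (a sketch with placeholder pod UIDs and container IDs; the exact layout also depends on the cgroup driver):

    # cgroups v1 (e.g. Ubuntu 20.04): one line per controller,
    # with the container ID at the end of the path
    $ cat /proc/1234/cgroup
    12:pids:/kubepods/besteffort/pod<pod-uid>/<container-id>
    ...

    # cgroups v2 (e.g. Ubuntu 22.04): a single unified entry
    $ cat /proc/1234/cgroup
    0::/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod<pod-uid>.slice/cri-containerd-<container-id>.scope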

@bpetit I tried adjusting the regular expression, but I couldn't get it to work. I'm not certain, but maybe the procfs crate also needs to be upgraded?

#250 has a fix for the problem with k8s 1.25.
