Headlamp cluster metrics are not showing the proper values #2043

mariogkds · 2024-06-16T07:06:51Z

Hello, I am a new user and I really like the project.

I am having some problems with the cluster-wide metrics that are shown on the dashboard:

I am using kube-prometheus-stack to handle Prometheus and Grafana, and prometheus-adapter for the metrics API.

To get Headlamp to show anything at all, I had to add a few settings to the charts' values:

kube-prometheus-stack

    kubelet:
      serviceMonitor:
        metricRelabelings:
          - action: replace
            sourceLabels:
              - node
            targetLabel: instance
    prometheus-node-exporter:
      prometheus:
        monitor:
          attachMetadata:
            node: true
          relabelings:
            - sourceLabels:
                - __meta_kubernetes_endpoint_node_name
              targetLabel: node
              action: replace
              regex: (.+)
              replacement: ${1}
          metricRelabelings:
            - action: replace
              regex: (.*)
              replacement: $1
              sourceLabels:
                - __meta_kubernetes_pod_node_name
              targetLabel: kubernetes_node

prometheus-adapter (the usual configuration for serving the metrics APIs)

      resource:
        cpu:
          containerQuery: |
            sum by (<<.GroupBy>>) (
              rate(container_cpu_usage_seconds_total{container!="",<<.LabelMatchers>>}[3m])
            )
          nodeQuery: |
            sum  by (<<.GroupBy>>) (
              rate(node_cpu_seconds_total{mode!="idle",mode!="iowait",mode!="steal",<<.LabelMatchers>>}[3m])
            )
          resources:
            overrides:
              node:
                resource: node
              namespace:
                resource: namespace
              pod:
                resource: pod
          containerLabel: container
        memory:
          containerQuery: |
            sum by (<<.GroupBy>>) (
              avg_over_time(container_memory_working_set_bytes{container!="",<<.LabelMatchers>>}[3m])
            )
          nodeQuery: |
            sum by (<<.GroupBy>>) (
              avg_over_time(node_memory_MemTotal_bytes{<<.LabelMatchers>>}[3m])
              -
              avg_over_time(node_memory_MemAvailable_bytes{<<.LabelMatchers>>}[3m])
            )
          resources:
            overrides:
              node:
                resource: node
              namespace:
                resource: namespace
              pod:
                resource: pod
          containerLabel: container
        window: 3m
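
For readers following the adapter rules above, here is a minimal sketch of how the `<<.GroupBy>>` and `<<.LabelMatchers>>` placeholders end up in the PromQL that gets sent to Prometheus. The helper `expand_adapter_query` and the node name `worker-1` are hypothetical and only for illustration. The exact matcher syntax the adapter emits is an assumption here (it may use regex matchers when several objects are requested). The expanded query filters on a `node` label, which is presumably why the relabelings above copy the node name into `node` on the node-exporter series.

```python
from typing import Mapping, Sequence

# The memory nodeQuery from the prometheus-adapter values above.
NODE_MEMORY_QUERY = """sum by (<<.GroupBy>>) (
  avg_over_time(node_memory_MemTotal_bytes{<<.LabelMatchers>>}[3m])
  -
  avg_over_time(node_memory_MemAvailable_bytes{<<.LabelMatchers>>}[3m])
)"""


def expand_adapter_query(
    template: str, group_by: Sequence[str], matchers: Mapping[str, str]
) -> str:
    """Hypothetical helper: substitute the two placeholders by plain text replacement.

    Assumption: equality matchers (label="value") joined by commas.
    """
    label_matchers = ",".join(f'{name}="{value}"' for name, value in matchers.items())
    expanded = template.replace("<<.GroupBy>>", ",".join(group_by))
    return expanded.replace("<<.LabelMatchers>>", label_matchers)


# Expand the query for a single, made-up node name.
expanded = expand_adapter_query(NODE_MEMORY_QUERY, ["node"], {"node": "worker-1"})
print(expanded)
```

Pasting the printed query into the Prometheus UI is one way to check whether the value the adapter would serve for a node already differs from what Grafana shows, before Headlamp is involved.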

The CPU values for individual nodes are correct. The memory value is correct as well, but the unit is different:

Is this a Headlamp problem or a Prometheus (that is, my configuration) problem?

Thanks for the help and for the project. Have a nice day.

joaquimrocha · 2024-06-18T13:08:44Z

Hi @mariogkds. Thanks for the report. This looks like a unit conversion issue.
We will take a look.
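
To illustrate what a unit conversion issue can look like here, the sketch below parses Kubernetes quantity strings (the format the metrics API uses for CPU and memory values) into base units and shows the same byte count expressed in MiB and in MB. This is not Headlamp's code. `parse_quantity`, `to_mib`, and `to_mb` are hypothetical names, and the sample values are made up.

```python
from decimal import Decimal

# Binary (power-of-two) suffixes must be checked before the one-letter decimal ones.
BINARY_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30,
    "Ti": 2**40, "Pi": 2**50, "Ei": 2**60,
}
DECIMAL_SUFFIXES = {
    "n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
    "k": Decimal(10**3), "M": Decimal(10**6), "G": Decimal(10**9),
    "T": Decimal(10**12), "P": Decimal(10**15), "E": Decimal(10**18),
}


def parse_quantity(quantity: str) -> Decimal:
    """Convert a quantity such as '20Mi', '250m' or '1500000k' to base units."""
    quantity = quantity.strip()
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return Decimal(quantity[: -len(suffix)]) * factor
    for suffix, factor in DECIMAL_SUFFIXES.items():
        if quantity.endswith(suffix):
            return Decimal(quantity[:-1]) * factor
    # No suffix: a plain number (Decimal also accepts exponent forms like '12e6').
    return Decimal(quantity)


def to_mib(num_bytes: Decimal) -> Decimal:
    """Bytes to mebibytes (2**20 bytes)."""
    return num_bytes / 2**20


def to_mb(num_bytes: Decimal) -> Decimal:
    """Bytes to megabytes (10**6 bytes)."""
    return num_bytes / 10**6


sample = parse_quantity("20Mi")
print(sample, to_mib(sample), to_mb(sample))
```

The point of the sketch is that a label of "MB" on a value computed in MiB (or the reverse) shifts the number by about 5%, and that a lowercase `m` suffix means milli, not mega, so a parser that mixes the two up would be off by a much larger factor.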

sarg3nt · 2024-09-05T20:42:12Z

@joaquimrocha I'm seeing this in the RAM metrics for deployments and pods too. Probably in other places as well?
Grafana and crictl report the values correctly, but Headlamp shows much more.
For example, the Headlamp UI shows the headlamp pod using 40 MB of RAM, but according to Grafana and crictl it is actually using 20.76 MB, so the displayed value is about double.
CPU and network are correct.
Is this going to be fixed soon? It's confusing our users.
Headlamp 0.25.1
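
As a quick sanity check on the numbers above, the sketch below compares the ratio between a displayed value and a reference value against the ratios that common unit mix-ups would produce. `KNOWN_RATIOS` and `explain_ratio` are hypothetical names, and the list of mix-ups is an assumption rather than a complete catalogue.

```python
import math

# Ratios that typical unit confusions would introduce between two readings.
KNOWN_RATIOS = {
    "KiB shown as kB": 2**10 / 10**3,   # 1.024
    "MiB shown as MB": 2**20 / 10**6,   # 1.048576
    "GiB shown as GB": 2**30 / 10**9,   # 1.073741824
    "bits shown as bytes": 8.0,
}


def explain_ratio(shown: float, reference: float, rel_tol: float = 0.01):
    """Return the shown/reference ratio and the names of any mix-ups that match it."""
    ratio = shown / reference
    matches = [
        name
        for name, expected in KNOWN_RATIOS.items()
        if math.isclose(ratio, expected, rel_tol=rel_tol)
    ]
    return ratio, matches


# Values reported in the comment above: 40 MB in Headlamp, 20.76 MB in Grafana/crictl.
ratio, matches = explain_ratio(40.0, 20.76)
print(round(ratio, 2), matches)
```

With the reported values the ratio is roughly 1.93 and none of the listed mix-ups match, so a plain binary-versus-decimal labelling difference would not by itself account for a doubling. That does not rule out another conversion bug or a difference in which series are being summed.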

joaquimrocha · 2024-09-16T10:42:39Z

@sarg3nt Yes, we do want to fix this but haven't had the bandwidth yet. Let me try to get it in our pipeline for the next release.

skoeva · 2024-10-08T15:31:33Z

Hi @mariogkds @sarg3nt , thanks for raising these issues! Would you be able to provide the YAML (with any sensitive data redacted) for the problematic resources? Would be super helpful for testing ^^

joaquimrocha · 2024-10-15T12:08:43Z

Hi @mariogkds and @sarg3nt, we really want to address this issue, but we haven't been able to reproduce it. If you don't mind, please send us some sample YAML based on yours so @skoeva can take a look.

joaquimrocha added the bug (Something isn't working) label Jun 18, 2024

illume added the prometheus (Relating to prometheus and the prometheus plugin) and charts labels Jul 8, 2024

joaquimrocha added this to the v0.26.0 milestone Sep 16, 2024

joaquimrocha assigned skoeva Sep 16, 2024

skoeva linked a pull request Sep 17, 2024 that will close this issue

WIP: Fix cluster metrics unit conversion #2338

Draft
