Fix backend-listen memory alert query (3x metric triple-counting) #5372

Parent: #5371

## Problem

The Grafana alert "Backend-listen Memory usage is high" triple-counts `container_memory_working_set_bytes` because 3 Prometheus kubelet scrape services report identical cAdvisor metrics:

1. `prod-kube-prometheus-stack-kubelet`
2. `dg-prometheus-stack-kubelet` (leftover from DG self-hosted Helm)
3. `prod-omi-kube-prometheus-s-kubelet`

The alert query applies `sum(...) by (pod)` to the numerator (memory usage), so each pod's usage is added up across all 3 sources, while the denominator (limits from `kube-state-metrics`) comes from only 1 source. Result: utilization is inflated ~3x.
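One way to confirm the duplication is to count series per scrape service for the same selector the alert uses. This is a sketch that reuses only the metric, namespace, and `service` label already named in this issue; if the diagnosis holds, it should return three rows with equal counts.

```promql
# Series count per kubelet scrape service in the affected namespace.
# Three rows with matching counts would indicate the same cAdvisor
# metrics are being ingested three times.
count by (service) (
  container_memory_working_set_bytes{namespace="prod-omi-backend", container!="", image!=""}
)
```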

## Fix

### 1. Alert query fix (immediate)

Add `service="prod-kube-prometheus-stack-kubelet"` to the numerator's label selector so only one scrape source is summed:

**Current:**
```promql
sum(container_memory_working_set_bytes{namespace="prod-omi-backend", container!="", image!=""} * on(namespace,pod) group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{...}) by (pod)
```

**Fixed:**
```promql
sum(container_memory_working_set_bytes{namespace="prod-omi-backend", container!="", image!="", service="prod-kube-prometheus-stack-kubelet"} * on(namespace,pod) group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{...}) by (pod)
```

### 2. Deduplicate Prometheus scrapes (follow-up)

Investigate and remove the duplicate kubelet scrape configs:
- `dg-prometheus-stack-kubelet`: likely installed with the Deepgram (DG) self-hosted Helm chart
- `prod-omi-kube-prometheus-s-kubelet`: likely from a second prometheus-stack install

**Note:** This triple-counting likely affects ALL memory/CPU alerts across the cluster, not just backend-listen.
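To gauge how far the duplication reaches, a query along these lines could list every container that is reported by more than one source. It is a sketch that assumes a container normally yields exactly one `container_memory_working_set_bytes` series per scrape source; the same pattern could be tried with the cAdvisor CPU metric `container_cpu_usage_seconds_total`.

```promql
# Containers with more than one series for the same metric, cluster-wide.
# Any result row points at duplicated scrapes for that container.
count by (namespace, pod, container) (
  container_memory_working_set_bytes{container!="", image!=""}
) > 1
```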

## Verification

After the fix, the hottest pod should report ~27% utilization (not ~82%).
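A quick cross-check, sketched with the selectors from this issue, is to divide the unfiltered per-pod sum by the filtered one. While all three kubelet services are still being scraped, a ratio of about 3 per pod would be consistent with the ~82% versus ~27% figures.

```promql
# Ratio of unfiltered to filtered working-set memory per pod.
# Expected to be ~3 as long as all three scrape services are active.
sum by (pod) (
  container_memory_working_set_bytes{namespace="prod-omi-backend", container!="", image!=""}
)
/
sum by (pod) (
  container_memory_working_set_bytes{namespace="prod-omi-backend", container!="", image!="", service="prod-kube-prometheus-stack-kubelet"}
)
```

Once the follow-up deduplication is done, this ratio should drop to 1.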

Driver: @mon-agent
CC: @thaingnguyen
