Commit a23c61d

[REMED-230] Adding template for OOMKilled monitors (#21343)
* Adding template for oom killed monitors
* Adding required link in manifest file
1 parent a73b879 commit a23c61d

2 files changed: 34 additions, 0 deletions

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+{
+  "version": 2,
+  "created_at": "2025-09-15",
+  "last_updated_at": "2025-09-15",
+  "title": "Pod is in an OOMKilled state",
+  "tags": [
+    "integration:kubernetes"
+  ],
+  "description": "The status OOMKilled means that a container was killed because it exceeded memory limits or the node ran out of available memory. This monitor tracks when a pod is in an OOMKilled state for your Kubernetes integration.",
+  "definition": {
+    "message": "pod {{pod_name.name}} is in OOMKilled on {{kube_namespace.name}} \n This could happen for several reasons, for example insufficient memory limits, memory leaks in the application, or the node running out of available memory.",
+    "name": "[Kubernetes] Pod {{pod_name.name}} is OOMKilled on namespace {{kube_namespace.name}}",
+    "options": {
+      "escalation_message": "",
+      "include_tags": true,
+      "locked": false,
+      "new_host_delay": 300,
+      "notify_audit": true,
+      "notify_no_data": false,
+      "renotify_interval": 0,
+      "require_full_window": false,
+      "thresholds": {
+        "critical": 1
+      },
+      "timeout_h": 0
+    },
+    "query": "max(last_10m):default_zero(max:kubernetes_state.container.status_report.count.waiting{reason:oomkilled} by {kube_cluster_name,kube_namespace,pod_name}) >= 1",
+    "tags": [
+      "integration:kubernetes"
+    ],
+    "type": "query alert"
+  }
+}
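To make the alert condition in the "query" field easier to reason about, here is a minimal Python sketch of its shape: per (kube_cluster_name, kube_namespace, pod_name) group, missing points are treated as zero, the maximum over the window is taken, and the result is compared against the critical threshold of 1. This is a simplified model, not Datadog's actual evaluator; the function names and the in-memory series layout are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical group key mirroring the query's "by {...}" clause.
Group = Tuple[str, str, str]  # (kube_cluster_name, kube_namespace, pod_name)


def default_zero(points: List[Optional[float]]) -> List[float]:
    """Replace missing samples (None) with 0, loosely mimicking default_zero()."""
    return [0.0 if p is None else p for p in points]


def alerting_groups(
    series: Dict[Group, List[Optional[float]]],
    critical: float = 1.0,
) -> List[Group]:
    """Return the groups whose window maximum is >= the critical threshold."""
    return sorted(
        group
        for group, points in series.items()
        if max(default_zero(points), default=0.0) >= critical
    )


# Illustrative samples for one 10-minute window (made-up data).
window = {
    ("prod", "web", "api-1"): [None, 1.0, None],
    ("prod", "web", "api-2"): [None, None, None],
}
firing = alerting_groups(window)
```

With "require_full_window" set to false and a ">= 1" comparison on a maximum, a single non-zero sample anywhere in the window is enough for a group to alert in this model.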

kubernetes/manifest.json

Lines changed: 1 addition & 0 deletions
@@ -71,6 +71,7 @@
   "Nodes are unavailable": "assets/monitors/monitor_node_unavailable.json",
   "Pod is in a CrashloopBackOff state": "assets/monitors/monitor_pod_crashloopbackoff.json",
   "Pod is in an ImagePullBackOff state": "assets/monitors/monitor_pod_imagepullbackoff.json",
+  "Pod is in an OOMKilled state": "assets/monitors/monitor_pod_oomkilled.json",
   "Pods are failing": "assets/monitors/monitor_pods_failed_state.json",
   "Pods are restarting": "assets/monitors/monitor_pods_restarting.json",
   "Kubernetes Statefulset Replicas are failing": "assets/monitors/monitor_statefulset_replicas.json"
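The manifest entry links a monitor title to a template file, and in this commit the key matches the template's "title" field. A small consistency check along those lines could look like the sketch below. It assumes the title-to-path mapping has already been extracted from the manifest as a flat dict and that paths are relative to the integration directory; the function name check_monitor_links is hypothetical and not part of any existing tooling.

```python
import json
from pathlib import Path
from typing import Dict, List


def check_monitor_links(monitors: Dict[str, str], integration_dir: Path) -> List[str]:
    """Report manifest entries whose template is missing or has a different title."""
    problems: List[str] = []
    for title, rel_path in monitors.items():
        template = integration_dir / rel_path
        if not template.is_file():
            problems.append(f"{title}: missing file {rel_path}")
            continue
        # The template's "title" is expected to equal the manifest key.
        declared = json.loads(template.read_text(encoding="utf-8")).get("title")
        if declared != title:
            problems.append(f"{title}: template title is {declared!r}")
    return problems
```

An empty list means every listed monitor resolves to a file whose title agrees with the manifest key.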
