Help with deadman switch #3227
-
Looking for some help with setting up a dead man's switch. I'm still wrapping my head around Alertmanager. I've been implementing rules from here if they fit my environment. I'm trying to set up a dead man's switch, but I can't figure out how to invert the alert for this particular one.

Should I be trying to set up a route that discards particular alerts, or should that be done at the receiver? If so, how? I just want to get an alert when this alert isn't firing.

Also, for an always-firing alert, do you just silence it for an incredibly long time? Or can you negate that too somehow? I'd like to have the firing alert count always be zero rather than one.

Thanks in advance.

My setup:

Prometheus rule:
My relevant Alertmanager config:
Replies: 2 comments
-
Should this be asked elsewhere? Or should I just ask better questions? :)
-
By definition, the deadman snitch alert (or Watchdog) needs to be always firing. It's usual to set the severity to something other than critical. See here for instance: https://github.com/prometheus-operator/kube-prometheus/blob/c936a999acdbee7b1134bcf4be230e458d3ed9cd/manifests/kubePrometheus-prometheusRule.yaml#L27-L40

You can also find an example of route + receiver configuration at https://github.com/gouthamve/deadman
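
To make that a bit more concrete, here is a rough sketch of what the two pieces could look like. It is not copied from either link above. The group name, the `severity: none` value, and the annotation text are assumptions chosen for illustration. The rule simply evaluates `vector(1)`, which always returns a sample, so the alert fires for as long as Prometheus is evaluating rules:

```yaml
# Prometheus rule file (sketch; names and label values are illustrative)
groups:
  - name: meta
    rules:
      - alert: Watchdog
        # vector(1) always returns one sample, so this alert never resolves
        expr: vector(1)
        labels:
          # assumed value; anything other than "critical" keeps it out of paging routes
          severity: none
        annotations:
          summary: Always-firing alert used to verify the alerting pipeline end to end.
```

On the Alertmanager side, the idea is to give this one alert its own route and receiver so it is re-sent at a short, regular interval and never reaches your normal notification channels. As far as I know, Alertmanager itself cannot notify on the *absence* of an alert, so the "inversion" has to happen in whatever receives the heartbeat: that service raises the alarm when the heartbeat stops arriving. The receiver names, the webhook URL, and the intervals below are all hypothetical placeholders, and the `matchers` syntax assumes a reasonably recent Alertmanager (older versions use `match` instead):

```yaml
# Alertmanager config (sketch; receiver names, URL, and intervals are placeholders)
route:
  receiver: default
  routes:
    # Send only the Watchdog alert to the heartbeat receiver
    - matchers:
        - alertname = "Watchdog"
      receiver: deadman
      group_wait: 0s
      group_interval: 1m
      # Kept below group_interval so the heartbeat goes out on every flush
      repeat_interval: 50s

receivers:
  - name: default
  - name: deadman
    webhook_configs:
      # Hypothetical endpoint of an external dead man's switch service
      - url: http://deadman.example.internal:9095/
        send_resolved: false
```

With a layout like this, the external service needs to be configured to expect a heartbeat slightly less often than it is sent (for example, alarm after a few missed minutes) to avoid false alarms from a single delayed notification.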