Configure an email alert notification for all Prometheus Postgres replication alerts: PostgresqlReplicationLagSMA, PostgresqlReplicationLagServices, PostgresqlFollowerReplicationLagSMA, and PostgresqlFollowerReplicationLagServices.
This procedure can be performed on any master or worker NCN.
Save the current alert notification configuration, in case a rollback is needed.

    kubectl get secret -n sysmgmt-health alertmanager-cray-sysmgmt-health-promet-alertmanager \
        -ojsonpath='{.data.alertmanager.yaml}' | base64 --decode > /tmp/alertmanager-default.yaml
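Before changing anything, it may be worth confirming that the backup is usable. The following is a minimal sketch, not part of the original procedure: `check_backup` is a hypothetical helper, and it assumes the saved configuration has top-level `route:` and `receivers:` keys, as the example configuration later in this procedure does.

```shell
# Hypothetical helper: sanity-check the saved Alertmanager configuration.
# Assumption: a usable backup has top-level "route:" and "receivers:" keys.
check_backup() {
    local file="$1"
    # -s is true only when the file exists and is not empty.
    if [ ! -s "$file" ]; then
        echo "backup $file is missing or empty" >&2
        return 1
    fi
    if grep -q '^route:' "$file" && grep -q '^receivers:' "$file"; then
        return 0
    fi
    echo "backup $file does not look like an Alertmanager configuration" >&2
    return 1
}
```

For example, `check_backup /tmp/alertmanager-default.yaml` returns a non-zero status if the file is empty or either key is absent.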
Create a secret and an alert configuration that will be used to add email notifications for the alerts.
Create the secret file.

Create a file named /tmp/alertmanager-secret.yaml with the following contents:

    apiVersion: v1
    data:
      alertmanager.yaml: ALERTMANAGER_CONFIG
    kind: Secret
    metadata:
      labels:
        app: prometheus-operator-alertmanager
        chart: prometheus-operator-8.15.4
        heritage: Tiller
        release: cray-sysmgmt-health
      name: alertmanager-cray-sysmgmt-health-promet-alertmanager
      namespace: sysmgmt-health
    type: Opaque
Create the alert configuration file.

The following example relays the notification through the Gmail SMTP server to the recipient address. Update the fields under email_configs: to reflect the desired configuration.

Create a file named /tmp/alertmanager-new.yaml with the following contents:

    global:
      resolve_timeout: 5m
    route:
      group_by:
      - job
      group_interval: 5m
      group_wait: 30s
      receiver: "null"
      repeat_interval: 12h
      routes:
      - match:
          alertname: Watchdog
        receiver: "null"
      - match:
          alertname: PostgresqlReplicationLagSMA
        receiver: email-alert
      - match:
          alertname: PostgresqlReplicationLagServices
        receiver: email-alert
      - match:
          alertname: PostgresqlFollowerReplicationLagSMA
        receiver: email-alert
      - match:
          alertname: PostgresqlFollowerReplicationLagServices
        receiver: email-alert
    receivers:
    - name: "null"
    - name: email-alert
      email_configs:
      - to:
        from:
        # Your smtp server address
        smarthost:
        auth_username:
        auth_identity:
        auth_password: xxxxxxxxxxxxxxxx
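Because the example leaves several email_configs: fields to be filled in, a quick check for keys that still have no value can catch an incomplete file before it is applied. The following is a minimal sketch under that assumption: `check_email_fields` is a hypothetical helper, and it only looks for the six keys shown in the example, not for every setting Alertmanager accepts.

```shell
# Hypothetical helper: list email_configs keys that still have no value.
# Returns 0 when each of the six example keys present in the file has a
# value, and 1 (after printing the offending lines) otherwise.
check_email_fields() {
    local file="$1"
    # Match lines such as "  - to:" or "    smarthost:" with nothing after the colon.
    if grep -nE '^[[:space:]-]*(to|from|smarthost|auth_username|auth_identity|auth_password):[[:space:]]*$' "$file"; then
        echo "Fill in the fields listed above before applying the configuration." >&2
        return 1
    fi
    return 0
}
```

For example, `check_email_fields /tmp/alertmanager-new.yaml` prints each unfilled line with its line number.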
Replace the alert notification configuration based on the files created in the previous steps. The sed expression uses | as its delimiter because base64 output can contain the / character.

    sed "s|ALERTMANAGER_CONFIG|$(cat /tmp/alertmanager-new.yaml \
        | base64 -w0)|g" /tmp/alertmanager-secret.yaml \
        | kubectl replace --force -f -
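The substitution can also be rendered to a file and checked before it is sent to the cluster. The following is a minimal sketch rather than part of the original procedure: `render_secret` and `verify_rendered` are hypothetical helpers, and the sketch assumes GNU coreutils `base64` (for `-w0`) and a template that contains the `ALERTMANAGER_CONFIG` placeholder on the `alertmanager.yaml:` line.

```shell
# Hypothetical helper: print the secret manifest with the placeholder
# replaced by the base64-encoded configuration file.
render_secret() {
    local template="$1" config="$2" encoded
    # -w0 keeps the encoded value on a single line (GNU coreutils).
    encoded="$(base64 -w0 < "$config")" || return 1
    # "|" never appears in base64 output, so it is a safe sed delimiter.
    sed "s|ALERTMANAGER_CONFIG|${encoded}|g" "$template"
}

# Hypothetical helper: decode the value in a rendered manifest and confirm
# it is byte-for-byte identical to the configuration file.
verify_rendered() {
    local manifest="$1" config="$2"
    sed -n 's/^ *alertmanager\.yaml: *//p' "$manifest" \
        | base64 --decode \
        | cmp -s - "$config"
}
```

A possible use is to render to a temporary file, verify it, and only then apply it:

    render_secret /tmp/alertmanager-secret.yaml /tmp/alertmanager-new.yaml > /tmp/alertmanager-rendered.yaml
    verify_rendered /tmp/alertmanager-rendered.yaml /tmp/alertmanager-new.yaml \
        && kubectl replace --force -f /tmp/alertmanager-rendered.yaml

The same helpers could be used for a rollback by passing /tmp/alertmanager-default.yaml as the configuration file.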
Validate the configuration changes.
View the current configuration.

    kubectl exec alertmanager-cray-sysmgmt-health-promet-alertmanager-0 \
        -n sysmgmt-health -c alertmanager -- cat /etc/alertmanager/config/alertmanager.yaml
If the configuration does not look accurate, check the logs for errors.

    kubectl logs -f -n sysmgmt-health pod/alertmanager-cray-sysmgmt-health-promet-alertmanager-0 alertmanager
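Rather than comparing the running configuration by eye, it can be diffed against the file that was applied. The following is a minimal sketch: `config_matches` is a hypothetical helper, and it assumes the pod mounts the secret content unchanged. It may take a short time after the secret is replaced before the mounted file reflects the change.

```shell
# Hypothetical helper: compare a configuration read from stdin with a local file.
# Prints a unified diff and returns a non-zero status when the two differ.
config_matches() {
    local expected="$1"
    diff -u "$expected" -
}
```

For example, pipe the output of the view command into the helper:

    kubectl exec alertmanager-cray-sysmgmt-health-promet-alertmanager-0 \
        -n sysmgmt-health -c alertmanager -- cat /etc/alertmanager/config/alertmanager.yaml \
        | config_matches /tmp/alertmanager-new.yaml

No output and a zero exit status indicate that the running configuration matches the file.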
An email notification will be sent once any of the alerts set in this procedure is FIRING in Prometheus. See https://prometheus.SYSTEM-NAME.SITE-DOMAIN/alerts for more information.
If an alert is received, then refer to Troubleshoot Postgres Database for more information about recovering replication.