
Webhook Notifier 429 Handling #2121

@justinwatkinson

What did you do?

Set up a webhook receiver that accepts Alertmanager payloads and forwards them to a third-party system.

What did you expect to see?

When the destination webhook returns a 429 error, the retry logic with back-off should be applied.

What did you see instead? Under which circumstances?

A 429 is treated the same as a 404 or any other client error, so it is not retried with back-off. Attempts are spaced further apart instead (about 10 seconds in my local tests).

Environment

  • System information:

Linux 4.15.0-54-generic x86_64

  • Alertmanager version:

Master (89db909) & 0.19.0

  • Prometheus version:

N/A

  • Alertmanager configuration file:
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'
receivers:
- name: 'web.hook'
  webhook_configs:
  - url: 'http://127.0.0.1:9001/'

  • Prometheus configuration file:
    N/A

  • Logs:

level=error ts=2019-11-30T03:20:43.832Z caller=notify.go:372 component=dispatcher msg="Error on notify" err="unexpected status code 429: http://127.0.0.1:9001/" context_err="context deadline exceeded"
level=error ts=2019-11-30T03:20:43.832Z caller=dispatch.go:301 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="unexpected status code 429: http://127.0.0.1:9001/"
level=error ts=2019-11-30T03:20:53.833Z caller=notify.go:372 component=dispatcher msg="Error on notify" err="Post http://127.0.0.1:9001/: dial tcp 127.0.0.1:9001: connect: connection refused" context_err="context deadline exceeded"
level=error ts=2019-11-30T03:20:53.833Z caller=dispatch.go:301 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="Post http://127.0.0.1:9001/: dial tcp 127.0.0.1:9001: connect: connection refused"

Sample 429 Code

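package main

import (
	"log"
	"net/http"
)

// handler logs every incoming request and always answers 429 Too Many Requests.
func handler(rw http.ResponseWriter, r *http.Request) {
	log.Println("Received call")
	rw.WriteHeader(http.StatusTooManyRequests)
}

func main() {
	http.HandleFunc("/", handler)
	// Listen on the port used in the webhook_configs URL and report startup failures.
	log.Fatal(http.ListenAndServe(":9001", nil))
}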

Sample Logs from http server

2019/11/29 21:19:34 Received call
2019/11/29 21:19:44 Received call
2019/11/29 21:19:54 Received call

Sample logs after adding 429 to the retried status codes (handled the same as 5xx):

2019/11/29 21:20:34 Received call
2019/11/29 21:20:34 Received call
2019/11/29 21:20:35 Received call
2019/11/29 21:20:37 Received call
2019/11/29 21:20:39 Received call
2019/11/29 21:20:42 Received call
2019/11/29 21:20:43 Received call
2019/11/29 21:20:44 Received call
2019/11/29 21:20:45 Received call
2019/11/29 21:20:46 Received call
2019/11/29 21:20:48 Received call
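The change tested above amounts to classifying 429 as retryable alongside 5xx. A minimal sketch of such a check follows; `shouldRetry` is a hypothetical helper written for this issue, not the function Alertmanager actually uses.

```go
package main

import (
	"fmt"
	"net/http"
)

// shouldRetry is a hypothetical helper (not Alertmanager's actual code).
// It reports whether a webhook response status is worth retrying:
// any 5xx server error, plus 429 Too Many Requests.
func shouldRetry(statusCode int) bool {
	if statusCode/100 == 5 {
		return true
	}
	return statusCode == http.StatusTooManyRequests
}

func main() {
	for _, code := range []int{200, 404, 429, 500, 503} {
		fmt.Printf("%d retry=%t\n", code, shouldRetry(code))
	}
}
```

Other 4xx codes such as 404 stay non-retryable, since repeating the same request is unlikely to change the outcome.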
