
Error sending alert: bad response status 422 Unprocessable Entity #6053

Open
@mousimin

Describe the bug
We run Cortex in microservices mode. On v1.16.0 we used the v1 Alertmanager API by setting the flag -ruler.alertmanager-use-v2=false. After upgrading Cortex to v1.17.1, the logs show that the ruler now uses the v2 Alertmanager API. When I create alert rules, the alerts fire, but we receive no email notifications, and the ruler logs errors like:
caller=notifier.go:544 level=error user=Test alertmanager=https://cortex-alertmanager.org/alertmanager/api/v2/alerts count=1 msg="Error sending alert" err="bad response status 422 Unprocessable Entity"
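
To see the response body behind the 422, the same request can be sent to the Alertmanager v2 endpoint by hand. The following is a minimal sketch, not the ruler's actual notifier code: the function names, the alert labels, and the `generatorURL` placeholder are invented for illustration, and it assumes the tenant is passed in the `X-Scope-OrgID` header because `-auth.enabled=true` is set.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone


def build_alerts(now=None):
    """Return a minimal v2 payload: a JSON array with one alert object."""
    now = now or datetime.now(timezone.utc)
    return [{
        "labels": {"alertname": "ManualTest", "severity": "info"},
        "annotations": {"summary": "manual test alert"},
        "startsAt": now.isoformat(),
        # Placeholder value chosen for this sketch only.
        "generatorURL": "https://example.invalid/graph",
    }]


def post_alerts(base_url, tenant, alerts, timeout=10):
    """POST the alerts and return (status, body), including for 4xx/5xx."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Scope-OrgID": tenant},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # The error body may describe which field failed validation.
        return err.code, err.read().decode("utf-8", "replace")
```

For example, `post_alerts("https://staging-cortex-alertmanager.org/alertmanager", "Test", build_alerts())` would target the URL from the ruler flags with the tenant from the log line. If this payload is accepted while the ruler's is rejected, comparing the two may narrow down the offending field.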

To Reproduce
Steps to reproduce the behavior:

  1. Start Cortex v1.17.1 in microservices mode.
  2. Create an alert rule and watch the ruler logs.

Expected behavior
We should receive the notifications, and no error should be logged.

Environment:

  • Infrastructure: bare-metal
  • Deployment tool: Ansible, deploying the Cortex microservices as systemd services

Additional Context
Command line (systemd ExecStart) for the Cortex ruler:

ExecStart=/usr/sbin/cortex-1.17.1 \
  -auth.enabled=true \
  -log.level=info \
  -config.file=/etc/cortex-ruler/cortex-ruler.yaml \
  -runtime-config.file=/etc/cortex-shared/cortex-runtime.yaml \
  -server.http-listen-port=8061 \
  -server.grpc-listen-port=9061 \
  -server.grpc-max-recv-msg-size-bytes=104857600 \
  -server.grpc-max-send-msg-size-bytes=104857600 \
  -server.grpc-max-concurrent-streams=1000 \
  \
  -distributor.sharding-strategy=shuffle-sharding \
  -distributor.ingestion-tenant-shard-size=12 \
  -distributor.replication-factor=2 \
  -distributor.shard-by-all-labels=true \
  -distributor.zone-awareness-enabled=true \
  \
  -store.engine=blocks \
  -blocks-storage.backend=s3 \
  -blocks-storage.s3.endpoint=s3.org:10444 \
  -blocks-storage.s3.bucket-name=staging-metrics \
  -blocks-storage.s3.insecure=false \
  \
  -blocks-storage.bucket-store.sync-dir=/local/cortex-ruler/tsdb-sync \
  -blocks-storage.bucket-store.metadata-cache.backend=memcached \
  -blocks-storage.bucket-store.metadata-cache.memcached.addresses=100.76.51.1:11211,100.76.51.2:11211,100.76.51.3:11211 \
  \
  -querier.active-query-tracker-dir=/local/cortex-ruler/active-query-tracker \
  -querier.ingester-streaming=true \
  -querier.query-store-after=23h \
  -querier.query-ingesters-within=24h \
  -querier.shuffle-sharding-ingesters-lookback-period=25h \
  \
  -store-gateway.sharding-enabled=true \
  -store-gateway.sharding-strategy=shuffle-sharding \
  -store-gateway.tenant-shard-size=6 \
  -store-gateway.sharding-ring.store=etcd \
  -store-gateway.sharding-ring.etcd.endpoints=10.120.121.1:2379 \
  -store-gateway.sharding-ring.etcd.endpoints=10.120.121.2:2379 \
  -store-gateway.sharding-ring.etcd.endpoints=10.120.121.3:2379 \
  -store-gateway.sharding-ring.etcd.endpoints=10.120.121.4:2379 \
  -store-gateway.sharding-ring.etcd.endpoints=10.120.121.5:2379 \
  -store-gateway.sharding-ring.prefix=cortex-store-gateways/ \
  -store-gateway.sharding-ring.replication-factor=2 \
  -store-gateway.sharding-ring.zone-awareness-enabled=true \
  -store-gateway.sharding-ring.instance-availability-zone=t1 \
  -store-gateway.sharding-ring.wait-stability-min-duration=1m \
  -store-gateway.sharding-ring.wait-stability-max-duration=5m \
  -store-gateway.sharding-ring.instance-addr=100.76.75.1 \
  -store-gateway.sharding-ring.instance-id=s_8061 \
  -store-gateway.sharding-ring.heartbeat-period=15s \
  -store-gateway.sharding-ring.heartbeat-timeout=1m \
  \
  -ring.store=etcd \
  -ring.prefix=cortex-ingesters/ \
  -ring.heartbeat-timeout=1m \
  -etcd.endpoints=10.120.119.1:2379 \
  -etcd.endpoints=10.120.119.2:2379 \
  -etcd.endpoints=10.120.119.3:2379 \
  -etcd.endpoints=10.120.119.4:2379 \
  -etcd.endpoints=10.120.119.5:2379 \
  \
  -ruler.enable-sharding=true \
  -ruler.sharding-strategy=shuffle-sharding \
  -ruler.tenant-shard-size=2 \
  -ruler.ring.store=etcd \
  -ruler.ring.prefix=cortex-rulers/ \
  -ruler.ring.num-tokens=32 \
  -ruler.ring.heartbeat-period=15s \
  -ruler.ring.heartbeat-timeout=1m \
  -ruler.ring.etcd.endpoints=10.120.119.1:2379 \
  -ruler.ring.etcd.endpoints=10.120.119.2:2379 \
  -ruler.ring.etcd.endpoints=10.120.119.3:2379 \
  -ruler.ring.etcd.endpoints=10.120.119.4:2379 \
  -ruler.ring.etcd.endpoints=10.120.119.5:2379 \
  -ruler.ring.instance-id=s_8061 \
  -ruler.ring.instance-interface-names=e1 \
  \
  -ruler.max-rules-per-rule-group=500 \
  -ruler.max-rule-groups-per-tenant=5000 \
  \
  -ruler.external.url=staging-cortex-ruler.org \
  -ruler.client.grpc-max-recv-msg-size=104857600 \
  -ruler.client.grpc-max-send-msg-size=16777216 \
  -ruler.client.grpc-compression= \
  -ruler.client.grpc-client-rate-limit=0 \
  -ruler.client.grpc-client-rate-limit-burst=0 \
  -ruler.client.backoff-on-ratelimits=false \
  -ruler.client.backoff-min-period=500ms \
  -ruler.client.backoff-max-period=10s \
  -ruler.client.backoff-retries=5 \
  -ruler.evaluation-interval=15s \
  -ruler.poll-interval=15s \
  -ruler.rule-path=/local/cortex-ruler/rules \
  -ruler.alertmanager-url=https://staging-cortex-alertmanager.org/alertmanager \
  -ruler.alertmanager-discovery=false \
  -ruler.alertmanager-refresh-interval=1m \
  -ruler.notification-queue-capacity=10000 \
  -ruler.notification-timeout=10s \
  -ruler.flush-period=1m \
  -experimental.ruler.enable-api=true \
  \
  -ruler-storage.backend=s3 \
  -ruler-storage.s3.endpoint=s3.org:10444 \
  -ruler-storage.s3.bucket-name=staging-rules \
  -ruler-storage.s3.insecure=false \
  \
  -target=ruler
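
In the ruler flags above, `-ruler.external.url` is set without a scheme, while `-ruler.alertmanager-url` has one. Whether this is related to the 422 is not established in this report; it is only one thing that may be worth ruling out. The sketch below (the helper name `flags_without_scheme` is made up for illustration) flags URL-valued settings that lack an `http` or `https` scheme.

```python
from urllib.parse import urlparse

# Values copied from the ruler flags above.
URL_FLAGS = {
    "-ruler.external.url": "staging-cortex-ruler.org",
    "-ruler.alertmanager-url": "https://staging-cortex-alertmanager.org/alertmanager",
}


def flags_without_scheme(flags):
    """Return the names of flags whose value has no http/https scheme."""
    return sorted(
        name
        for name, value in flags.items()
        if urlparse(value).scheme not in ("http", "https")
    )


print(flags_without_scheme(URL_FLAGS))  # prints ['-ruler.external.url']
```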

Command line (systemd ExecStart) for the Cortex alertmanager:

ExecStart=/usr/sbin/cortex-1.17.1 \
  -auth.enabled=true \
  -log.level=info \
  -config.file=/etc/cortex-alertmanager-8071/cortex-alertmanager.yaml \
  -runtime-config.file=/etc/cortex-shared/cortex-runtime.yaml \
  -server.http-listen-port=8071 \
  -server.grpc-listen-port=9071 \
  -server.grpc-max-recv-msg-size-bytes=104857600 \
  -server.grpc-max-send-msg-size-bytes=104857600 \
  -server.grpc-max-concurrent-streams=1000 \
  \
  -alertmanager.storage.path=/local/cortex-alertmanager-8071/data \
  -alertmanager.storage.retention=120h \
  -alertmanager.web.external-url=https://staging-cortex-alertmanager.org/alertmanager \
  -alertmanager.configs.poll-interval=1m \
  -experimental.alertmanager.enable-api=true \
  \
  -alertmanager.sharding-enabled=true \
  -alertmanager.sharding-ring.store=etcd \
  -alertmanager.sharding-ring.prefix=cortex-alertmanagers/ \
  -alertmanager.sharding-ring.heartbeat-period=15s \
  -alertmanager.sharding-ring.heartbeat-timeout=1m \
  -alertmanager.sharding-ring.etcd.endpoints=10.120.121.1:2379 \
  -alertmanager.sharding-ring.etcd.endpoints=10.120.121.2:2379 \
  -alertmanager.sharding-ring.etcd.endpoints=10.120.121.3:2379 \
  -alertmanager.sharding-ring.etcd.endpoints=10.120.121.4:2379 \
  -alertmanager.sharding-ring.etcd.endpoints=10.120.121.5:2379 \
  -alertmanager.sharding-ring.instance-id=b_8071 \
  -alertmanager.sharding-ring.instance-interface-names=e1 \
  -alertmanager.sharding-ring.replication-factor=2 \
  -alertmanager.sharding-ring.zone-awareness-enabled=true \
  -alertmanager.sharding-ring.instance-availability-zone=t1 \
  \
  -alertmanager-storage.backend=s3 \
  -alertmanager-storage.s3.endpoint=s3.org:10444 \
  -alertmanager-storage.s3.bucket-name=staging-alerts \
  -alertmanager-storage.s3.insecure=false \
  \
  -alertmanager.receivers-firewall-block-cidr-networks=10.163.131.164/28,10.163.131.180/28 \
  -alertmanager.receivers-firewall-block-private-addresses=true \
  -alertmanager.notification-rate-limit=0 \
  -alertmanager.max-config-size-bytes=0 \
  -alertmanager.max-templates-count=0 \
  -alertmanager.max-template-size-bytes=0 \
  \
  -target=alertmanager

The Alertmanager configuration for the tenant:

template_files:
  default_template: |
    {{ define "__alertmanager" }}AlertManager{{ end }}
    {{ define "__alertmanagerURL" }}{{ .ExternalURL }}/#/alerts?receiver={{ .Receiver | urlquery }}{{ end }}
alertmanager_config: |
  global:
    smtp_smarthost: 'yourmailhost'
    smtp_from: 'youraddress'
    smtp_require_tls: false
  templates:
    - 'default_template'
  route:
    receiver: example-email
  receivers:
    - name: example-email
      email_configs:
      - to: 'youraddress'
