Exponential backoff on failed Elasticsearch node configuration for X-pack monitoring #7966

Closed
@ppf2

Description

Currently, when multiple hosts (https://www.elastic.co/guide/en/beats/filebeat/6.3/configuration-monitor.html#_literal_hosts_literal_3) are specified for X-Pack monitoring and 1 host fails, the Beat continues to round-robin requests across the remaining host(s). For example, with 2 hosts in the array, if the 1st host goes down, all monitoring requests go to the 2nd host until the 1st host returns. During this time, the Beat keeps performing health checks against the failed host:

2018-08-14T14:27:44.022-0700	ERROR	pipeline/output.go:74	Failed to connect: X-Pack capabilities query failed with: Get http://localhost:9201/_xpack?filter_path=features.monitoring.enabled: dial tcp [::1]:9201: getsockopt: connection refused

This occurs every 10s until the host returns. In a network with many Beats clients, the failure of 1 host in the monitoring hosts array can generate a lot of additional traffic, proportional to the number of Beats running in the network. Given this, we may want to consider implementing exponential backoff for these health checks against a failed host.
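
To make the request concrete, here is a minimal sketch of what such a backoff schedule could look like. It is not Beats code: the function names, the 300s cap, and the optional jitter are all assumptions made for illustration. Only the 10s base interval comes from the behavior described above.

```python
import random


def backoff_delay(failures, base=10.0, cap=300.0, jitter=0.0, rng=random):
    """Seconds to wait before the next health check of a failed host.

    failures: consecutive failed checks so far (1 = first failure).
    base:     current fixed interval from this issue (10s).
    cap:      hypothetical upper bound on the delay (assumption).
    jitter:   optional +/- fraction to spread out many clients (assumption).
    """
    if failures < 1:
        raise ValueError("failures must be >= 1")
    # Bound the exponent so very long outages cannot overflow the float.
    exponent = min(failures - 1, 32)
    delay = min(cap, base * 2 ** exponent)
    if jitter:
        delay *= 1 + rng.uniform(-jitter, jitter)
    return delay


def checks_within(window, base=10.0, cap=300.0):
    """Count health checks sent to a host that stays down for `window` seconds."""
    elapsed, count = 0.0, 0
    while True:
        elapsed += backoff_delay(count + 1, base=base, cap=cap)
        if elapsed > window:
            return count
        count += 1


schedule = [backoff_delay(n) for n in range(1, 8)]
print(schedule)                      # delays double from 10s up to the cap
print(checks_within(3600))           # checks per client per hour with backoff
print(checks_within(3600, cap=10))   # cap == base reproduces the fixed 10s interval
```

Under these assumed parameters, a client whose monitoring host stays down for an hour would send 15 health checks instead of 360. Resetting the failure counter after the first successful check would restore the normal interval as soon as the host returns.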