Description
Currently, if multiple hosts (https://www.elastic.co/guide/en/beats/filebeat/6.3/configuration-monitor.html#_literal_hosts_literal_3) are specified for X-Pack monitoring and 1 host fails, the Beat continues to round-robin requests across the remaining host(s). For example, with 2 hosts in the array, if 1 host goes down, all monitoring requests are sent to the 2nd host until the 1st host returns. During this time, the Beat keeps performing health checks against the failed host:
2018-08-14T14:27:44.022-0700 ERROR pipeline/output.go:74 Failed to connect: X-Pack capabilities query failed with: Get http://localhost:9201/_xpack?filter_path=features.monitoring.enabled: dial tcp [::1]:9201: getsockopt: connection refused
This occurs every 10s until the host returns. In a network with many Beats clients, the failure of 1 host in the monitoring hosts array can result in a lot of additional packets, proportional to the number of Beats running in the network. Given this, we may want to consider implementing exponential backoff for these health checks against a failed host.