SnifferBuilder.sniffOnFailureListener() not honored #27697

Closed
@eranya

Description

Elasticsearch version: 5.6.4

Plugins installed: []

JVM version: 1.8

OS version: Ubuntu

Description of the problem including expected versus actual behavior:
Expected: after a node failure, the next sniff round is scheduled using the delay supplied to SnifferBuilder.setSniffAfterFailureDelayMillis.
Actual: that delay is not honored. The sniffer keeps using the interval configured via SnifferBuilder.setSniffIntervalMillis.

Steps to reproduce:

Use the following code, which was taken from the Elastic documentation:

SniffOnFailureListener sniffOnFailureListener = new SniffOnFailureListener();
RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200))
    .setFailureListener(sniffOnFailureListener) 
    .build();
Sniffer sniffer = Sniffer.builder(restClient)
    .setSniffAfterFailureDelayMillis(30000) 
    .build();
sniffOnFailureListener.setSniffer(sniffer); 

I think the cause of the bug is in the method Sniffer.Task.sniff().
The first call to Sniffer.Task.sniff() is the periodic one, made with the setSniffIntervalMillis delay, and it blocks in hostsSniffer.sniffHosts().
During that time, Sniffer.Task.sniff() is called once for each failed node with the setSniffAfterFailureDelayMillis delay. Because the first call is still blocked inside Sniffer.Task.sniff(), these calls never reach scheduleNextRun: the if (running.compareAndSet(false, true)) condition rejects them. Once hostsSniffer.sniffHosts() returns, the first call schedules the next run with the setSniffIntervalMillis delay again, so the failure delay is never applied.
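The sequence described above can be reproduced with a small stand-alone model. This is only a sketch of the guard as described in this report, not the actual Sniffer source: the class SniffGuardSketch, its sniff method, the duringSniff callback, and the 300000 ms interval are hypothetical names and values chosen for illustration. The nested call stands in for a failure-triggered sniff that arrives while hostsSniffer.sniffHosts() is still blocked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the guard described in this issue (not the real Sniffer class).
final class SniffGuardSketch {
    private final AtomicBoolean running = new AtomicBoolean(false);
    // Records every delay that would have been passed to scheduleNextRun.
    final List<Long> scheduledDelays = new ArrayList<>();

    // duringSniff models whatever happens while hostsSniffer.sniffHosts() is blocked.
    void sniff(long nextSniffDelayMillis, Runnable duringSniff) {
        if (running.compareAndSet(false, true)) {
            try {
                duringSniff.run();
            } finally {
                scheduledDelays.add(nextSniffDelayMillis);
                running.set(false);
            }
        }
        // If a sniff is already running, the request and its delay are silently dropped.
    }

    static List<Long> demo() {
        SniffGuardSketch sketch = new SniffGuardSketch();
        long sniffIntervalMillis = 300000L;          // assumed value for illustration
        long sniffAfterFailureDelayMillis = 30000L;  // value from the reproduction code
        // The periodic sniff is in progress when a node failure triggers a second request.
        sketch.sniff(sniffIntervalMillis,
                () -> sketch.sniff(sniffAfterFailureDelayMillis, () -> { }));
        return sketch.scheduledDelays;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [300000]
    }
}
```

Only the periodic interval is ever scheduled in this model; the 30000 ms failure delay is lost because the second request never passes the compareAndSet check.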

I think that in the catch (Exception) block, nextSniffDelayMillis should be set to sniffAfterFailureDelayMillis, and on success nextSniffDelayMillis should be set to the value given to setSniffIntervalMillis.
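The suggested change could look roughly like the following. It is a sketch of the proposal in this report under stated assumptions, not a patch against the real Sniffer: NextDelaySketch, nextDelayMillis, and the Callable parameter standing in for hostsSniffer.sniffHosts() are hypothetical.

```java
import java.util.concurrent.Callable;

// Hypothetical helper illustrating the proposal: pick the next delay from the sniff outcome.
final class NextDelaySketch {
    static long nextDelayMillis(Callable<Object> sniffHosts,
                                long sniffIntervalMillis,
                                long sniffAfterFailureDelayMillis) {
        try {
            sniffHosts.call();                    // stands in for hostsSniffer.sniffHosts()
            return sniffIntervalMillis;           // success: regular interval
        } catch (Exception e) {
            return sniffAfterFailureDelayMillis;  // failure: shorter retry delay
        }
    }

    public static void main(String[] args) {
        long onSuccess = nextDelayMillis(() -> "hosts", 300000L, 30000L);
        long onFailure = nextDelayMillis(() -> {
            throw new java.io.IOException("node down");
        }, 300000L, 30000L);
        System.out.println(onSuccess + " " + onFailure); // prints 300000 30000
    }
}
```

In the real class the returned value would be the argument passed to scheduleNextRun in the finally block, so the delay depends on the result of the sniff rather than on which caller happened to acquire the running flag.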
