Closed
Description
This issue was originally debugged and reported by a user. It seems that when nodes fail, the low-level RestClient
may have a race condition around keeping track of the valid nodes. Here's the relevant text from the report:
When too many nodes fail, an `IllegalArgumentException` ("Illegal Capacity") is thrown in the low-level `RestClient`.
The defective code is on line 637:

`List<Node> livingNodes = new ArrayList<>(nodeTuple.nodes.size() - blacklist.size());`

In some cases `nodeTuple.nodes.size() - blacklist.size() < 0`, which causes the exception to be thrown by the `ArrayList` constructor. This causes the `RestClient` to fail on any further request (including requests issued by the sniffer), so the client breaks down completely and will not recover.

I think the situation is caused by nodes failing. Then:
- The sniffer removes a node using `RestClient.setNodes(...)`
- Async requests are still running against that node, and they fail after the sniffer has removed it
- Thus, the node (which is no longer contained in the `RestClient.nodeTuple` list) is added to the `RestClient.blackList` during `RestClient.onFailure`
- Therefore, a situation can occur in which more nodes are blacklisted than nodes are known.
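The sequence above can be sketched with a minimal, self-contained model. The names below (`setNodes`, `onFailure`, the node strings) are illustrative stand-ins for the real RestClient internals, not its actual implementation; the point is only to show how a late `onFailure` can push the blacklist size past the known-node count and hand `ArrayList` a negative capacity:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Minimal model of the race: the sniffer replaces the node list while an
// in-flight request still fails against a node that was already removed.
public class BlacklistRace {
    static List<String> nodes = new ArrayList<>(List.of("n1", "n2", "n3"));
    static Set<String> blacklist = new HashSet<>();

    static void setNodes(List<String> newNodes) {
        nodes = new ArrayList<>(newNodes);
    }

    // Blacklists the failed node without checking whether it is still known.
    static void onFailure(String node) {
        blacklist.add(node);
    }

    public static void main(String[] args) {
        onFailure("n3");              // n3 fails and is blacklisted
        setNodes(List.of("n1"));      // sniffer drops n2 and n3
        onFailure("n2");              // async request against n2 fails late

        // blacklist = {n2, n3}, but only one node is known:
        int capacity = nodes.size() - blacklist.size();   // 1 - 2 = -1
        try {
            List<String> livingNodes = new ArrayList<>(capacity);
        } catch (IllegalArgumentException e) {
            // The ArrayList constructor rejects negative capacities:
            // "Illegal Capacity: -1"
            System.out.println("caught: " + e.getMessage());
        }

        // One defensive sketch (not the fix actually shipped): clamp the
        // capacity hint at zero so the constructor can never go negative.
        List<String> living = new ArrayList<>(Math.max(0, capacity));
        System.out.println("living.size() = " + living.size());
    }
}
```

The capacity argument is only a sizing hint, so clamping it changes nothing about the list's behavior; the underlying race (blacklisting a node that is no longer in `nodeTuple`) would still need to be addressed separately.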