LLRC can throw an IllegalCapacityException when selecting nodes. #37739

Closed
@jtibshirani

Description

This issue was originally debugged and reported by a user. It seems that when nodes fail, the low-level RestClient may have a race condition in how it keeps track of the valid nodes. Here's the relevant text from the report:

When too many nodes fail, an IllegalCapacityException is thrown in the low level RestClient.
The defective code is on line 637:

List<Node> livingNodes = new ArrayList<>(nodeTuple.nodes.size() - blacklist.size());

In some cases nodeTuple.nodes.size() - blacklist.size() < 0, which causes the ArrayList constructor to throw an IllegalCapacityException.

This causes the RestClient to fail on any further request (including requests issued by the sniffer), thus the client breaks down completely and will not recover.

I think the situation is caused by nodes failing. Then:

  • The sniffer removes a node using RestClient.setNodes(...)
  • There are async requests still running against that node, and they fail after the sniffer has removed it
  • Thus, the node (which is no longer contained in the RestClient.nodeTuple list) is added to the RestClient.blacklist during RestClient.onFailure
  • Therefore, a situation can occur where more nodes are blacklisted than nodes are known.
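
The sequence above can be reproduced in isolation with a minimal sketch. It is not the actual RestClient code: the class `LivingNodesSketch`, both method names, and the use of plain strings as nodes are hypothetical stand-ins. The first method copies the sizing expression from the report, and the second shows one possible defensive variant (clamping the capacity at zero), which is not necessarily the fix adopted in the client. Note that the JDK reports a negative capacity as an IllegalArgumentException with an "Illegal Capacity" message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the client's node bookkeeping (not the real RestClient).
class LivingNodesSketch {

    // Same sizing as the reported line: the capacity goes negative
    // when the blacklist holds more entries than the known node list.
    static List<String> selectLivingUnsafe(List<String> nodes, Set<String> blacklist) {
        List<String> living = new ArrayList<>(nodes.size() - blacklist.size());
        for (String node : nodes) {
            if (!blacklist.contains(node)) {
                living.add(node);
            }
        }
        return living;
    }

    // Assumed defensive variant: never pass a negative capacity.
    // The list still grows as needed, so the hint only affects allocation.
    static List<String> selectLivingSafe(List<String> nodes, Set<String> blacklist) {
        int capacity = Math.max(0, nodes.size() - blacklist.size());
        List<String> living = new ArrayList<>(capacity);
        for (String node : nodes) {
            if (!blacklist.contains(node)) {
                living.add(node);
            }
        }
        return living;
    }

    public static void main(String[] args) {
        // The sniffer has already shrunk the known nodes to a single entry.
        List<String> nodes = Arrays.asList("node1");

        // Late failures of in-flight requests blacklist nodes that are no longer known.
        Set<String> blacklist = new HashSet<>(Arrays.asList("node2", "node3"));

        try {
            selectLivingUnsafe(nodes, blacklist);
        } catch (IllegalArgumentException e) {
            System.out.println("unsafe selection failed: " + e.getMessage());
        }
        System.out.println("safe selection: " + selectLivingSafe(nodes, blacklist));
    }
}
```

Clamping only hides the symptom. A fuller fix would also keep the blacklist consistent with the node list, for example by dropping blacklist entries for nodes that are removed when the node list is replaced.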
