Prevent allocating shards to broken nodes #18417

Closed
@ywelsch

Description

Allocating shards to a node can fail for various reasons. When an allocation fails, we currently ignore the node for that shard during the next allocation round. However, this means that:

  • Subsequent rounds consider the node for allocating the shard again.
  • Other shards are still allocated to the node. In particular, the balancer tries to put shards on that node, because the failed shard leaves the node with a smaller weight.

This is particularly bad if the node is permanently broken, as it leads to a never-ending series of failed allocations. Ultimately this affects the stability of the cluster.
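
To make the failure mode concrete, here is a minimal toy simulation of the behaviour described above. It is not Elasticsearch code: `Node`, `pick_node`, and `simulate` are hypothetical names, the "weight" is simply the number of shard copies on a node, and the rule that a node cannot hold two copies of the same shard is a simplifying assumption. The only behaviour taken from the description is that a node which failed a shard is ignored for that shard during the next round only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    broken: bool = False                        # a broken node fails every allocation
    copies: set = field(default_factory=set)    # shard ids hosted on this node


def pick_node(nodes, shard_id, ignored):
    """Toy balancer: least-loaded node that is allowed to take this shard."""
    candidates = [
        n for n in nodes
        if n.name not in ignored and shard_id not in n.copies
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda n: (len(n.copies), n.name))


def simulate(nodes, unassigned, rounds):
    """Run allocation rounds; return (failed allocations, still-unassigned shards)."""
    ignore_next = {}          # shard id -> node that failed it in the current round
    failures = 0
    pending = list(unassigned)
    for _ in range(rounds):
        # Failures from the previous round are ignored now, then forgotten.
        ignore_now, ignore_next = ignore_next, {}
        remaining = []
        for shard_id in pending:
            ignored = {ignore_now[shard_id]} if shard_id in ignore_now else set()
            node = pick_node(nodes, shard_id, ignored)
            if node is None:
                remaining.append(shard_id)      # nowhere to go this round
            elif node.broken:
                failures += 1
                ignore_next[shard_id] = node.name
                remaining.append(shard_id)
            else:
                node.copies.add(shard_id)
        pending = remaining
    return failures, pending


# Replica of shard 0: its primary is on A, so B is the only candidate.
healthy = Node("A", copies={0})
broken = Node("B", broken=True)
print(simulate([healthy, broken], [0], rounds=6))   # (3, [0])
```

In this sketch the replica fails on the broken node in every other round and never gets assigned, so the failure count grows with the number of rounds. Starting instead with shards 0 and 1 on `A` and two new shards to place, both new shards are first sent to the empty (broken) node because it has the lowest weight, which mirrors the second bullet.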

Labels: :Distributed Coordination/Allocation, >bug, Team:Distributed (Obsolete)