Concurrent deletion of indices and master failure can cause indices to be reimported #11665

Closed
@brwe

Currently, a data node deletes indices by evaluating the cluster state. When a new cluster state arrives, it is compared to the last known cluster state; if the new state does not contain an index that the node has in its last cluster state, that index is deleted.

This could cause data to be deleted if the data folders of all master nodes were lost (#8823):

  1. All master nodes of a cluster go down at the same time and their data folders cannot be recovered.
  2. A new master is brought up, but it does not have any indices in its cluster state because the data was lost.
  3. Because all other nodes are data nodes, it cannot get the cluster state from them either and therefore sends a cluster state without any indices in it to the data nodes. The data nodes then delete all their data.
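
To make the comparison concrete, here is a minimal sketch in Python rather than the actual Elasticsearch code. The function name `indices_to_delete` and the index names are hypothetical and only model the behavior described above: anything present in the last known state but absent from the new state is selected for deletion.

```python
def indices_to_delete(previous_indices, new_indices):
    """Return the indices known from the last state but absent from the new one."""
    return set(previous_indices) - set(new_indices)


# Normal case: "logs" was deleted through the API, so only "logs" is selected.
print(sorted(indices_to_delete({"logs", "metrics"}, {"metrics"})))

# Failure case: a fresh master that lost its data publishes an empty state,
# so every index on the data node is selected for deletion.
print(sorted(indices_to_delete({"logs", "metrics"}, set())))
```

With a pure set difference the data node cannot tell a deliberate delete from a master that has simply forgotten the index.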

On the master branch we now prevent this by checking whether the new cluster state comes from a different master than the previous one; if so, we keep the indices and import them as dangling (see #9952, ClusterChangedEvent).
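
The guard can be sketched roughly as follows. This is an illustrative Python model, not the code from #9952: the `ClusterState` dataclass, its fields, and `apply_state` are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    # Hypothetical, simplified stand-in for a published cluster state.
    master_id: str
    version: int
    indices: frozenset


def apply_state(previous, new):
    """Return (deleted, dangling) index sets for a data node applying `new`."""
    missing = set(previous.indices - new.indices)
    if new.master_id != previous.master_id:
        # The state comes from a different master: keep the data and
        # treat the missing indices as dangling instead of deleting them.
        return set(), missing
    return missing, set()


old = ClusterState("old-master", 7, frozenset({"logs"}))
empty = ClusterState("new-master", 1, frozenset())
deleted, dangling = apply_state(old, empty)
print(deleted, dangling)  # nothing deleted, "logs" kept as dangling
```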

While this prevents the deletion, it also means that in other cases we might keep indices that should have been deleted.

Example:

  1. Two master-eligible nodes (m1 and m2), of which m1 is master, and one data node (d).
  2. m1, m2 and d are on cluster state version 1, which contains an index.
  3. The index is deleted through the API, causing m1 to send cluster state 2, which does not contain the index, to m2 and d. This should trigger the actual index deletion.
  4. m1 goes down.
  5. m2 receives the new cluster state but d does not (network issues etc.).
  6. m2 is elected master and sends cluster state 3 to d, which again does not contain the index.
  7. d will not delete the index, because cluster state 3 comes from a different master than cluster state 1 (the last one it knows of). Instead, d imports the index back into the cluster.
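
The steps above can be replayed with a small simulation. This is a sketch under simplifying assumptions: the `DataNode` class and its `receive` method are hypothetical and only model the "different master means keep as dangling" rule, not the real node implementation.

```python
class DataNode:
    """Hypothetical data node that tracks the last cluster state it applied."""

    def __init__(self, master, version, indices):
        self.master = master
        self.version = version
        self.local = set(indices)   # indices stored on disk
        self.dangling = set()       # indices to be imported back

    def receive(self, master, version, indices):
        missing = self.local - set(indices)
        if master != self.master:
            # State from a new master: keep the data, mark it dangling.
            self.dangling |= missing
        else:
            self.local -= missing
        self.master, self.version = master, version


# d is on cluster state 1 from m1 and holds the index.
d = DataNode("m1", 1, {"idx"})
# Cluster state 2 from m1 (without idx) never reaches d.
# m2 becomes master and publishes cluster state 3, also without idx.
d.receive("m2", 3, set())
print(d.local, d.dangling)  # {'idx'} {'idx'}

# For contrast: a node that did receive state 2 deletes the index first.
d_ok = DataNode("m1", 1, {"idx"})
d_ok.receive("m1", 2, set())
d_ok.receive("m2", 3, set())
print(d_ok.local, d_ok.dangling)  # set() set()
```

The only difference between the two nodes is whether cluster state 2 was delivered, which is exactly the information d lacks when state 3 arrives.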

Currently there is no way for a data node to decide whether an index should actually be deleted if the cluster state that triggers the delete comes from a new master. We had to choose between (1) deleting all data when a node receives an empty cluster state and (2) running the risk of keeping indices around that should actually be deleted.

We decided on (2) in #9952. I am opening this issue just so that this behavior is documented.
