Shrink can sometimes fail with no obvious cause, leading to trouble with ILM (and particularly stopping ILM).
I've only seen this occur a few times, and in each case the relevant logs had aged out by the time I got to see the cluster with the problem. This issue is intended to track failures like this to see if we can spot any patterns.
One example is ILM explain output from a v7.1.1 cluster, with a step_info like this:
    "phase": "warm",
    "action": "shrink",
    "step": "shrunk-shards-allocated",
    "step_info": {
      "message": "Waiting for shrunk index to be created",
      "shrunk_index_exists": false,
      "actual_shards": -1,
      "all_shards_active": false
    },
The index in question did not proceed from that step for roughly 10 days, with no obvious cause. The situation was fixed by removing ILM from the index. In this case, no shrunken index had been created, but I've seen cases where the shrunken index was created.
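Since the logs tend to age out before anyone notices, it may help to flag these indices early. Below is a minimal sketch, not a tested tool, that scans an already-parsed ILM explain response (the JSON body of `GET <index>/_ilm/explain`) and lists indices that have been sitting in a step of the shrink action longer than a threshold. It assumes the response carries `managed`, `action`, `step`, and `step_time_millis` per index; the function name, the `max_age_days` threshold, and the index name in the usage example are hypothetical.

```python
import time

MILLIS_PER_DAY = 86_400_000


def find_stuck_shrink_indices(explain_response, max_age_days=1.0, now_millis=None):
    """Return (index, step, age_days) for indices lingering in the shrink action.

    explain_response: parsed JSON body of an ILM explain call (assumed shape).
    max_age_days: hypothetical threshold; tune to how long a shrink normally takes.
    now_millis: current time in epoch millis; defaults to the local clock.
    """
    if now_millis is None:
        now_millis = int(time.time() * 1000)

    stuck = []
    for name, info in explain_response.get("indices", {}).items():
        # Skip unmanaged indices and anything not in the shrink action.
        if not info.get("managed") or info.get("action") != "shrink":
            continue
        step_time = info.get("step_time_millis")
        if step_time is None:
            continue
        age_days = (now_millis - step_time) / MILLIS_PER_DAY
        if age_days >= max_age_days:
            stuck.append((name, info.get("step"), age_days))

    # Oldest first, so the worst offenders are at the top.
    return sorted(stuck, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical response modeled on the step_info above.
    sample = {
        "indices": {
            "logs-000001": {
                "managed": True,
                "phase": "warm",
                "action": "shrink",
                "step": "shrunk-shards-allocated",
                "step_time_millis": 0,
            }
        }
    }
    for name, step, age in find_stuck_shrink_indices(sample, now_millis=10 * MILLIS_PER_DAY):
        print(f"{name} has been in step {step} for {age:.1f} days")
```

Running something like this periodically would give a chance to capture logs while they still exist, before falling back to removing the policy from the index.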