Description
For ILM, we have a step that allocates an index onto a single node so that we can then call the shrink/resize action. In some cases, however, the shrink runs after the index has been allocated to a single node but still errors out because the shards are not all on the same node:
"test-000019" : {
"step" : "ERROR",
"step_time" : 1540588519429,
"step_info" : {
"type" : "illegal_state_exception",
"reason" : "index test-000019 must have all shards allocated on the same node to shrink index"
}
},
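As background, the allocate-for-shrink step pins the index to a single node by writing an index-level _name allocation filter (hot2 in the /_cat/shards output at the bottom of this issue). The filter the step wrote can be inspected with something along these lines:
GET test-000019/_settings?filter_path=*.settings.index.routing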
I was able to reproduce this with the following configuration (see the node attribute note right after this list):
- 2 nodes with the "hot" type
- 3 nodes with the "cold" type
- 1 node with the "other" type
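(Assumption on my part: the "type" values above are a custom node attribute, e.g. each node started with something like bin/elasticsearch -E node.attr.type=hot, which is what the index.routing.allocation.include.type filters below match on. The attributes can be double-checked with:)
GET _cat/nodeattrs?v&h=node,attr,value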
Using a 1 second poll interval:
PUT /_cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.xpack.core.indexlifecycle": "TRACE",
    "logger.org.elasticsearch.xpack.indexlifecycle": "TRACE",
    "indices.lifecycle.poll_interval": "1s"
  }
}
The following policy:
PUT _ilm/my_lifecycle3
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "5s"
          }
        }
      },
      "warm": {
        "minimum_age": "30s",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1
          },
          "shrink": {
            "number_of_shards": 1
          },
          "allocate": {
            "include": {
              "type": ""
            },
            "exclude": {},
            "require": {}
          }
        }
      },
      "cold": {
        "minimum_age": "1m",
        "actions": {
          "allocate": {
            "number_of_replicas": 2,
            "include": {
              "type": "cold"
            },
            "exclude": {},
            "require": {}
          }
        }
      },
      "delete": {
        "minimum_age": "2m",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
Index template:
PUT _template/my_template
{
  "index_patterns": ["test-*"],
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1,
    "index.lifecycle.name": "my_lifecycle3",
    "index.lifecycle.rollover_alias": "test-alias",
    "index.routing.allocation.include.type": "hot"
  }
}
Then, I created an index:
PUT test-000001
{
  "aliases": {
    "test-alias": {
      "is_write_index": true
    }
  }
}
And then continually ran:
GET /*/_ilm/explain?filter_path=indices.*.step*
Until I saw a failure similar to the one above (took 1-30 minutes to reproduce).
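(Side note for anyone else reproducing this: once the index hits the ERROR step it stays there and, assuming the ILM retry API is available in this snapshot, it can be sent back to the failed step with:)
POST test-000019/_ilm/retry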
I've added some additional logging (the --> lines below) to see what's going on with the allocation check:
[2018-10-26T15:15:19,399][TRACE][o.e.x.i.ExecuteStepsUpdateTask] [hot1] [test-000019] waiting for cluster state step condition (AllocationRoutedStep) [{"phase":"warm","action":"shrink","name":"check-allocation"}], next: [{"phase":"warm","action":"shrink","name":"shrink"}]
[2018-10-26T15:15:19,399][DEBUG][o.e.x.c.i.AllocationRoutedStep] [hot1] --> SHRINK checking whether [test-000019] has enough shards allocated
[2018-10-26T15:15:19,399][DEBUG][o.e.x.c.i.AllocationRoutedStep] [hot1] --> shard [test-000019][1], node[Mi73iCROTT2dM4We9oQIgA], [P], s[STARTED], a[id=IXX6Ix8EQdmsvhNT-7BQug] cannot remain on Mi73iCROTT2dM4We9oQIgA, allocPendingThisShard: 1
[2018-10-26T15:15:19,399][DEBUG][o.e.x.c.i.AllocationRoutedStep] [hot1] --> SHRINK shardCopiesThisShard(2) - allocationPendingThisShard(1) == 0 ? 1
[2018-10-26T15:15:19,399][DEBUG][o.e.x.c.i.AllocationRoutedStep] [hot1] --> shard [test-000019][0], node[RiSQ1bfhSkS_G90VZH-BLA], [R], s[STARTED], a[id=iCGSUFcYRXWl8yvDtcuhHg] cannot remain on RiSQ1bfhSkS_G90VZH-BLA, allocPendingThisShard: 1
[2018-10-26T15:15:19,399][DEBUG][o.e.x.c.i.AllocationRoutedStep] [hot1] --> SHRINK shardCopiesThisShard(2) - allocationPendingThisShard(1) == 0 ? 1
[2018-10-26T15:15:19,399][DEBUG][o.e.x.c.i.AllocationRoutedStep] [hot1] SHRINK [shrink] lifecycle action for index [[test-000019/pIKgUp5bTpCxZhJMOAWRxg]] complete
[2018-10-26T15:15:19,399][DEBUG][o.e.x.c.i.AllocationRoutedStep] [hot1] --> test-000019 SUCCESS allocationPendingAllShards: 0
[2018-10-26T15:15:19,399][TRACE][o.e.x.i.ExecuteStepsUpdateTask] [hot1] [test-000019] cluster state step condition met successfully (AllocationRoutedStep) [{"phase":"warm","action":"shrink","name":"check-allocation"}], moving to next step {"phase":"warm","action":"shrink","name":"shrink"}
And then a bit further down:
[2018-10-26T15:15:19,428][ERROR][o.e.x.i.IndexLifecycleRunner] [hot1] policy [my_lifecycle3] for index [test-000019] failed on step [{"phase":"warm","action":"shrink","name":"shrink"}]. Moving to ERROR step
java.lang.IllegalStateException: index test-000019 must have all shards allocated on the same node to shrink index
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.validateShrinkIndex(MetaDataCreateIndexService.java:679) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.prepareResizeIndexSettings(MetaDataCreateIndexService.java:740) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$IndexCreationTask.execute(MetaDataCreateIndexService.java:406) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:45) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:639) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:268) ~[elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:198) [elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:133) [elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:624) [elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) [elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) [elasticsearch-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
It looks like the check succeeds and reports that the shards are in the right place, but the shrink then fails nonetheless.
It's worth noting that I could only reproduce this with a 1-second poll interval, so it may be a timing issue. Also, the shards do appear to be correctly allocated in the /_cat/shards output (hot2 is the node that ILM set as the _name allocation filtering target):
test-000019 1 r STARTED 0 261b 127.0.0.1 hot2
test-000019 1 p STARTED 0 261b 127.0.0.1 hot1
test-000019 0 p STARTED 0 261b 127.0.0.1 hot2
test-000019 0 r STARTED 0 261b 127.0.0.1 other
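For comparison, this is roughly the manual preparation that the allocate/check-allocation steps are automating before the resize call, per the shrink API docs (hot2 here is just the target node taken from the output above):
PUT test-000019/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "hot2",
    "index.blocks.write": true
  }
}
Given that hot2 holds a started copy of both shard 0 and shard 1 in that output, the shrink prerequisite looks satisfied, which fits the suspicion above that this is a timing issue between the cluster state the check evaluated and the one the shrink ran against.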