Description
Elasticsearch version: 6.4.3/6.5.1
JVM version: 1.8.0_181
OS version: CentOS 7.4
Description of the problem including expected versus actual behavior:
Production environment: 15 nodes, 2700+ indices, 15000+ shards.
The cluster hangs after raising "cluster.routing.allocation.node_concurrent_recoveries" to 100.
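For reference, the effective value of this setting (it defaults to 2) can be inspected with something along these lines:
curl -s "localhost:9200/_cluster/settings?include_defaults=true&flat_settings=true" | grep node_concurrent_recoveries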
Steps to reproduce:
- Set up a three-node cluster (1 core, 2 GB per node).
- Create 300 indices (3000 shards in total), each index containing 100 documents.
- Set these cluster settings dynamically:
curl -X PUT "localhost:9200/_cluster/settings" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.node_concurrent_recoveries": 100,
    "indices.recovery.max_bytes_per_sec": "400mb"
  }
}'
- Stop one of the nodes and remove the data from its data path.
- Start the stopped node back up.
- After a while, the cluster hangs.
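One way to confirm the hang (a minimal sketch, not the exact commands from this run) is to poll cluster health and watch the initializing/unassigned shard counts stop changing:
curl -s "localhost:9200/_cluster/health?pretty"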
Each node's generic thread pool is using all 128 threads, i.e. the pool is completely full:
[c_log@VM_128_27_centos ~/elasticsearch-6.4.3/bin]$ curl localhost:9200/_cat/thread_pool/generic?v
node_name name active queue rejected
node-3 generic 128 949 0
node-2 generic 128 1093 0
node-1 generic 128 1076 0
Lots of peer recoveries are stuck waiting.
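The in-flight recoveries can be listed with something like the following, assuming the _cat API still responds while the cluster is hanged:
curl -s "localhost:9200/_cat/recovery?v&active_only=true"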
Jstack output for the hanged node shows all generic threads waiting in txGet:
"elasticsearch[node-3][generic][T#128]" #179 daemon prio=5 os_prio=0 tid=0x00007fa8980c8800 nid=0x3cb9 waiting on condition [0x00007fa86ca0a000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000fbef56f0> (a org.elasticsearch.common.util.concurrent.BaseFuture$Sync) at java.util.concurrent.locks.LockSupport.park(Unknown Source) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(Unknown Source) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(Unknown Source) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(Unknown Source) at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:251) at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:94) at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:44) at org.elasticsearch.transport.PlainTransportFuture.txGet(PlainTransportFuture.java:32) at org.elasticsearch.indices.recovery.RemoteRecoveryTargetHandler.receiveFileInfo(RemoteRecoveryTargetHandler.java:133) at org.elasticsearch.indices.recovery.RecoverySourceHandler.lambda$phase1$6(RecoverySourceHandler.java:387) at org.elasticsearch.indices.recovery.RecoverySourceHandler$$Lambda$3071/1370938617.run(Unknown Source) at org.elasticsearch.common.util.CancellableThreads.executeIO(CancellableThreads.java:105) at org.elasticsearch.common.util.CancellableThreads.execute(CancellableThreads.java:86) at org.elasticsearch.indices.recovery.RecoverySourceHandler.phase1(RecoverySourceHandler.java:386) at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:172) at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:98) at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$000(PeerRecoverySourceService.java:50) at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:107) at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:104) at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30) at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:251) at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:309) at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1605) at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723) at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source)
So the cluster appears to be stuck in a distributed deadlock: every generic thread is blocked in a synchronous txGet() call, waiting for a response that can only be produced by another generic thread, and since the generic pools on all nodes are already exhausted, no recovery can make progress.
Thanks,
Howard