The node reports an error during the recovery process #43228

Closed
@medivn

Elasticsearch version: 6.2.4
A node reports an error during peer recovery. Judging from the code, the failure happens because Translog.NoOp.write passes the reason string to StreamOutput.writeString without a null check, and in this case the reason is null. But why does the translog contain such abnormal data in the first place?
The log is as follows:

[2019-06-14T13:54:10,636][WARN ][o.e.i.c.IndicesClusterStateService] [node-2] [[indexName][1]] marking and sending shard failed due to [failed recovery]
org.elasticsearch.indices.recovery.RecoveryFailedException: [indexName][1]: Recovery failed from {node-1}{NYI9biLCQ2W3uJ3mOu5W3A}{bu-VpLxTQvO1AvYwmTNc4g}{192.168.60.78}{192.168.60.78:9300}{ml.machine_memory=16726286336, ml.max_open_jobs=20, ml.enabled=true} into {node-2}{PCcLeyjkREeZ08kEuR0GdQ}{bSKzgm0iRIKLHEgVTeXuMA}{192.168.60.94}{192.168.60.94:9300}{ml.machine_memory=16726286336, ml.max_open_jobs=20, ml.enabled=true}
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.doRecovery(PeerRecoveryTargetService.java:288) [elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.access$900(PeerRecoveryTargetService.java:81) [elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRunner.doRun(PeerRecoveryTargetService.java:635) [elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) [elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.4.jar:6.2.4]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
Caused by: org.elasticsearch.transport.RemoteTransportException: [node-1][192.168.60.78:9300][internal:index/shard/recovery/start_recovery]
Caused by: org.elasticsearch.index.engine.RecoveryEngineException: Phase[2] phase2 failed
	at org.elasticsearch.indices.recovery.RecoverySourceHandler.recoverToTarget(RecoverySourceHandler.java:211) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.recover(PeerRecoverySourceService.java:98) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService.access$000(PeerRecoverySourceService.java:50) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:107) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.PeerRecoverySourceService$StartRecoveryTransportRequestHandler.messageReceived(PeerRecoverySourceService.java:104) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1555) ~[elasticsearch-6.2.4.jar:6.2.4]
	... 5 more
Caused by: org.elasticsearch.transport.RemoteTransportException: [node-2][192.168.60.94:9300][internal:index/shard/recovery/translog_ops]
Caused by: org.elasticsearch.index.translog.TranslogException: Failed to write operation [NoOp{seqNo=452743, primaryTerm=18, reason='null'}]
	at org.elasticsearch.index.translog.Translog.add(Translog.java:500) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:890) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:738) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:707) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:1245) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:403) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:460) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:451) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1555) ~[elasticsearch-6.2.4.jar:6.2.4]
	... 5 more
Caused by: java.lang.NullPointerException
	at org.elasticsearch.common.io.stream.StreamOutput.writeString(StreamOutput.java:320) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.translog.Translog$NoOp.write(Translog.java:1333) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.translog.Translog$NoOp.access$500(Translog.java:1295) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.translog.Translog$Operation.writeOperation(Translog.java:908) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.translog.Translog.writeOperationNoSize(Translog.java:1475) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.translog.Translog.add(Translog.java:476) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.engine.InternalEngine.index(InternalEngine.java:890) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.shard.IndexShard.index(IndexShard.java:738) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:707) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.index.shard.IndexShard.applyTranslogOperation(IndexShard.java:1245) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:403) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:460) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:451) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.transport.TransportRequestHandler.messageReceived(TransportRequestHandler.java:30) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:66) ~[elasticsearch-6.2.4.jar:6.2.4]
	at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1555) ~[elasticsearch-6.2.4.jar:6.2.4]
	... 5 more
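
To illustrate the failure mode at the bottom of the trace, here is a minimal, self-contained sketch. It is not the Elasticsearch code: the class NoOpWriteSketch and all of its methods are hypothetical stand-ins, and the byte layout is an assumption chosen only for the demonstration. It shows how a string writer that dereferences its argument throws a NullPointerException when given a null reason, and how a variant that writes a presence flag first tolerates null.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the write path in the trace; not Elasticsearch code.
public class NoOpWriteSketch {

    // Writes a length-prefixed UTF-8 string. There is no null check,
    // so a null value fails on the first dereference.
    static void writeString(DataOutputStream out, String value) throws IOException {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8); // NPE if value is null
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Null-tolerant variant: a one-byte presence flag, then the string if present.
    static void writeOptionalString(DataOutputStream out, String value) throws IOException {
        if (value == null) {
            out.writeBoolean(false);
        } else {
            out.writeBoolean(true);
            writeString(out, value);
        }
    }

    // Returns true if writing the reason with the unchecked writer throws an NPE.
    static boolean failsWithNpe(String reason) {
        try {
            writeString(new DataOutputStream(new ByteArrayOutputStream()), reason);
            return false;
        } catch (NullPointerException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the number of bytes produced by the null-tolerant writer.
    static int optionalEncodedSize(String reason) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writeOptionalString(new DataOutputStream(buffer), reason);
            return buffer.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("null reason fails: " + failsWithNpe(null));
        System.out.println("non-null reason fails: " + failsWithNpe("some reason"));
        System.out.println("bytes for optional null: " + optionalEncodedSize(null));
    }
}
```

A guard like this only avoids the crash on the write side. It does not answer the question of how a NoOp with a null reason ended up in the translog on the recovery source.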

    Labels

    :Distributed Indexing/Recovery (anything around constructing a new shard, either from a local or a remote source), >bug
