Enhance BufferedChecksumIndexInput error when a state file is empty #29358

Closed
@gmoskovicz


Elasticsearch version (bin/elasticsearch --version): Any version

Plugins installed: [none]

JVM version (java -version): 1.8

OS version (uname -a if on a Unix-like system): Any

Provide logs (if relevant):

If a state file ends up empty after some failure (a hardware issue, an NFS issue, and so on), the node fails to start and logs something like the following:

[2018-04-03T18:12:25,227][ERROR][o.e.g.GatewayMetaState   ] [xxxx] failed to read local state, exiting...
org.elasticsearch.ElasticsearchException: java.io.IOException: failed to read [id:20, legacy:false, file:/xxx/elasticsearch/xxx/xxx/nodes/0/indices/indexname/_state/state-20.st]
	at org.elasticsearch.ExceptionsHelper.maybeThrowRuntimeAndSuppress(ExceptionsHelper.java:150) ~[elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.gateway.MetaDataStateFormat.loadLatestState(MetaDataStateFormat.java:334) ~[elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.util.IndexFolderUpgrader.upgrade(IndexFolderUpgrader.java:90) ~[elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.util.IndexFolderUpgrader.upgradeIndicesIfNeeded(IndexFolderUpgrader.java:128) ~[elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.gateway.GatewayMetaState.<init>(GatewayMetaState.java:87) [elasticsearch-5.4.3.jar:5.4.3]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:?]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) [?:1.8.0_161]
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) [?:1.8.0_161]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) [?:1.8.0_161]
	at org.elasticsearch.common.inject.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:49) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.ConstructorInjector.construct(ConstructorInjector.java:86) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:116) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:47) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:825) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:43) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.Scopes$1$1.get(Scopes.java:59) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:50) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:191) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:183) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:818) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.InjectorBuilder.loadEagerSingletons(InjectorBuilder.java:183) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.InjectorBuilder.loadEagerSingletons(InjectorBuilder.java:176) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.InjectorBuilder.injectDynamically(InjectorBuilder.java:161) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:96) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.Guice.createInjector(Guice.java:96) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.Guice.createInjector(Guice.java:70) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.common.inject.ModulesBuilder.createInjector(ModulesBuilder.java:43) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.node.Node.<init>(Node.java:491) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.node.Node.<init>(Node.java:242) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:232) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:232) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:350) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:123) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:114) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:67) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.cli.Command.main(Command.java:88) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:91) [elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:84) [elasticsearch-5.4.3.jar:5.4.3]
Caused by: java.io.IOException: failed to read [id:20, legacy:false, file:/xxx/elasticsearch/xxx/xxx/nodes/0/indices/indexname/_state/state-20.st]
	at org.elasticsearch.gateway.MetaDataStateFormat.loadLatestState(MetaDataStateFormat.java:327) ~[elasticsearch-5.4.3.jar:5.4.3]
	... 37 more
Caused by: java.lang.IllegalStateException: class org.apache.lucene.store.BufferedChecksumIndexInput cannot seek backwards (pos=-16 getFilePointer()=0)
	at org.apache.lucene.store.ChecksumIndexInput.seek(ChecksumIndexInput.java:50) ~[lucene-core-6.5.1.jar:6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75 - jimczi - 2017-04-21 12:17:15]
	at org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:519) ~[lucene-core-6.5.1.jar:6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75 - jimczi - 2017-04-21 12:17:15]
	at org.elasticsearch.gateway.MetaDataStateFormat.read(MetaDataStateFormat.java:189) ~[elasticsearch-5.4.3.jar:5.4.3]
	at org.elasticsearch.gateway.MetaDataStateFormat.loadLatestState(MetaDataStateFormat.java:322) ~[elasticsearch-5.4.3.jar:5.4.3]
	... 37 more

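After discussing this with @jpountz, (pos=-16 getFilePointer()=0) seems to indicate that the file is empty: Lucene's codec footer is 16 bytes long, so seeking to the start of the footer in a zero-length file gives position 0 - 16 = -16. So far, though, we only return an IOException that never says the state file is empty.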

So my questions here are:

  1. Does it make sense to add some extra content to the log to explicitly mention that the file is empty?
  2. Perhaps we can mention in the logs that when this happens the shard is corrupted and needs to be recreated (or removed)?
  3. Do we see any option other than [2]? For example, starting the node with an option to disregard that shard (possibly deleting it? That doesn't sound like a very good idea, but it is worth discussing).
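
For question 1, here is a rough sketch of what a more explicit check could look like. It is not the actual MetaDataStateFormat code: the class name StateFileCheck, the method ensureReadable, and the message wording are all hypothetical, and the FOOTER_LENGTH constant is an assumption that mirrors Lucene's 16-byte codec footer. The idea is simply to look at the file size before the checksum is verified, so that an empty or truncated file produces a message that says so.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only; not part of Elasticsearch or Lucene.
final class StateFileCheck {
    // Assumption: mirrors the 16-byte footer that Lucene's CodecUtil writes.
    static final long FOOTER_LENGTH = 16;

    private StateFileCheck() {}

    /** Fails with a descriptive message if the file is too short to hold a codec footer. */
    static void ensureReadable(Path file) throws IOException {
        long size = Files.size(file);
        if (size == 0) {
            throw new IOException("state file [" + file + "] is empty (0 bytes)");
        }
        if (size < FOOTER_LENGTH) {
            throw new IOException("state file [" + file + "] is truncated: "
                    + size + " bytes, expected at least " + FOOTER_LENGTH);
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate the failure mode from the report with an empty temporary file.
        Path empty = Files.createTempFile("state-", ".st");
        try {
            ensureReadable(empty);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        } finally {
            Files.deleteIfExists(empty);
        }
    }
}
```

A check like this would run before the checksum verification, so the "cannot seek backwards" IllegalStateException would never be reached for empty files. Whether the node should then exit or skip the shard is a separate decision (questions 2 and 3).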

Labels: :Distributed Indexing/Distributed, >test, Team:Distributed (Obsolete), help wanted, adoptme