elasticsearch-shard remove-corrupted-data doesn't work on missing metadata #47435

`elasticsearch-shard` appears to be the tool for removing corrupted shard data and metadata.
We have hit the problem below several times since upgrading past 7.0.0.

Issue: a shard's directory structure and files are either deleted or never created, and `elasticsearch-shard remove-corrupted-data` cannot remove the shard from the metadata.

Steps:
1. Recreate the directory structure, because `elasticsearch-shard` fails with "directory must exist" if it is missing.
2. Run `elasticsearch-shard`, which hits a null pointer exception because only empty directories exist.
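
For anyone trying to reproduce this, the two steps can be sketched roughly as follows. This is a minimal sketch, not an official procedure: `recreate_shard_dirs` is a hypothetical helper, the subdirectory names (`index`, `translog`, `_state`) are taken from the `ls` output further down, and the demo runs against a throwaway temp directory rather than a real data path.

```shell
#!/bin/sh
# Hypothetical helper: recreate the empty directory skeleton of a shard.
# Subdirectory names come from the `ls` output in this report.
recreate_shard_dirs() {
    shard_path=$1
    mkdir -p "$shard_path/index" "$shard_path/translog" "$shard_path/_state"
}

# Step 1, demonstrated on a throwaway directory instead of the real data path.
demo_root=$(mktemp -d)
shard_path="$demo_root/nodes/0/indices/TUa5c332RFGKmM6yZSK-Rw/0"
recreate_shard_dirs "$shard_path"

# Record which of the expected subdirectories now exist.
created=""
for sub in index translog _state; do
    [ -d "$shard_path/$sub" ] && created="$created $sub"
done
echo "created:$created"
rm -rf "$demo_root"

# Step 2, on the real node with Elasticsearch stopped (command as used in this report):
# ./elasticsearch-shard remove-corrupted-data --index dce_rpc-2019.08.28 --shard-id 24 \
#     -d /data/nsm/elasticsearch/nodes/0/indices/TUa5c332RFGKmM6yZSK-Rw/0/index
```

On the affected node, step 2 is where the exception shown below is thrown.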

> /usr/share/elasticsearch/bin $ ./elasticsearch-shard remove-corrupted-data --index dce_rpc-2019.08.28 --shard-id 24 -d /data/nsm/elasticsearch/nodes/0/indices/TUa5c332RFGKmM6yZSK-Rw/0/index
> ERROR StatusLogger No Log4j 2 configuration file found. Using default configuration (logging only errors to the console), or user programmatically provided configurations. Set system property 'log4j2.debug' to show Log4j 2 internal initialization logging. See https://logging.apache.org/log4j/2.x/manual/configuration.html for instructions on how to configure Log4j 2
> -----------------------------------------------------------------------
> 
>     WARNING: Elasticsearch MUST be stopped before running this tool.
> 
>   Please make a complete backup of your index before using this tool.
> 
> -----------------------------------------------------------------------
> Exception in thread "main" java.lang.NullPointerException
> 	at org.elasticsearch.index.shard.RemoveCorruptedShardDataCommand.findAndProcessShardPath(RemoveCorruptedShardDataCommand.java:152)
> 	at org.elasticsearch.index.shard.RemoveCorruptedShardDataCommand.execute(RemoveCorruptedShardDataCommand.java:282)
> 	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
> 	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
> 	at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:77)
> 	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
> 	at org.elasticsearch.cli.Command.main(Command.java:90)
> 	at org.elasticsearch.index.shard.ShardToolCli.main(ShardToolCli.java:35)
> 
> 
> /usr/share/elasticsearch/bin $ ls  /data/nsm/elasticsearch/nodes/0/indices/TUa5c332RFGKmM6yZSK-Rw/0/
> index  _state  translog



What I'd expect:
The tool removes the shard from the metadata, so that I don't have to `rm -rf` the entire node's data and rely on replicas.

Hopefully there's an error in my steps.
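
To confirm that a shard is in the state described here (directories present but nothing inside them), a check along these lines could be run before invoking the tool. It is only a sketch: `check_shard_dirs` is a hypothetical helper, the subdirectory names are taken from the `ls` output above, and `segments_1` in the demo is a placeholder file, not real Lucene data.

```shell
#!/bin/sh
# Hypothetical helper: report whether each expected shard subdirectory
# is missing, empty, or contains at least one entry.
check_shard_dirs() {
    shard_path=$1
    for sub in index translog _state; do
        dir="$shard_path/$sub"
        if [ ! -d "$dir" ]; then
            echo "$sub: missing"
        elif [ -z "$(ls -A "$dir")" ]; then
            echo "$sub: empty"
        else
            echo "$sub: has files"
        fi
    done
}

# Demo on a throwaway directory covering all three cases.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/index" "$demo_root/translog"
touch "$demo_root/index/segments_1"   # placeholder, not real index data
check_result=$(check_shard_dirs "$demo_root")
echo "$check_result"
rm -rf "$demo_root"
```

In the case reported here, all three subdirectories would be expected to show up as `empty` after the directory structure has been recreated by hand.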


Environment:
- Elasticsearch 7.3.0 (no plugins)
- Oracle Linux 7.6
- Network drives (vSAN) for Elasticsearch storage (though this also happens on physical boxes running Docker containers)

