Description
Today, if we detect shard corruption then we mark the store as corrupt and refuse to open it again. If there are no replicas then you might be able to use Lucene's CheckIndex to remove the corrupted segments, although this does not remove the corruption marker, requires knowledge of our filesystem layout, and might be tricky to do in a containerised or heavily automated environment. The only way forward via the API is to force the allocation of an empty primary, which drops all the data in the shard. We have an `index.shard.check_on_startup: fix` setting, but this is suboptimal for a couple of reasons:
- it’s index-wide and requires closing and verifying the whole index.
- it has no effect on shards that have a corruption marker, because the corruption marker is checked before this option takes effect.
(it also does nothing in versions 6.0 and above, but that's another story)
The Right Way™ to recover a corrupted shard is certainly to fail it and recover another copy from one of its replicas, assuming such a thing exists. However, we've seen a couple of cases recently where a user was running without replicas, e.g. to do a bulk load of data (which we sort of suggest might be a good idea sometimes), and hit some corruption that they'd have preferred to recover from with a bit of data loss rather than by restarting the load or allocating an empty primary.
I propose removing the `fix` option of the `index.shard.check_on_startup` setting and instead adding another dangerous forced-allocation command that can attempt to allocate a primary on top of a corrupt store by fixing the store and removing its corruption marker.
/cc @tsouza @ywelsch re. this forum thread
Concrete points and open questions:
- Tool name: `elasticsearch-shard`, with subcommand `remove-corrupted-segments`
  - the main goal is to fix a corrupted index, but the action is destructive, so no `fix` or `repair` in the name; avoid `truncate` too, as it is far from Lucene terminology
- Available options for `remove-corrupted-segments`:
  - `--index-name index_name` and `--shard-id shard_id` (mandatory)
  - alternative: `-d path_to_index_folder` or `--dir path_to_index_folder`
  - `--dry-run`: do a fast check without actually dropping corrupted segments
  - no options means exorcise
  - interactive keyboard confirmation is required
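Under this proposal, invocations might look like the following. This is only a sketch of the flags listed above; the exact names are still open questions, and the `--dir` path is an illustrative placeholder:

```sh
# Dry run: report corrupted segments without dropping anything
elasticsearch-shard remove-corrupted-segments --index-name my_index --shard-id 0 --dry-run

# Alternative: point at the shard's index folder directly (placeholder path)
elasticsearch-shard remove-corrupted-segments --dir /var/data/nodes/0/indices/my_index/0/index --dry-run

# No --dry-run means exorcise: actually drop the corrupted segments,
# after an interactive keyboard confirmation
elasticsearch-shard remove-corrupted-segments --index-name my_index --shard-id 0
```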
- Merge `elasticsearch-translog` into `elasticsearch-shard`:
  - `elasticsearch-translog` becomes `elasticsearch-shard truncate-translog`
  - `elasticsearch-translog` has only the `-d` option to specify a folder; it would be nice to have `--index-name index_name` and `--shard-id shard_id` here too
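The merge would roughly amount to a rename, sketched below with an illustrative translog path; whether the subcommand also grows `--index-name`/`--shard-id` is one of the open questions:

```sh
# Today: standalone tool, folder only (placeholder path)
elasticsearch-translog truncate -d /var/data/nodes/0/indices/my_index/0/translog

# Proposed: the same functionality as a subcommand of elasticsearch-shard
elasticsearch-shard truncate-translog -d /var/data/nodes/0/indices/my_index/0/translog
```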
- Exit immediately if there is no corruption marker file (for both subcommands)
- Segments that are actually missing are an unrecoverable case for `CheckIndex`; we leave this as unrecoverable, referring the user to how to allocate an empty shard. There is room for improvement here: LUCENE-6762.
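For reference, the fallback in the unrecoverable case remains today's forced allocation of an empty primary via the cluster reroute API, which makes the data loss explicit; index and node names below are placeholders:

```sh
# Drops all data in the shard; requires explicitly accepting data loss
curl -X POST "localhost:9200/_cluster/reroute" -H 'Content-Type: application/json' -d'
{
  "commands": [
    {
      "allocate_empty_primary": {
        "index": "my_index",
        "shard": 0,
        "node": "node-1",
        "accept_data_loss": true
      }
    }
  ]
}'
```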