Description
@ywelsch mentioned to me that one could easily run a force-merge on a read-only index on a single copy by doing the following sequence of operations:
- call the index clone API and set the number of replicas to 0 on the clone
- force-merge the clone
- set the number of replicas to the original value on the clone
- flip the alias so that the clone takes the place of the original index in the data stream
- delete the original index
This would require only half the CPU of a plain force-merge, since the merge runs on a single shard copy and the replica is then created by copying the already-merged files. That is a great deal. Should we move the force-merge ILM action to this sequence of operations instead of just calling the `_forcemerge` API?
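To make the sequence above concrete, here is a minimal sketch that only builds the REST requests in order and does not send them. The function name `build_clone_merge_requests`, the `-merged` suffix for the clone and the use of a plain alias swap through `_aliases` are assumptions for illustration. A data stream would need its backing index replaced instead, and this is not how ILM implements anything today.

```python
import json


def build_clone_merge_requests(index, alias, replicas, clone=None):
    """Return (method, path, body) tuples for the clone-based force-merge.

    Hypothetical helper: it only describes the requests, it does not call
    Elasticsearch. The source index is assumed to be read-only already,
    which the clone API requires.
    """
    clone = clone or f"{index}-merged"  # assumed naming convention
    return [
        # 1. Clone the index and drop replicas on the clone.
        ("PUT", f"/{index}/_clone/{clone}",
         {"settings": {"index.number_of_replicas": 0}}),
        # 2. Force-merge the single remaining copy.
        ("POST", f"/{clone}/_forcemerge?max_num_segments=1", None),
        # 3. Restore the original number of replicas on the clone.
        ("PUT", f"/{clone}/_settings",
         {"index": {"number_of_replicas": replicas}}),
        # 4. Swap the alias from the original index to the clone.
        ("POST", "/_aliases", {"actions": [
            {"add": {"index": clone, "alias": alias}},
            {"remove": {"index": index, "alias": alias}},
        ]}),
        # 5. Delete the original index.
        ("DELETE", f"/{index}", None),
    ]


if __name__ == "__main__":
    for method, path, body in build_clone_merge_requests("logs-000001", "logs", 1):
        print(method, path, json.dumps(body) if body is not None else "")
```

A real implementation would probably also wait for the clone's replicas to be allocated before swapping the alias, so that redundancy is not reduced during the switch.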
Interestingly, it wouldn't require more temporary storage. Since the index clone API uses hard links, it doesn't need additional storage initially. Force-merging the clone then writes up to one shard copy's worth of new segments, so we would only need 2x the size of a shard of temporary storage right after increasing the number of replicas of the clone to 1. This is the same as running the `_forcemerge` API, since merging needs temporary storage equal to the size of a shard copy, and this is needed for both shard copies.
For the searchable-snapshots ILM action, which optionally performs a force-merge, we could do something better by adding an option to `_forcemerge` to only merge primaries, and using it with the searchable-snapshots action. This would work since only primaries are used to take snapshots, and it would require only half the temporary storage that is needed today.
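The CPU and storage claims above can be checked with a back-of-the-envelope model. This is a rough sketch under stated assumptions: the merged shard is taken to be about as large as the original (an upper bound), CPU is approximated by the number of shard copies that run the merge, and the names `Cost`, `plain_force_merge`, `clone_then_merge` and `primaries_only_merge` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    merges: int          # shard copies that run the merge (CPU proxy)
    temp_storage: float  # peak extra disk, same unit as shard_size


def plain_force_merge(shard_size, replicas=1):
    # Every copy merges independently and needs about one shard of scratch space.
    copies = 1 + replicas
    return Cost(merges=copies, temp_storage=copies * shard_size)


def clone_then_merge(shard_size, replicas=1):
    # The clone is hard-linked (no extra disk), one copy is merged (+1 shard),
    # then each replica is recovered by file copy (+1 shard per replica).
    # All of it stays on disk until the original index is deleted.
    return Cost(merges=1, temp_storage=(1 + replicas) * shard_size)


def primaries_only_merge(shard_size, replicas=1):
    # Proposed option: only the primary merges, replicas are left untouched.
    return Cost(merges=1, temp_storage=shard_size)


if __name__ == "__main__":
    for strategy in (plain_force_merge, clone_then_merge, primaries_only_merge):
        print(strategy.__name__, strategy(shard_size=50.0))
```

With a 50 GB shard and one replica, the model gives two merges and 100 GB of temporary storage for a plain force-merge, one merge and 100 GB for the clone-based sequence, and one merge and 50 GB for a primaries-only merge.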