Description
Today we execute a partial reduce of search requests once we have buffered at least 512 shard search results. This default, which users can change with batched_reduce_size=N, seems quite high and can cause memory issues for queries that target a large number of shards. We also want to use the partial reduce to speed up subsequent shard search requests (#51852), but users won't see the benefit unless they explicitly lower the batched reduce size. Partial (and final) reduces are usually very fast, so I am opening this issue to pick a saner default that could save memory on the coordinating node and speed up sorted queries on time-based indices (queries that can target a lot of shards). We have plenty of options, so here's a non-exhaustive list:
- Reduce the default to 5-10. That could slightly increase the overall latency, but the benefits are non-negligible on time-based indices.
- Reduce the default only for queries that can use the partial reduce to speed up subsequent shard search requests (sorted queries on time-based indices).
- Change the threshold to check the size of the buffered results rather than the absolute number of shards. We could also trigger a partial reduce based on inactivity (if no more shards have responded in the last N seconds/minutes).
- Keep the default as it is :(
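
To make the trade-off between these options concrete, here is a rough Python simulation of a coordinating node that buffers shard results and runs a partial reduce whenever a trigger fires. It is only a sketch and not the actual Elasticsearch implementation: `simulate`, `reduce_top_n`, `by_count` and `by_buffered_hits` are hypothetical names, each shard result is reduced to a plain list of scores, and the number of buffered hits stands in as a crude proxy for memory use.

```python
import heapq
from itertools import chain
from typing import Callable, List, Tuple

ShardHits = List[int]  # hypothetical stand-in for one shard's top hits (scores only)
Trigger = Callable[[List[ShardHits]], bool]


def reduce_top_n(parts: List[ShardHits], top_n: int) -> ShardHits:
    """Merge several hit lists and keep the top_n highest scores."""
    return heapq.nlargest(top_n, chain.from_iterable(parts))


def by_count(limit: int) -> Trigger:
    """Fire once `limit` shard results are buffered (count-based threshold)."""
    return lambda buffer: len(buffer) >= limit


def by_buffered_hits(limit: int) -> Trigger:
    """Fire once the buffered results hold `limit` hits in total (size-based proxy)."""
    return lambda buffer: sum(len(hits) for hits in buffer) >= limit


def simulate(
    shard_results: List[ShardHits], should_reduce: Trigger, top_n: int = 10
) -> Tuple[ShardHits, int, int]:
    """Return (final top hits, number of partial reduces, peak buffered shard results)."""
    reduced: ShardHits = []          # running output of the partial reduces
    buffer: List[ShardHits] = []     # shard results not yet reduced
    partial_reduces = 0
    peak_buffered = 0
    for hits in shard_results:
        buffer.append(hits)
        peak_buffered = max(peak_buffered, len(buffer))
        if should_reduce(buffer):
            # Partial reduce: fold the buffer into the running result, then drop it.
            reduced = reduce_top_n([reduced] + buffer, top_n)
            buffer = []
            partial_reduces += 1
    # Final reduce over whatever is still buffered.
    final = reduce_top_n([reduced] + buffer, top_n)
    return final, partial_reduces, peak_buffered


# 1000 simulated shards with 10 deterministic scores each (made-up data).
shards = [[(shard * 31 + i * 7) % 1009 for i in range(10)] for shard in range(1000)]

for label, trigger in [
    ("count >= 512", by_count(512)),
    ("count >= 5", by_count(5)),
    ("buffered hits >= 50", by_buffered_hits(50)),
]:
    final, reduces, peak = simulate(shards, trigger)
    print(f"{label}: {reduces} partial reduces, peak buffer of {peak} shard results")
```

All three triggers return the same final hits; what changes is how many partial reduces run and how many shard results sit in memory at once. The inactivity-based trigger is left out because it would need a notion of time that this sketch does not model.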
I am curious to hear your thoughts on these options.