# [Concurrent Segment Search] Mechanism to have custom slice calculation for each shard #7358
**Background:**

For concurrent segment search, Lucene by default computes the slices, i.e. the number of tasks that can be executed concurrently for a search request. The default Lucene implementation for slice computation uses limits of 5 segments and 250K documents per slice (whichever is met first) to decide which segments are processed in each slice, and hence the total number of slices that get generated.

For example, consider an index `foo` with 11 segments, where the first 5 segments have 300K documents each and the last 6 segments have 1K documents each. For this index, Lucene will compute 7 slices in total: the first 5 slices will have 1 segment each (since each of those segments alone meets the 250K document limit), the next 5 segments will go into slice 6 (since together they meet the 5-segment limit), and the last segment will be in its own slice.

**Problem Statement:**

From production usage we see that the typical shard size of indices ranges between 20–40GB. A shard of this size can have between 15 and 45 segments after the regular background merges are completed. Some segments will be larger than 1GiB, a few will be in the hundreds of MiB, and the trailing ones will be single-digit MiBs. Due to the large segment count and the large document count per segment, the number of slices computed by the default Lucene mechanism can be very large. For example, I used the nyc_taxis dataset to generate a single-shard index of size 23.9GB with 19 segments; the Lucene-computed slice count for this index is 15 per search request.

Lucene tends to favor parallelizing a single shard as much as possible based on the above parameters, which can work well if a node hosts only that single shard or is not receiving multiple concurrent requests. Otherwise, each search request will have 15 slices to execute, which can create contention among the tasks across requests and increase tail latency.
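The default grouping described above can be sketched as follows. This is a simplified Python model of Lucene's behavior (not the actual Lucene code), reproducing the 7-slice result for the 11-segment example:

```python
def compute_slices(segment_doc_counts, max_docs_per_slice=250_000, max_segments_per_slice=5):
    """Simplified model of Lucene's default slice computation."""
    # Lucene sorts segments by document count, largest first.
    sorted_segs = sorted(segment_doc_counts, reverse=True)
    slices, current, doc_sum = [], [], 0
    for docs in sorted_segs:
        if docs > max_docs_per_slice:
            # A segment over the document limit gets its own slice.
            slices.append([docs])
            continue
        # Close the current slice once either limit would be exceeded.
        if current and (doc_sum + docs > max_docs_per_slice
                        or len(current) == max_segments_per_slice):
            slices.append(current)
            current, doc_sum = [], 0
        current.append(docs)
        doc_sum += docs
    if current:
        slices.append(current)
    return slices

# The example from the text: 5 segments of 300K docs, 6 segments of 1K docs.
segments = [300_000] * 5 + [1_000] * 6
slices = compute_slices(segments)
print(len(slices))  # 7 slices: five 1-segment slices, one 5-segment slice, one 1-segment slice
```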
**Proposal:**

To deal with the above problem, I am proposing a mechanism to control the maximum number of slices that are generated. We can start with a cluster-level setting to configure it. Based on the configured maximum, the segments for each request will be sorted by document count and then divided among the slices in round-robin fashion. This makes it possible to control the slice count (i.e. the concurrency) per request depending on the workload and use case: for workloads where search request concurrency is already high, a lower slice count can be useful, whereas in low-concurrency environments a larger slice count can help. This mechanism will also be useful for controlling resource usage per request by limiting the parallelization. I ran some benchmarks based on this; results are below.

PoC implementation: https://github.com/sohami/OpenSearch/tree/cs-custom-slice

**Caveats:**

Lucene 9.7 does not have a mechanism to dynamically control the slice computation at the request level. To support that, I worked on apache/lucene#12374, which will be available in the next Lucene release. Until that release is out, we can still provide the custom slice computation mechanism using a static setting. Once the change lands in the next Lucene release, we should be able to let users control this slice count dynamically, and it can also be extended to an index setting.

**Setup:**
**Run with 2 search clients**

**Run with 4 search clients**
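The round-robin slice assignment described in the proposal can be sketched as below. This is an illustrative Python model under the stated assumptions (sort by doc count, then deal segments out round-robin), not the PoC code itself:

```python
def round_robin_slices(segment_doc_counts, max_slice_count):
    """Assign segments to at most max_slice_count slices, round-robin.

    Sorting largest-first before dealing keeps the per-slice document
    counts roughly balanced.
    """
    sorted_segs = sorted(segment_doc_counts, reverse=True)
    num_slices = min(max_slice_count, len(sorted_segs))
    slices = [[] for _ in range(num_slices)]
    for i, docs in enumerate(sorted_segs):
        slices[i % num_slices].append(docs)
    return slices

# Same 11-segment example, capped at 4 slices instead of the default 7.
segments = [300_000] * 5 + [1_000] * 6
print([len(s) for s in round_robin_slices(segments, 4)])  # [3, 3, 3, 2]
```

With the cap in place, the number of concurrent tasks per request is bounded by the setting rather than growing with the segment count.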
**Next Steps:** I only see |
Currently, Lucene provides a default mechanism for slice computation when the concurrent model is used. It uses 5 segments or 250K documents as the limits for assigning segments to each slice. This slice computation defines the number of concurrent tasks with which the query phase will be executed. This task is to provide a mechanism to dynamically change the slice computation so that it can be customized for different workloads. For more context see #6798
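A cluster-level setting, as proposed, might be applied via the cluster settings API along these lines. The setting name here is purely illustrative (the final name was not decided in this issue):

```shell
# Hypothetical setting name; shown only to illustrate the cluster-level
# configuration shape proposed above.
curl -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "search.concurrent_segment_search.max_slice_count": 4
  }
}'
```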