Enabling Intra Segment Concurrent Search for only queries without aggregations #18879
Conversation
❌ Gradle check result for 76f81c6: FAILURE. Please examine the workflow log, locate and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Signed-off-by: expani <anijainc@amazon.com>
Code under review:

```java
public void evaluateRequestShouldUseIntraSegmentConcurrentSearch() {
    // ...
        && Boolean.TRUE.equals(requestShouldUseConcurrentIntraSegmentSearch.get());
}
```
Should this method be private?

Also, can we refactor it to avoid multiple if-else branches:

```java
private void evaluateRequestShouldUseIntraSegmentConcurrentSearch() {
    // Check for any condition that forces concurrency to be disabled.
    final boolean isUnsupported = (sort != null && sort.isSortOnTimeSeriesField())
        || (aggregations() != null)
        || (terminateAfter != DEFAULT_TERMINATE_AFTER)
        || (trackTotalHitsUpTo != TRACK_TOTAL_HITS_DISABLED); // TODO: Need to handle TotalHitCountCollectorManager
    // Check whether concurrency is explicitly enabled.
    final boolean isEnabledByMode = concurrentIntraSegmentSearchMode.equals(CONCURRENT_SEGMENT_SEARCH_MODE_ALL); // TODO: Handle auto mode here
    requestShouldUseConcurrentIntraSegmentSearch.set(isEnabledByMode && !isUnsupported);
}
```
A few comments:

- The `partition_size` setting is confusing. Why not derive the optimal partitions from segment size and thread pool capacity? Fixed partition sizes don't adapt to varying segment characteristics.
- Disabling for `track_total_hits=true` makes sense initially, but it severely limits adoption: most production queries need hit counts. It would be good to have a concrete plan for TotalHitCountCollector support.
- Consider a unified namespace: `partition_size` as an integer is problematic, since segments vary from 1 KB to 10 GB. Consider one of the following:
- The conditional logic in QueryPhase is getting complex. Extract a strategy pattern:

```java
interface SearchExecutionStrategy {
    boolean canExecute(SearchContext context);
    void execute(SearchContext context) throws IOException;
}
```

- Thread safety is unclear for intra-segment collectors. Please document the collector state isolation guarantees. Are FieldDoc comparisons thread-safe across partition boundaries? Benchmarks providing the following data would be good:
- How are results combined from the N partitions? If it is a linear merge, that negates the benefits for large result sets; we should do a heap-based merge for top-K queries.
- What happens when `partition_size` exceeds the segment doc count? We should gracefully degrade to single-threaded execution, not fail or create empty partitions.
- The sorting exclusion logic is incomplete. It only checks time series fields, but concurrent search breaks with any custom scoring/sorting that depends on document order across segment boundaries.
- The terminateAfter interaction seems problematic: if each partition applies the terminateAfter limit independently, the shard can return more documents than the intended global limit. We should have global termination coordination.
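On the heap-based merge point: a minimal sketch (not OpenSearch code; class and method names here are invented for illustration) of merging N per-partition sorted score lists into a global top-K in O(K log N) instead of concatenating and re-sorting all hits:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class TopKMerge {

    // One cursor per partition's result list, assumed sorted by descending score.
    static final class Cursor {
        final float[] scores;
        int pos;
        Cursor(float[] scores) { this.scores = scores; }
        float current() { return scores[pos]; }
        boolean exhausted() { return pos >= scores.length; }
    }

    // Pop the globally best hit, advance that partition's cursor, repeat K times.
    static float[] mergeTopK(List<float[]> partitions, int k) {
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((a, b) -> Float.compare(b.current(), a.current()));
        for (float[] p : partitions) {
            if (p.length > 0) heap.add(new Cursor(p));
        }
        List<Float> out = new ArrayList<>();
        while (out.size() < k && !heap.isEmpty()) {
            Cursor c = heap.poll();          // best remaining hit across partitions
            out.add(c.current());
            c.pos++;                         // advance while the cursor is off the heap
            if (!c.exhausted()) heap.add(c); // re-heapify with the new head score
        }
        float[] result = new float[out.size()];
        for (int i = 0; i < result.length; i++) result[i] = out.get(i);
        return result;
    }
}
```

The heap holds one entry per partition, so the merge cost scales with K and the partition count rather than with the total number of collected hits.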
Please refer to: #18879 (comment)
It is indicative of the smallest segment partition size below which splitting would not be performant. Consider a segment with 10 docs: we wouldn't want to split it into 3 partitions of [4, 3, 3] docs. Currently, the default of
See #18854 for the same.
This is unclear; there are no changes in QueryPhase in this PR.
We would never create empty partitions. It is handled in the code.
These are existing limitations of Concurrent Segment Search which can be handled separately instead of being tied to this PR.
Every segment's partition will have its own collector. Any collector that shares state, like TotalHitCount, is disabled; the others have been kept the same as in Concurrent Segment Search. Let me know if you had any specific one in mind.
Partition size is there to ensure we don't split a segment below a certain threshold; it doesn't mean that every segment is split at that specific size. Currently, we use the LPT (Longest Processing Time) algorithm from Concurrent Segment Search, which benefits from small partition sizes when achieving an equal distribution. Consider a segment with 1_000_000 docs, a partition size of 1000, and 3 slices: after distribution it would end up as [334_000, 333_000, 333_000].
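To illustrate the role of the partition-size floor described above, a hypothetical sketch (names invented; not the PR's implementation) that splits one segment's doc count into at most `slices` partitions while never creating a partition smaller than the floor:

```java
public class SegmentPartitioner {

    // Split docCount into at most `slices` near-equal partitions, capping the
    // partition count so no partition drops below minPartitionSize.
    static int[] partition(int docCount, int slices, int minPartitionSize) {
        int parts = Math.max(1, Math.min(slices, docCount / minPartitionSize));
        int[] sizes = new int[parts];
        int base = docCount / parts;
        int remainder = docCount % parts;
        for (int i = 0; i < parts; i++) {
            // Spread the remainder one doc at a time across the first partitions.
            sizes[i] = base + (i < remainder ? 1 : 0);
        }
        return sizes;
    }
}
```

With 1_000_000 docs, 3 slices, and a floor of 1000 this yields [334_000, 333_000, 333_000]; a 10-doc segment with a floor of 8 collapses to a single partition, which is the graceful degradation discussed in the review.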
The design seems sound. I had these follow-ups:
I also prefer to derive the partition size as a function of
@jainankitk The primary question we need to answer is: what's the smallest size a partition of a segment can have? It only makes sense to partition a segment if Time(Create Collector, BulkScorer, DISI) << Time(Score/Match the documents), which is very dependent on the complexity of the query. Aggregations add another layer of complexity. Percentage-based sizing can still allow very small partition sizes if a shard only has small segments. So we need to get a sensible default from our existing known workloads like Big5, Http Logs, etc., and still give control to end users via a cluster setting, an index setting, and an execution hint.
The primary goal is to reduce the overall query execution time and make it better than Concurrent Segment Search. I am planning to ditch this PR in favor of a different one where I will expand more on these lines and give some pseudocode for the implementation before diving into code. Will post in #18851
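The break-even intuition above can be put into a back-of-envelope model (purely illustrative, not from the PR): partitioning pays off only when the fixed per-partition setup cost (creating the Collector, BulkScorer, DISI) is small relative to the scoring time saved by running partitions in parallel:

```java
public class PartitionCostModel {

    // Estimated wall time when scoring is split across `parts` parallel
    // partitions, each paying a fixed setup cost before scoring its share.
    static double estimatedTime(double scoringTime, double setupCost, int parts) {
        return setupCost + scoringTime / parts;
    }

    // Partitioning helps only if the parallel estimate beats the
    // single-threaded scoring time.
    static boolean worthPartitioning(double scoringTime, double setupCost, int parts) {
        return estimatedTime(scoringTime, setupCost, parts) < scoringTime;
    }
}
```

For a heavy query (scoring time 100, setup cost 1, 3 partitions) the split wins; for a tiny partition (scoring time 1, setup cost 10) it loses, which is exactly why a minimum partition size exists.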
One collector is created per Slice and the thread in
Aggregations are disabled and TotalHitCount is disabled, which are the stateful collectors I am aware of at the moment. If you feel there is some missing case, do let me know.
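To illustrate the state-isolation argument, a hypothetical sketch (not the PR's collectors) of the pattern where each partition owns its own hit counter and counts are combined only at reduce time, so no mutable state is shared across partition threads:

```java
public class PerPartitionCounting {

    // Per-partition collector: owns its own count, no shared mutable state,
    // so each partition thread can collect without synchronization.
    static final class PartitionHitCounter {
        private long count;
        void collect() { count++; }
        long count() { return count; }
    }

    // Reduce step: combine per-partition counts after all partitions finish.
    static long reduce(PartitionHitCounter[] partitions) {
        long total = 0;
        for (PartitionHitCounter p : partitions) {
            total += p.count();
        }
        return total;
    }
}
```

A collector that instead incremented one shared counter from every partition would be stateful in the sense discussed above, which is why such collectors are kept on the Concurrent Segment Search path for now.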
That's a good point. You mean via plugins extending? Will add the same in #18851
I am planning to ditch this PR in favor of a different one where I will expand more on these lines and give some psuedo code for the implementation before diving into code. Will post in #18851
Since we don't want to continue this PR in favor of a slightly different approach, can we move this PR to a draft state?
Description
Enabling Intra Segment Concurrent Search for queries without aggregations.
Introduced Cluster and Index level settings similar to Concurrent Segment Search.
Additionally, there is an extra cluster/index setting named `partition_size` which will control how large segments should be sliced/partitioned.

Related Issues
Resolves #18851
Partially resolves #18849, as we are not introducing an Auto mode or an execution hint in this PR.
Another limitation of the current PR is that Intra Segment Search will only work if `track_total_hits` is disabled. Will follow up in a separate PR; it needs handling of TotalHitCountCollector similar to how Lucene did. Wrote more about this in #18854