-
Couldn't load subscription status.
- Fork 2.3k
[Approximation Framework] Extending approximation to top-level term queries on numeric fields #18679
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Approximation Framework] Extending approximation to top-level term queries on numeric fields #18679
Changes from all commits
d1ce965
dc161bc
18d5180
74ff757
83f47e3
0fca1da
1c21092
ef56a28
03936ee
5fce0a8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -36,6 +36,7 @@ | |
| import org.opensearch.search.sort.SortOrder; | ||
|
|
||
| import java.io.IOException; | ||
| import java.util.Arrays; | ||
| import java.util.Objects; | ||
| import java.util.function.Function; | ||
|
|
||
|
|
@@ -392,7 +393,12 @@ public long cost() { | |
|
|
||
| @Override | ||
| public Scorer get(long leadCost) throws IOException { | ||
| intersectRight(values.getPointTree(), visitor, docCount); | ||
| // Force term queries with desc sort through intersectLeft | ||
| if (Arrays.equals(pointRangeQuery.getLowerPoint(), pointRangeQuery.getUpperPoint())) { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How does it work with ? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Today we can run term with sort, I want to make sure the results does not break, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is not a good change There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This change doesn't affect term queries where the primary sort field is different from the term query's field. In ApproximatePointRangeQuery's canApproximate, this statement gets executed. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since there is a singular unique point value in a numeric term query and the tiebreaker ends up being the docID, using intersectLeft results in the same top-k results as intersectRight. I'm also fine in disabling approximation for this specific case (term query with descending sort) due to performance concerns (queue has more comparisons to make) until intersectRight logic can be modified to return accurate hits for these kinds of queries. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If this is risky, can leave out this change and create an issue for it? I'm to merge in incremental changes, given there's enough test coverage. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, that's what I was thinking as well. For the time being, we could disable approximation and send desc sorts on top level term queries through Lucene while this edge case is being sorted out. |
||
| intersectLeft(values.getPointTree(), visitor, docCount); | ||
| } else { | ||
| intersectRight(values.getPointTree(), visitor, docCount); | ||
| } | ||
| DocIdSetIterator iterator = result.build().iterator(); | ||
| return new ConstantScoreScorer(score(), scoreMode, iterator); | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we dont change anything in
ScaledFloatFieldMapperbecause thetermQueryuses theNumberFieldMapper.NumberType.LONG.termQueryThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup,
ScaledFloatFieldMapperimplements atermQuerythat usesNumberFieldMapper.NumberType.LONG.termQuery()which would fall under approximation.