Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parent join support for lucene knn #1182

Merged
merged 1 commit into from
Oct 10, 2023

Conversation

heemin32
Copy link
Collaborator

@heemin32 heemin32 commented Oct 2, 2023

Description

Call DiversifyingChildren[Byte|Float]KnnVectorQuery for nested field so that k number of parent document can be returned in search result

Issues Resolved

#1065

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@navneet1v
Copy link
Collaborator

@heemin32 can you add an example how you tested this change?

@heemin32
Copy link
Collaborator Author

heemin32 commented Oct 3, 2023

@heemin32 can you add an example how you tested this change?

Added test scenario in #1065

Call DiversifyingChildren[Byte|Float]KnnVectorQuery for nested field so that k number of parent document can be returned in search result

Signed-off-by: Heemin Kim <heemin@amazon.com>
@codecov
Copy link

codecov bot commented Oct 3, 2023

Codecov Report

Merging #1182 (9cc75c7) into main (78aba55) will increase coverage by 0.06%.
Report is 1 commits behind head on main.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##               main    #1182      +/-   ##
============================================
+ Coverage     84.95%   85.01%   +0.06%     
- Complexity     1190     1210      +20     
============================================
  Files           159      160       +1     
  Lines          4838     4932      +94     
  Branches        440      449       +9     
============================================
+ Hits           4110     4193      +83     
- Misses          530      540      +10     
- Partials        198      199       +1     
Files Coverage Δ
...n/java/org/opensearch/knn/common/KNNConstants.java 94.11% <ø> (ø)
...rg/opensearch/knn/index/query/KNNQueryFactory.java 86.27% <100.00%> (-1.45%) ⬇️

... and 6 files with indirect coverage changes

@@ -86,10 +89,12 @@ public static Query create(CreateQueryRequest createQueryRequest) {
return new KNNQuery(fieldName, vector, k, indexName);
}

log.debug(String.format("Creating Lucene k-NN query for index: %s \"\", field: %s \"\", k: %d", indexName, fieldName, k));
BitSetProducer parentFilter = createQueryRequest.context == null ? null : createQueryRequest.context.getParentFilter();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

who is setting this parent filter in the CreateQueryRequest?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@navneet1v
Copy link
Collaborator

@heemin32 can we add benchmarks around this feature, what your thought?

@heemin32
Copy link
Collaborator Author

@heemin32 can we add benchmarks around this feature, what your thought?

That is a great idea. However, I would like to defer the effort until there is strong needs for the benchmark as we need to construct test data by ourselves for the nested field which might take some amount of efforts.

@heemin32 heemin32 merged commit a496926 into opensearch-project:main Oct 10, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 10, 2023
Call DiversifyingChildren[Byte|Float]KnnVectorQuery for nested field so that k number of parent document can be returned in search result

Signed-off-by: Heemin Kim <heemin@amazon.com>
(cherry picked from commit a496926)
heemin32 added a commit to heemin32/k-NN that referenced this pull request Oct 11, 2023
Call DiversifyingChildren[Byte|Float]KnnVectorQuery for nested field so that k number of parent document can be returned in search result

Signed-off-by: Heemin Kim <heemin@amazon.com>
(cherry picked from commit a496926)
heemin32 added a commit to heemin32/k-NN that referenced this pull request Nov 7, 2023
Call DiversifyingChildren[Byte|Float]KnnVectorQuery for nested field so that k number of parent document can be returned in search result

Signed-off-by: Heemin Kim <heemin@amazon.com>
(cherry picked from commit a496926)
heemin32 added a commit to heemin32/k-NN that referenced this pull request Nov 7, 2023
Call DiversifyingChildren[Byte|Float]KnnVectorQuery for nested field so that k number of parent document can be returned in search result

Signed-off-by: Heemin Kim <heemin@amazon.com>
(cherry picked from commit a496926)
heemin32 added a commit that referenced this pull request Nov 7, 2023
Call DiversifyingChildren[Byte|Float]KnnVectorQuery for nested field so that k number of parent document can be returned in search result

Signed-off-by: Heemin Kim <heemin@amazon.com>
(cherry picked from commit a496926)

Co-authored-by: Heemin Kim <heemin@amazon.com>
@heemin32 heemin32 deleted the join branch July 18, 2024 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants