-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Semantic_text match_all with Highlighter #128702
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Semantic_text match_all with Highlighter #128702
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice start, I've added some high level comments on overall organization. We can do a deeper review when it's closer to ready.
.../java/org/elasticsearch/xpack/inference/queries/SemanticMatchAllQueryRewriteInterceptor.java
Outdated
Show resolved
Hide resolved
.../java/org/elasticsearch/xpack/inference/queries/SemanticMatchAllQueryRewriteInterceptor.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/elasticsearch/xpack/inference/queries/SemanticQueryRewriteInterceptor.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a much simpler way to fix this by updating the semantic highlighter. Take a look at how you can change QueryVisitor
in extractDenseVectorQueries
and extractSparseVectorQueries
: https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/highlight/SemanticTextHighlighter.java#L250
@elasticmachine update branch |
Pinging @elastic/search-eng (Team:SearchOrg) |
Pinging @elastic/search-relevance (Team:Search - Relevance) |
Hi @Samiul-TheSoccerFan, I've created a changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for these changes, looks great @Samiul-TheSoccerFan !
docs/changelog/128702.yaml
Outdated
@@ -0,0 +1,5 @@ | |||
pr: 128702 | |||
summary: Highlighting support added to semantic_text with `match_all` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: Since this is a bug, perhaps we call the changelog summary something like `Fix issue where highlighting was not returned with match_all queries"
@@ -267,6 +268,8 @@ public void visitLeaf(Query query) { | |||
queries.add(fieldType.createExactKnnQuery(VectorData.fromFloats(knnQuery.getTargetCopy()), null)); | |||
} else if (query instanceof KnnByteVectorQuery knnQuery) { | |||
queries.add(fieldType.createExactKnnQuery(VectorData.fromBytes(knnQuery.getTargetCopy()), null)); | |||
} else if (query instanceof MatchAllDocsQuery) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😍 Nice optimization from the first solution!
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach looks good! Caught a copy/paste error in the tests that we should fix though.
...nce/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter.yml
Outdated
Show resolved
Hide resolved
...src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter_bwc.yml
Outdated
Show resolved
Hide resolved
...nce/src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter.yml
Outdated
Show resolved
Hide resolved
...src/yamlRestTest/resources/rest-api-spec/test/inference/90_semantic_text_highlighter_bwc.yml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Good clean solution and tests :)
@elasticmachine update branch |
@elasticmachine update branch |
💔 Backport failed
You can use sqren/backport to manually backport by running |
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
* initial implementation for match_All * reformat * [CI] Auto commit changes from spotless * Excluding matchAllintercepter * Adding matchAllDocs support for vector fields * [CI] Auto commit changes from spotless * Remove previous implementation * Adding yaml tests for match_all * fixed yaml tests * Update docs/changelog/128702.yaml * Update changelog * changelog - update summary * Fix wrong inference names for the yaml tests --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> (cherry picked from commit d1b5532) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java
* initial implementation for match_All * reformat * [CI] Auto commit changes from spotless * Excluding matchAllintercepter * Adding matchAllDocs support for vector fields * [CI] Auto commit changes from spotless * Remove previous implementation * Adding yaml tests for match_all * fixed yaml tests * Update docs/changelog/128702.yaml * Update changelog * changelog - update summary * Fix wrong inference names for the yaml tests --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> (cherry picked from commit d1b5532) # Conflicts: # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java
This PR addresses the missing
highlighting
for semantic_text fields when amatch_all
query is used. We guide users to use semantic highlighting to view their chunks, but highlighting does not currently return any results withmatch_all
. Highlighting is usually unnecessary for non-inference fields withmatch_all
, but it's essential forsemantic_text
fields to return the generated chunks. This change ensures those highlighting results are returned as expected.Test cases:
Only text fields
only inference fields
Both infer and non-inference fields
Both dense vector fields
Some additional cases