-
Notifications
You must be signed in to change notification settings - Fork 686
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MB-57926 - Searcher for similarity search #1849
MB-57926 - Searcher for similarity search #1849
Conversation
8898a22
to
43dc5c1
Compare
index_impl.go
Outdated
searchQuery := req.Query | ||
if req.Similarity != nil { | ||
similarityQuery := query.NewSimilarityQuery(req.Similarity.Vector) | ||
similarityQuery.SetFieldVal(req.Similarity.Field) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking if it makes sense to have this as a set of fields instead, and what you're trying to do is apply the similarity search over each of those fields. So, you'd be getting the similar vectors from all these indexed fields and do a conjunction of these sub-similarity searchers and the user's non-similarity query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per discussion, the user can enter multiple fields, not just one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abhinavdangeti do you think it's a valid use case where the user enters multiple fields against 1 vector?
@Thejas-bhat mentioned that a user could have multiple fields with dense vector content and might want to query the same vector across them and hence, the need for multiple fields.
43dc5c1
to
67f4589
Compare
search.go
Outdated
@@ -282,9 +282,16 @@ type SearchRequest struct { | |||
SearchAfter []string `json:"search_after"` | |||
SearchBefore []string `json:"search_before"` | |||
|
|||
Similarity *SimilarityRequest `json:"similarity"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is Similarity an appropriate name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be vector-specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use kNN everywhere - KNNSearcher, kNNMatch etc.
ecc4bd7
to
4f02605
Compare
4f02605
to
cddca18
Compare
46d693a
to
a3de9ff
Compare
7c112f4
to
b6fa9b5
Compare
b6fa9b5
to
c1d2697
Compare
@metonymic-smokey proposing a few changes around score/boost for searcher that will be relevant here in the spec. Let's catch up tomorrow. |
b7c4700
to
fc67d5f
Compare
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you might need to check the default mapping for the path as well (if the type mappings don't have the path in it). just like how AnalyzerNameForPath(paht string) does
@@ -55,4 +55,6 @@ type IndexMapping interface { | |||
|
|||
AnalyzerNameForPath(path string) string | |||
AnalyzerNamed(name string) analysis.Analyzer | |||
|
|||
FieldMappingForPath(path string) *FieldMapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps its better to not return a pointer to the fieldMapping. mainly because i think this interface deals with reading of the content in the index mapping, and passing a pointer to the fieldMapping could lead to users having the option to modify the config of the field mapping at instances when they aren't allowed to do so (for example query time).
Closing in favor of #1869 |
No description provided.