-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Add a DoubleValuesSource for scoring full precision vector similarity #14708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR. |
+1 to add support for full precision re-ranking. Have you considered writing a FullPrecisionVectorSimilaritySource as a separate class? We like to avoid conditional logic on boolean parameters where possible. I don't know if there is really a need for a byte-flavored version either? For the Query support we can expect the query to be supplied with high precision. During indexing wouldn't we want to re-rank using non-quantized query vectors as well as full-precision document vectors? Not sure if that can be solved using a DoubleValuesSource - we would probably need to bake that in to the codec/hnswsearcher etc. I'm not sure if that is envisioned, anyway, and it makes sense to start with the Query-side. |
A separate class would allow users to provide a custom vector comparator, which might be beneficial. But I agree, a new similarity source is a good idea here! |
Thanks for the review folks! I like the idea of a separate class and a custom vector comparator, will make these changes. |
I'm not sure about the byte vector case myself. Do we see a viable need for it in FullPrecisionVectorSimilaritySource ? |
I am not sure. As of right now, none of the quantization schemes support vectors that are already bytes. But, for future proofing, I think the name of the new similarity source should indicate the appropriate vector type right? Maybe we support byte in the future? |
@Override | ||
public float score() throws IOException { | ||
return vectorSimilarityFunction.compare( | ||
queryVector, vectorValues.vectorValue(iterator.index())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the possible concern for the reranking is that calling this will page in the full vector data (or what is being used in the iterator), and can compete the RAM with normal vectors used in HNSW graph search. As HNSW will suffer the performance if the vectors are not in RAM, I'm wondering if we can restrict the memory used by the re-ranking phase. Or alternatively, using mlock
(could be tricky in Java) to lock in the normal vectors page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be done separately too. I'm just curious what you think about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The answer here is directIO for ranking. mlock feels tricky to implement well in Java land.
But I don't think this concern should block this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a valid concern for setups with limited memory.
As HNSW will suffer the performance if the vectors are not in RAM, I'm wondering if we can restrict the memory used by the re-ranking phase.
Maybe.. I wonder how we decide that the pages used for HNSW search are more important than pages used for FP reranking. For an application which does KNN search and reranks via full precision vectors, a query doesn't really complete until both phases are done. Wouldn't thrashing queries during reranking add to overall query latency. Maybe this is okay if you were reranking for only a subset of queries, and the vast majority is still only HNSW search, but that seems very use-case specific. Might be best to let OS page cache handle this?
Anyway, I think this deserves it's own discussion, perhaps in a separate issue? And as you and others have already mentioned, it can be handled independent of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do remember reading some results on the DiskANN issue where benchmarks indicated that having the vectors needed for ANN graph search in memory (the quantized vectors in this case), does lead to better performance. So maybe, an option to use only DIRECT_IO for this makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm putting #14746 for further discussion on this topic.
08dccad
to
07d44de
Compare
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR. |
Moved full precision scores logic to a separate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you
lucene/core/src/test/org/apache/lucene/search/TestQuantizedVectorSimilarityValueSource.java
Outdated
Show resolved
Hide resolved
0ae1316
to
262a6c2
Compare
Thanks all for reviewing this PR. I've addressed the comments above. Will merge this early next week, unless there is any new feedback. |
A
DoubleValuesSource
that scores on full precision vectors can be used on top of quantized knn vector queries to rerank with full precision similarity scores.This change adds a 'full precision' similarity mode to existing
VectorSimilarityValuesSource
s.Supports #14009