
Add a DoubleValuesSource for scoring full precision vector similarity #14708


Merged
11 commits merged into apache:main on Jun 17, 2025

Conversation

vigyasharma
Contributor

A DoubleValuesSource that scores on full precision vectors can be used on top of quantized knn vector queries to rerank with full precision similarity scores.

This change adds a 'full precision' similarity mode to existing VectorSimilarityValuesSources.
Supports #14009
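To illustrate the idea behind the PR, here is a self-contained sketch (not Lucene code — the naive scalar quantization and class names are simplified stand-ins): candidates are first ranked by a cheap quantized score, then the top hits are re-scored with the original full-precision vectors.

```java
import java.util.Arrays;

// Illustrative quantize-then-rerank sketch (not actual Lucene code).
public class RerankSketch {
    // Naive uniform scalar quantization into int8 buckets (a simplified stand-in
    // for the quantization schemes Lucene actually uses).
    static byte[] quantize(float[] v, float scale) {
        byte[] q = new byte[v.length];
        for (int i = 0; i < v.length; i++) {
            q[i] = (byte) Math.max(-127, Math.min(127, Math.round(v[i] / scale)));
        }
        return q;
    }

    static float dot(float[] a, float[] b) {
        float s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    static int dot(byte[] a, byte[] b) {
        int s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    public static void main(String[] args) {
        float scale = 0.05f;
        float[] query = {0.9f, 0.1f};
        float[][] docs = {{0.88f, 0.12f}, {0.91f, 0.09f}, {0.10f, 0.95f}};
        byte[] qq = quantize(query, scale);

        // Phase 1: rank all docs by the cheap quantized score; keep the top 2.
        // Docs 0 and 1 tie here: quantization has lost the distinction between them.
        Integer[] order = {0, 1, 2};
        Arrays.sort(order, (i, j) -> Integer.compare(
            dot(qq, quantize(docs[j], scale)), dot(qq, quantize(docs[i], scale))));
        Integer[] top = Arrays.copyOf(order, 2);

        // Phase 2: rerank the candidates with exact full-precision scores.
        Arrays.sort(top, (i, j) -> Float.compare(dot(query, docs[j]), dot(query, docs[i])));
        System.out.println("best doc after rerank: " + top[0]); // prints "best doc after rerank: 1"
    }
}
```

In the actual PR, phase 1 would be a quantized KNN vector query and phase 2 would read full-precision vectors through the new values source.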


This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR.

@msokolov
Contributor

msokolov commented May 28, 2025

+1 to adding support for full precision re-ranking. Have you considered writing a FullPrecisionVectorSimilaritySource as a separate class? We like to avoid conditional logic on boolean parameters where possible. I also don't know if there is really a need for a byte-flavored version; for Query support, we can expect the query vector to be supplied in full precision.

During indexing, wouldn't we want to re-rank using non-quantized query vectors as well as full-precision document vectors? I'm not sure that can be solved with a DoubleValuesSource; we would probably need to bake it into the codec, the HNSW searcher, etc. I'm not sure that is envisioned anyway, and it makes sense to start with the Query side.

@benwtrent
Member

Have you considered writing a FullPrecisionVectorSimilaritySource as a separate class?

A separate class would allow users to provide a custom vector comparator, which might be beneficial.

But I agree, a new similarity source is a good idea here!

@vigyasharma
Contributor Author

Thanks for the review folks! I like the idea of a separate class and a custom vector comparator, will make these changes.

@vigyasharma
Contributor Author

I'm not sure about the byte vector case myself. Do we see a viable need for it in FullPrecisionVectorSimilaritySource?

@benwtrent
Member

I'm not sure about the byte vector case myself. Do we see a viable need for it in FullPrecisionVectorSimilaritySource?

I am not sure. As of right now, none of the quantization schemes support vectors that are already bytes.

But for future proofing, I think the name of the new similarity source should indicate the vector type it applies to, right? Maybe we'll support byte in the future?

@Override
public float score() throws IOException {
  return vectorSimilarityFunction.compare(
      queryVector, vectorValues.vectorValue(iterator.index()));
}
Contributor


One possible concern with reranking is that calling this will page in the full-precision vector data (or whatever the iterator reads), which can compete for RAM with the quantized vectors used in HNSW graph search. Since HNSW performance suffers when those vectors are not in RAM, I'm wondering if we can restrict the memory used by the re-ranking phase. Alternatively, we could use mlock (which could be tricky in Java) to pin the pages holding the graph-search vectors.

Contributor


This can be done separately too. I'm just curious what you think about this.

Member


The answer here is directIO for ranking. mlock feels tricky to implement well in Java land.

But I don't think this concern should block this PR.

Contributor Author


It's a valid concern for setups with limited memory.

As HNSW performance will suffer if the vectors are not in RAM, I'm wondering if we can restrict the memory used by the re-ranking phase.

Maybe. I wonder how we would decide that the pages used for HNSW search are more important than the pages used for full-precision reranking. For an application that does KNN search and reranks via full precision vectors, a query doesn't really complete until both phases are done, so wouldn't thrashing during reranking add to overall query latency? This might be okay if you were reranking only a subset of queries while the vast majority used HNSW search alone, but that seems very use-case specific. Might it be best to let the OS page cache handle this?

Anyway, I think this deserves its own discussion, perhaps in a separate issue. As you and others have already mentioned, it can be handled independently of this PR.

Contributor Author


I do remember reading some benchmark results on the DiskANN issue indicating that keeping the vectors needed for ANN graph search in memory (the quantized vectors, in this case) leads to better performance. So maybe an option to use DIRECT_IO only for reranking makes sense.

Contributor


I've opened #14746 for further discussion on this topic.


@github-actions github-actions bot added this to the 10.3.0 milestone Jun 2, 2025
@vigyasharma
Contributor Author

Moved the full precision scoring logic to a separate FullPrecisionFloatVectorSimilarityValuesSource that can take a custom vector similarity function.
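A rough sketch of the shape this takes (all names here are hypothetical stand-ins, not the actual Lucene classes): a per-document scorer that holds the full-precision query vector and delegates to a pluggable comparator, analogous to how the new values source accepts a custom similarity function.

```java
import java.util.function.BiFunction;

// Hypothetical sketch of the idea behind FullPrecisionFloatVectorSimilarityValuesSource.
// The float[][] "index" and the BiFunction comparator are illustrative stand-ins,
// not Lucene's FloatVectorValues or VectorSimilarityFunction.
public class FullPrecisionScorerSketch {
    private final float[] queryVector;
    private final float[][] docVectors; // stand-in for per-segment vector values
    private final BiFunction<float[], float[], Float> similarity; // pluggable comparator

    public FullPrecisionScorerSketch(float[] queryVector, float[][] docVectors,
                                     BiFunction<float[], float[], Float> similarity) {
        this.queryVector = queryVector;
        this.docVectors = docVectors;
        this.similarity = similarity;
    }

    // Analogue of DoubleValues.doubleValue() for a given doc ordinal:
    // always compares against the full-precision stored vector.
    public float score(int ord) {
        return similarity.apply(queryVector, docVectors[ord]);
    }

    public static void main(String[] args) {
        // Custom comparator: plain dot product.
        BiFunction<float[], float[], Float> dot = (a, b) -> {
            float s = 0;
            for (int i = 0; i < a.length; i++) s += a[i] * b[i];
            return s;
        };
        FullPrecisionScorerSketch src = new FullPrecisionScorerSketch(
            new float[]{1f, 0f}, new float[][]{{0.5f, 0.5f}, {1f, 0f}}, dot);
        System.out.println(src.score(1)); // prints "1.0"
    }
}
```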

Contributor

@dungba88 left a comment


LGTM, thank you

@vigyasharma
Contributor Author

Thanks all for reviewing this PR. I've addressed the comments above. Will merge this early next week, unless there is any new feedback.

@vigyasharma vigyasharma merged commit 72a655c into apache:main Jun 17, 2025
7 checks passed