Introduce getQuantizedVectorValues method in LeafReader to access QuantizedByteVectorValues #14792
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR. |
@Pulkitg64 I don't understand how this is part of #13158. I would have thought the APIs stay the same. Quantization should be able to "rehydrate" the quantized vectors into floating point (or whatever the original values were). So, the segment, depending on what data it has access to, will:
Either way, users should still be able to call I would think there is a sub-class called But maybe we add an But it is likely useless for the user to have access to the quantized bytes directly as they don't provide much value without knowing how to use them. |
Thanks @benwtrent for the quick review and comments. Regarding your comment about how this relates to issue #13158: I agree that, in a way, this PR doesn't directly help create a "read-only" index as mentioned in the issue. Let me clarify the motivation. This PR addresses a scenario where:
Currently, there's no way to directly access quantized vectors - we can only access raw vectors. But if raw vectors are dropped, this causes errors. This PR adds methods to access ByteQuantizedVectors in such cases. As for the usefulness of accessing quantized bytes directly - we have specific use cases, such as returning the vectors themselves when requested in a query. Please let me know your thoughts. Regarding accessing quantized vectors directly - we could also consider using the QuantizedVectorValues class, which is currently returned by the getFloatVectorValues method. While this class wraps both raw and quantized vectors, its members are private, preventing direct access to the quantized vectors like we're doing in this PR. Would it make more sense to make the relevant members public in QuantizedVectorValues rather than adding getQuantizedVectorValues to LeafReader? |
I would assume the caller would want something akin to the I am saying that returning the quantized bytes, without knowing all the other information (quantized technique, the technique's parameters, etc.) is pretty useless. |
We are experimenting with large vector indexes, and since raw (unquantized) vectors consume significant disk space (4x more than quantized vectors), we want to drop the raw vectors from searcher machines. We currently use vector values for the following use cases:
For use case 1 we have started to use vectorScorer, which uses quantized vectors to compute the score, so we are good there. For use cases 2 and 3, we currently obtain floatVectorValues via getFloatVectorValues, but we need to switch to quantizedVectorValues since searchers won't have float vectors anymore, and we are fine accepting the accuracy loss from float-to-byte quantization. To address these use cases, we have two options:
I would like to know your thoughts on whether we should create such an API, and if you think the above use cases don't justify a new API, what are your thoughts on implementing the workaround solution and pushing it upstream? |
Why would you need to do this? Generally, I would assume that any access to the vector would be "Give me what I gave you", and the best we can do with quantized vectors is the dequantized vector. I don't fully understand how serializing a read-only segment that is missing files (e.g. missing the "vec" file) would work, but the format should do the right thing: see that the file isn't there and provide an approximate view of the floating point vectors.
I don't understand what this means really. Just counting how many vectors there are? This should be doable via the
Again, I think we should do the nice thing and de-quantize the vectors as the user asks for them. It should fully satisfy the Getting access to the raw quantized bytes is basically useless without all the other parameters that were used to quantize the vector. |
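To make the "de-quantize on request" idea concrete, here is a minimal sketch of generic uniform scalar quantization and its inverse. It is not Lucene's implementation: the class name DequantizeSketch, the method names, and the min/max/bits parameters are all hypothetical, and the code assumes each component was mapped linearly from [min, max] onto integer levels 0..(2^bits - 1). It also illustrates why the raw bytes alone are of little use: without min, max, and bits they cannot be turned back into floats.

```java
import java.util.Arrays;

/** Hypothetical sketch of uniform scalar quantization; not Lucene's implementation. */
final class DequantizeSketch {

  /** Maps each float, clamped to [min, max], to an integer level in 0..(2^bits - 1). */
  static byte[] quantize(float[] v, float min, float max, int bits) {
    int levels = (1 << bits) - 1;
    float scale = levels / (max - min);
    byte[] out = new byte[v.length];
    for (int i = 0; i < v.length; i++) {
      float clamped = Math.max(min, Math.min(max, v[i]));
      out[i] = (byte) Math.round((clamped - min) * scale);
    }
    return out;
  }

  /** Approximate inverse: needs the same min, max, and bits that were used to quantize. */
  static float[] dequantize(byte[] q, float min, float max, int bits) {
    int levels = (1 << bits) - 1;
    float step = (max - min) / levels;
    float[] out = new float[q.length];
    for (int i = 0; i < q.length; i++) {
      // Mask to read the byte as an unsigned level (matters when bits == 8).
      out[i] = min + (q[i] & 0xFF) * step;
    }
    return out;
  }

  public static void main(String[] args) {
    float[] original = {-0.8f, 0.0f, 0.33f, 0.9f};
    byte[] q = quantize(original, -1f, 1f, 7);
    float[] approx = dequantize(q, -1f, 1f, 7);
    System.out.println("quantized:   " + Arrays.toString(q));
    System.out.println("dequantized: " + Arrays.toString(approx));
  }
}
```

Under these assumptions the round-trip error per component is at most half a quantization step, (max - min) / (2^bits - 1) / 2, for values inside [min, max].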
Basically, I don't think callers should know if they are hitting quantized vectors or raw. Or at least they shouldn't have to make that decision up front. Requiring the user to pick the right thing seems unnecessary when we have the appropriate interfaces already. It's just all about determining how the format itself knows that it's missing the |
For the return values use case, another choice is to disable it in the case the original vectors were not "stored" in the searchable index. Otherwise, I agree with Ben that we could support "rehydration" in the codec. For example, suppose we see that we have zero full-precision vectors, but nonzero quantized vectors; then we could fall back to "rehydration". For the counting case (get total number of vectors), should we always use the quantized count where today we use the full-precision count? |
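The fallback described above (zero full-precision vectors but nonzero quantized vectors) could be sketched roughly as follows. Everything here is hypothetical: FloatView is a stand-in for a per-segment view of float vectors and is not a Lucene type, and the dequantization assumes a simple linear mapping with known min, max, and bits. The point is only that the caller asks for float vectors and never chooses between raw and quantized storage.

```java
import java.util.List;

/** Hypothetical sketch of a "rehydration" fallback; none of these types are Lucene APIs. */
final class FallbackSketch {

  /** Stand-in for a per-segment view of float vectors. */
  interface FloatView {
    int size();
    float[] vector(int ord);
  }

  /** View over full-precision vectors, if the segment still has them. */
  static FloatView rawView(List<float[]> raw) {
    return new FloatView() {
      @Override public int size() { return raw.size(); }
      @Override public float[] vector(int ord) { return raw.get(ord); }
    };
  }

  /** View that dequantizes lazily, assuming a linear mapping from [min, max]. */
  static FloatView dequantizedView(List<byte[]> quantized, float min, float max, int bits) {
    float step = (max - min) / ((1 << bits) - 1);
    return new FloatView() {
      @Override public int size() { return quantized.size(); }
      @Override public float[] vector(int ord) {
        byte[] q = quantized.get(ord);
        float[] out = new float[q.length];
        for (int i = 0; i < q.length; i++) {
          out[i] = min + (q[i] & 0xFF) * step;
        }
        return out;
      }
    };
  }

  /** Prefer raw vectors; fall back to the dequantized view when no raw vectors exist. */
  static FloatView select(FloatView raw, FloatView dequantized) {
    return raw.size() > 0 ? raw : dequantized;
  }

  public static void main(String[] args) {
    FloatView raw = rawView(List.of()); // raw vectors were dropped on this machine
    FloatView deq = dequantizedView(List.of(new byte[] {0, 64, 127}), -1f, 1f, 7);
    FloatView chosen = select(raw, deq);
    System.out.println("vector count: " + chosen.size());
    System.out.println("first component of vector 0: " + chosen.vector(0)[0]);
  }
}
```

In this sketch the count also comes from whichever view was selected, which is one possible answer to the counting question: when the raw vectors are absent, the quantized count is the only one available.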
Description
Introduce getQuantizedVectorValues method in LeafReader to access QuantizedVectorValues.
In a search architecture where searchers and writers run on separate machines, it is wasteful to keep raw float vectors on the searcher machines when vector quantization is enabled. This PR adds getQuantizedVectorValues to LeafReader, which helps read QuantizedByteVectors directly without needing to read the raw float vectors.
Partially solving #13158