Throw custom RecallError/RecallException when the number of requested neighbors cannot be returned #88
Description
During probabilistic construction of a Voyager index, the graph can become disconnected. During node insertion, some nodes have to prune neighbors via a neighbor selection heuristic to stay within the maximum number of neighbors per node. A node can be pruned by all of its neighbors, which splits the graph into multiple components.
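The effect of pruning on what a search can reach can be illustrated with a toy graph. This is a minimal sketch, not Voyager's implementation: it assumes a single layer with directed neighbor lists, and the names `reachable_from`, `graph`, and `entry_point` are hypothetical. Node 3 still links to other nodes, but every node that once pointed at it has pruned that link, so a traversal starting at the entry point never visits it.

```python
from collections import deque


def reachable_from(graph, entry):
    """Return the set of nodes a breadth-first traversal can visit from entry."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical neighbor lists after pruning: no list contains node 3 anymore.
graph = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1],
    3: [0, 1],  # outgoing links only; all incoming links were pruned
}

entry_point = 0
reachable = reachable_from(graph, entry_point)

# The number of nodes a query could return is bounded by what is reachable,
# which here is smaller than the number of nodes in the index.
max_k = len(reachable)
print(f"index size: {len(graph)}, reachable from entry point: {max_k}")
```

In this toy case a request for as many neighbors as there are nodes in the index cannot be satisfied, because only three of the four nodes can be traversed.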
The maximum number of neighbors that can be returned in a query is the size of the component that the entry point is in. Example: querying for k neighbors where k == size of index is not possible in a disconnected graph because not all nodes can be traversed.
Using a higher M value to construct the index will improve recall because this parameter controls the number of neighbors a node can have. Allowing more neighbors per node results in a lower probability of a disconnected graph. Note that a higher M value also increases construction time.
Related Issues
#38
Changes Made
C++
Throw a RecallError when the number of requested neighbors cannot be returned.
The error message suggests a higher M value to increase the recall performance.
Python
Map the C++ RecallError to a Python bindings RecallError:
voyager.RecallError: Fewer than expected results were retrieved; only found 10584356 of 10779975 requested neighbors. Reconstruct the index with a higher M value to increase recall.
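One way a caller might react to this error is sketched below. It is an illustration under stated assumptions, not part of this change: `query_with_fallback` and `fake_query` are hypothetical helpers, the local `RecallError` class is a stand-in so the sketch runs without the library installed, and parsing the "only found X of Y" text relies on the message format shown above, which may change. Retrying with the smaller count is also not guaranteed to succeed against a real index.

```python
import re


class RecallError(Exception):
    """Stand-in for voyager.RecallError so this sketch runs without the library."""


def query_with_fallback(query, vector, k, recall_error=RecallError):
    """Call query(vector, k); on a recall error, retry once with the reported count."""
    try:
        return query(vector, k)
    except recall_error as error:
        # Assumption: the message contains "only found X of Y", as in the example above.
        match = re.search(r"only found (\d+) of (\d+)", str(error))
        if match is None or int(match.group(1)) == 0:
            raise
        return query(vector, int(match.group(1)))


def fake_query(vector, k, reachable=3):
    """Hypothetical query function: only `reachable` items can ever be returned."""
    if k > reachable:
        raise RecallError(
            "Fewer than expected results were retrieved; "
            f"only found {reachable} of {k} requested neighbors."
        )
    return list(range(k))


print(query_with_fallback(fake_query, [0.0, 0.0], k=5))
```

With the real bindings, the exception type passed as `recall_error` would be `voyager.RecallError`. The more durable fix remains the one the message recommends: rebuilding the index with a higher M value.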
Java
Add a com.spotify.voyager.jni.exception.RecallException class to be thrown in Java when the native code throws a RecallError:
Exception com.spotify.voyager.jni.exception.RecallException: Fewer than expected results were retrieved; only found 10584356 of 10779975 requested neighbors. Reconstruct the index with a higher M value to increase recall.
Testing
Checklist
Additional Comments