-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] updates & invalid operations should also trigger persisting of local HNSW #2499
Conversation
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
with_persistent_hnsw_params=True, | ||
# By default, these are set to 2000, which makes it unlikely that index mutations will ever be fully flushed | ||
max_hnsw_sync_threshold=10, | ||
max_hnsw_batch_size=10, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without reducing the sync threshold, this test will almost never fully flush to disk, which makes the correctness guarantee weaker
without this change, the pickled metadata file was never loaded
@@ -336,6 +344,10 @@ def test_cycle_versions( | |||
name=collection_strategy.name, | |||
embedding_function=not_implemented_ef(), # type: ignore | |||
) | |||
|
|||
# Should be able to add embeddings | |||
coll.add(**embeddings_strategy) # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
causes the pickled metadata file to be loaded, a good thing to test since the class def changed and it needs special handling
9126437
to
b8268b7
Compare
Didn't catch after a refactor in #2499. This is not a correctness bug, this would have resulted in a persist/flush after every mutation if enough invalid operations had accumulated.
The benefits aren't immediately clear in this PR, but this allows future work to correctly track the maximum sequence ID.
Depends on #2511.