Improve collection handling and hashing consistency #203
Merged
Added robust parsing for SCHEMA_CACHE_TTL in indexing_admin.py so that invalid environment values are handled. Updated upload_service.py to better support demo mode and to fall back to Qdrant collections when authentication is disabled. Modified watch_index_core/processor.py to use xxhash for file hashing, for consistency with the pipeline, falling back to hashlib if xxhash is unavailable.
🤖 Augment PR Summary: Improves robustness around schema-cache TTL parsing, admin collection handling in demo mode, and watcher file hashing consistency.
Added validation for SCHEMA_CACHE_TTL_SECS to ensure it is finite and positive, defaulting to 300 seconds if invalid. Enhanced file hash computation to fall back to hashlib.sha1 on any xxhash runtime error, improving robustness.
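The TTL validation described in this commit could look roughly like the sketch below. This is not the code from indexing_admin.py: the helper name `parse_schema_cache_ttl` and the constant `DEFAULT_SCHEMA_CACHE_TTL_SECS` are hypothetical, and only the rule itself (finite, positive, otherwise 300 seconds) comes from the commit message.

```python
import math
import os

# 300 seconds is the default stated in the commit message.
DEFAULT_SCHEMA_CACHE_TTL_SECS = 300.0


def parse_schema_cache_ttl(raw, default=DEFAULT_SCHEMA_CACHE_TTL_SECS):
    """Return a finite, positive TTL in seconds, or `default` if `raw` is unusable.

    Hypothetical helper; the real function name in indexing_admin.py may differ.
    """
    if raw is None:
        return default
    try:
        value = float(raw.strip())
    except (ValueError, AttributeError):
        # Not a number (e.g. "abc" or an empty string) or not a string at all.
        return default
    # float() accepts "nan" and "inf", so these need an explicit check.
    if not math.isfinite(value) or value <= 0:
        return default
    return value


# Read once at import time, as an environment-driven setting typically is.
SCHEMA_CACHE_TTL_SECS = parse_schema_cache_ttl(os.environ.get("SCHEMA_CACHE_TTL_SECS"))
```

The explicit `math.isfinite` check matters because `float("nan")` and `float("inf")` both parse without raising, so a plain `try`/`except` around `float()` would let them through.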
Removed fallback to hashlib and now require xxhash for hashing file contents in _read_text_and_sha1. This ensures consistency with other scripts and simplifies the code by eliminating error handling for missing xxhash.
Renamed _read_text_and_sha1 to _read_text_and_hash and updated its implementation and usage to compute xxhash64 instead of SHA1 for consistency with the pipeline. Updated related test to check for the correct hash length.
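A minimal sketch of what a `_read_text_and_hash`-style helper might look like after this rename is shown below. The signature, the UTF-8 decoding with `errors="ignore"`, and the return shape are assumptions, not the code in watch_index_core/processor.py. Only the use of xxhash64 in place of SHA1 and the absence of a hashlib fallback come from the commit messages. The import is placed inside the function here purely so the sketch can be loaded without xxhash installed; a module-level import would be the usual choice.

```python
from pathlib import Path


def read_text_and_hash(path):
    """Read a file and return (text, xxhash64 hex digest of its raw bytes).

    Hypothetical stand-in for _read_text_and_hash; no hashlib fallback,
    so xxhash must be installed.
    """
    import xxhash  # third-party dependency, required

    data = Path(path).read_bytes()
    # Assumption: decode leniently so undecodable bytes do not abort indexing.
    text = data.decode("utf-8", errors="ignore")
    # xxh64 produces a 64-bit digest, i.e. 16 hex characters (SHA1 gives 40).
    digest = xxhash.xxh64(data).hexdigest()
    return text, digest
```

The difference in digest length (16 hex characters for xxhash64 versus 40 for SHA1) is what a test checking "the correct hash length" would observe.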
Introduces optional LZ4 compression for JSON data stored in Redis, reducing memory usage with minimal CPU overhead. Updates requirements.txt to include lz4 as a dependency and modifies workspace_state.py to compress data before storing and decompress on retrieval, falling back gracefully if lz4 is unavailable.
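The optional compression described here can be sketched as a pair of encode/decode helpers. The names `encode_state` and `decode_state` and the `LZ4:` marker prefix are hypothetical and are not taken from workspace_state.py; the real code may distinguish compressed from plain values differently. The sketch uses the `lz4.frame` module from the lz4 package and omits the Redis calls themselves.

```python
import json

try:
    import lz4.frame as lz4_frame  # optional dependency
except ImportError:
    lz4_frame = None

# Hypothetical marker so the reader can tell compressed values from plain JSON.
# Valid JSON never starts with these bytes, so the check is unambiguous.
_MAGIC = b"LZ4:"


def encode_state(obj) -> bytes:
    """Serialize obj to JSON bytes, LZ4-compressing them when lz4 is available."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    if lz4_frame is None:
        return raw  # graceful fallback: store plain JSON
    return _MAGIC + lz4_frame.compress(raw)


def decode_state(blob: bytes):
    """Inverse of encode_state; also accepts plain, uncompressed JSON bytes."""
    if blob.startswith(_MAGIC):
        if lz4_frame is None:
            raise RuntimeError("value is LZ4-compressed but lz4 is not installed")
        blob = lz4_frame.decompress(blob[len(_MAGIC):])
    return json.loads(blob.decode("utf-8"))
```

Because `decode_state` still accepts plain JSON, values written before compression was enabled, or by a process without lz4, would remain readable under this scheme.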