feat: Standardize Query DSL and Enhance Adapter Architecture #2
Merged
- Remove wrapper types (UpsertRequest, SearchRequest, VectorStatus, UpsertInput)
- Engine methods now return ABC types directly (VectorDocument instead of dicts)
- Add helper methods: create_from_texts, upsert_from_texts, update_from_texts
- Remove types.py; replace DocumentIds with Union[str, Sequence[str]]
- Remove unused functions: normalize_documents, extract_unique_query
- Remove Document and normalize_documents from public API exports
- Add utils helpers: normalize_texts, normalize_metadatas, normalize_pks
- Enhanced search with offset and where filtering across all adapters
- Remove unique_fields parameter (only used by 1 of 4 adapters)
- Add collection management: add_collection, get_collection, get_or_create_collection

- Updated Quick Start examples to use create_from_texts() helper
- Added PRIMARY_KEY_MODE configuration docs
- Fixed test fixtures to return dict with texts/metadatas/pks
- Updated all test methods to use new API (no more wrapper types)
- Removed test_flexible_input.py (tests removed internal functions)
- Added missing ABC methods to MockDBAdapter (create, get_or_create, update, update_or_create)
- Fixed normalize_pks() to pad list with None values
- Fixed VectorDocument construction to use 'id' parameter instead of 'pk'

All engine tests now pass with the simplified API.

- Rename VectorDocument class (backward compat alias maintained)
- Remove SearchRequest/UpsertRequest wrappers; use direct method calls
- Add private _vector attribute with emb property
- Move generate_pk and helpers from schema to utils
- Reorganize utils.py into logical sections
- Update all docs to reflect new API and PK generation modes
- Fix integration tests to use new engine methods
- Delete obsolete test_schema.py

- Add Logger class with configurable LOG_LEVEL setting
- Replace all module loggers with Logger class
- Use specific exceptions from exceptions.py instead of generic ValueError/Exception
- Replace os.getenv with direct api_settings access
- Update all adapters (astradb, chroma, milvus, pgvector) with consistent patterns

- Add Query DSL with Q class supporting 8 universal operators
- Implement backend-specific compilers for all databases
- Enhance PgVector with nested JSONB and numeric casting
- Add capability flags to adapters
- Improve get_or_create/update_or_create with multi-step lookup
- Standardize configuration settings
- Add comprehensive Query DSL test suite
- Remove deprecated test scripts

- Create tests/searches/ directory for backend integration tests
- Move test_search_*.py to tests/searches/test_*.py for clarity
- Add comprehensive README.md documenting test structure and requirements
- Add __init__.py with package documentation
- Tests now organized by functionality (searches) rather than mixed with unit tests

- Add scripts/tests/ with real backend integration tests
- Add tests/mock/ with in-memory adapter for DSL testing
- Fix Milvus operator mapping (IN/NOT IN uppercase)
- Document opt-in test strategy in README.md
- Remove deprecated tests/searches/ directory

Version 0.1.3 (2025-11-30):
- Test infrastructure reorganization (scripts/tests/ + tests/mock/)
- Query DSL improvements (Milvus operator fix)
- CI/CD updates (unit tests only in GitHub Actions)
- Documentation enhancements (README integration test guide)
- Bug fixes (fixture imports, unused variables)

Version 0.1.2 (2025-11-23):
- Refactor design with architecture improvements
- Enhanced Query DSL design patterns
- Improved adapter interface consistency

Bump version: 0.1.0 -> 0.1.3

Resolved conflict in pyproject.toml:
- Keep version 0.1.3 from standard-dbs branch
- Main branch had version 0.1.1

- Renamed tests/mock/test_common_mock.py → tests/test_querydsl_operators.py
- Moved InMemoryAdapter and fixtures to tests/conftest.py for global availability
- Removed tests/mock/ directory (no longer needed)
- Fixed fixture import issues that were causing pytest to hang
- All 77 unit tests passing (33% coverage)

Standardize Query DSL and Enhance Adapter Architecture
Summary
This PR introduces a comprehensive Query DSL system with universal operator support across all vector database backends, enhances adapter capabilities, and improves the engine's document retrieval logic.
Key Changes
🎯 Query DSL System
- 8 universal operators ($eq, $ne, $gt, $gte, $lt, $lte, $in, $nin)
- field__subfield__operator syntax for nested metadata queries

🔧 Adapter Enhancements
PgVector
- Nested JSONB access with the #>> operator for dot-notation queries
- Numeric casting with ::numeric for type-safe comparisons

All Adapters
- Capability flags: supports_metadata_only, REQUIRES_VECTOR, SUPPORTS_NESTED

🚀 Engine Improvements
⚙️ Configuration
- Standardized configuration settings (CHROMA_HOST instead of CHROMA_HTTP_HOST)
- VECTOR_DIM setting for default embedding dimension

🧪 Testing
- Comprehensive Query DSL test suite (test_querydsl.py, test_compilers.py, test_adapters_where.py)

📚 Documentation
- docs/querydsl.md - Complete Query DSL guide
- docs/architecture.md - System design and patterns
- docs/adapters/databases.md - Backend-specific capabilities

🗑️ Cleanup
- scripts/tests/ directory (old integration test scripts)
- querydsl/filters/ system

Breaking Changes
Configuration Keys
- CHROMA_HTTP_HOST → CHROMA_HOST
- CHROMA_HTTP_PORT → CHROMA_PORT
- CHROMA_CLOUD_TENANT → CHROMA_TENANT
- CHROMA_CLOUD_DATABASE → CHROMA_DATABASE
- MILVUS_USER + MILVUS_PASSWORD → MILVUS_API_KEY

API Changes
- where parameter now accepts Q objects or universal dict format
- score in metadata when available
- ChromaDBAdapter → ChromaAdapter, MilvusDBAdapter → MilvusAdapter, PGVectorAdapter → PgVectorAdapter

Migration Guide
Updating Configuration
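As a rough illustration of the key renames listed under Breaking Changes, the sketch below rewrites a dict of old-style settings. The helper name `migrate_env` is hypothetical and not part of this PR; since the PR does not say how `MILVUS_USER` and `MILVUS_PASSWORD` map onto `MILVUS_API_KEY`, the sketch only emits a warning for those keys instead of guessing a conversion.

```python
# Old -> new configuration keys, taken from the Breaking Changes list.
RENAMED_KEYS = {
    "CHROMA_HTTP_HOST": "CHROMA_HOST",
    "CHROMA_HTTP_PORT": "CHROMA_PORT",
    "CHROMA_CLOUD_TENANT": "CHROMA_TENANT",
    "CHROMA_CLOUD_DATABASE": "CHROMA_DATABASE",
}
# Keys without a one-to-one replacement; MILVUS_API_KEY has to be set by hand.
REPLACED_BY_API_KEY = ("MILVUS_USER", "MILVUS_PASSWORD")


def migrate_env(env):
    """Hypothetical helper: return (new_env, warnings) for old-style settings."""
    new_env = {}
    warnings = []
    for key, value in env.items():
        if key in RENAMED_KEYS:
            new_env[RENAMED_KEYS[key]] = value
        elif key in REPLACED_BY_API_KEY:
            warnings.append(f"{key} is replaced by MILVUS_API_KEY; set it manually")
        else:
            new_env[key] = value  # unrelated settings pass through unchanged
    return new_env, warnings


old = {"CHROMA_HTTP_HOST": "localhost", "CHROMA_HTTP_PORT": "8000", "MILVUS_USER": "root"}
new_env, warnings = migrate_env(old)
print(new_env)
print(warnings)
```

The same mapping can be applied by hand to a `.env` file; the helper only makes the renames explicit.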
Using Query DSL
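To make the `field__subfield__operator` syntax concrete, here is a minimal toy sketch of how keyword lookups could be turned into a dict keyed by the eight `$` operators. The `Q` class below is a stand-in written for illustration, not the class added in this PR, and the dot-notation dict layout, the `to_dict` method, and the `matches` helper are all assumptions.

```python
import operator

# The eight universal operators named in the PR, mapped to Python comparisons.
_OPS = {
    "$eq": operator.eq, "$ne": operator.ne,
    "$gt": operator.gt, "$gte": operator.ge,
    "$lt": operator.lt, "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}


class Q:
    """Toy stand-in for the PR's Q class; the real implementation may differ."""

    def __init__(self, **lookups):
        self.filters = {}
        for lookup, value in lookups.items():
            parts = lookup.split("__")
            # A trailing part that names an operator selects it; default is eq.
            if len(parts) > 1 and "$" + parts[-1] in _OPS:
                op = "$" + parts.pop()
            else:
                op = "$eq"
            field = ".".join(parts)  # nested metadata path in dot notation
            self.filters.setdefault(field, {})[op] = value

    def to_dict(self):
        return self.filters


def matches(metadata, where):
    """Evaluate an assumed universal dict filter against one metadata dict."""
    for path, conditions in where.items():
        value = metadata
        for key in path.split("."):
            if not isinstance(value, dict) or key not in value:
                return False  # design choice here: a missing field never matches
            value = value[key]
        if not all(_OPS[op](value, expected) for op, expected in conditions.items()):
            return False
    return True


q = Q(category="tech", info__lang__in=["en", "fr"], year__gte=2024)
print(q.to_dict())
# {'category': {'$eq': 'tech'}, 'info.lang': {'$in': ['en', 'fr']}, 'year': {'$gte': 2024}}
print(matches({"category": "tech", "info": {"lang": "en"}, "year": 2025}, q.to_dict()))
# True
```

In the PR itself the resulting filter is passed as the `where` parameter and compiled per backend; the in-memory `matches` function only illustrates the intended operator semantics.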
Adapter Imports
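Because the adapter classes were renamed, existing imports need the new names. The sketch below is one possible way to rewrite old names in a source string; the helper `rename_adapters` and the module path `mypackage.adapters` are placeholders for illustration, since the PR text does not show the actual import path.

```python
import re

# Old -> new adapter class names, taken from the API Changes list.
ADAPTER_RENAMES = {
    "ChromaDBAdapter": "ChromaAdapter",
    "MilvusDBAdapter": "MilvusAdapter",
    "PGVectorAdapter": "PgVectorAdapter",
}

# Match whole identifiers only, so longer names containing an old name are left alone.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, ADAPTER_RENAMES)) + r")\b")


def rename_adapters(source):
    """Hypothetical helper: replace old adapter class names in a source string."""
    return _PATTERN.sub(lambda m: ADAPTER_RENAMES[m.group(1)], source)


# "mypackage.adapters" is a placeholder module path, not the project's real one.
line = "from mypackage.adapters import ChromaDBAdapter, PGVectorAdapter"
print(rename_adapters(line))
# from mypackage.adapters import ChromaAdapter, PgVectorAdapter
```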
Testing
All tests passing:
pytest tests/ # 62 passed

Backend Compatibility Matrix
Documentation
Complete documentation available:
Related Issues
Closes #[issue-number] (if applicable)
Checklist