Conversation
…gies

Add a StrategyPreference enum that controls how the engine searches the
document tree. Users can choose among the AUTO, KEYWORD, LLM, HYBRID,
CROSS_DOCUMENT, and PAGE_RANGE retrieval strategies.

StrategyPreference can be set on QueryContext to force a specific
retrieval behavior:

- KEYWORD: fastest option; makes no LLM calls during search
- LLM: most accurate; uses deep reasoning
- HYBRID: BM25 followed by LLM refinement
- CROSS_DOCUMENT: multi-document retrieval
- PAGE_RANGE: filter by page range
- AUTO: default; selects a strategy based on query complexity
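
A minimal sketch of how a strategy preference on a query context might be resolved. The project's actual types are not shown in this conversation, so everything here is an assumption for illustration: the Python rendering of StrategyPreference, the fields of QueryContext, the resolve_strategy helper, and the rule AUTO applies (LLM for complex queries, KEYWORD otherwise).

```python
from dataclasses import dataclass
from enum import Enum


class StrategyPreference(Enum):
    """Member names follow the commit message; the string values are assumed."""
    AUTO = "auto"
    KEYWORD = "keyword"
    LLM = "llm"
    HYBRID = "hybrid"
    CROSS_DOCUMENT = "cross_document"
    PAGE_RANGE = "page_range"


@dataclass
class QueryContext:
    """Hypothetical stand-in for the engine's query context."""
    query: str
    strategy: StrategyPreference = StrategyPreference.AUTO


def resolve_strategy(ctx: QueryContext, is_complex: bool) -> StrategyPreference:
    """Return the strategy to run (hypothetical helper).

    An explicit preference always wins. AUTO picks by query complexity;
    the concrete mapping below is an assumption, not the engine's rule.
    """
    if ctx.strategy is not StrategyPreference.AUTO:
        return ctx.strategy
    return StrategyPreference.LLM if is_complex else StrategyPreference.KEYWORD
```

With this sketch, QueryContext("total revenue", StrategyPreference.KEYWORD) resolves to KEYWORD regardless of complexity, while a context left at the default follows the AUTO branch.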
…/MCTS

Add a Pure Pilot search algorithm that uses LLM guidance to pick the
best child at each layer. Integrate Pilot scoring into beam search and
MCTS, with a caching mechanism to avoid redundant LLM calls.

- Add PurePilotSearch algorithm with 1.0 weight for Pilot scoring
- Rename GreedySearch to PurePilotSearch and update implementation
- Modify beam search to use Pilot as primary scorer with 0.7 weight
- Enhance MCTS with Pilot-provided priors in UCT formula and guided
  simulation phase
- Add PilotDecisionCache to prevent repeated LLM calls for same contexts
- Update SearchAlgorithm enum with PurePilot variant and rename others
- Add search_fallback_chain to PipelineContext for ordered algorithm
  execution

BREAKING CHANGE: GreedySearch renamed to PurePilotSearch
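
The caching and weighting described above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the cache key (query plus node id), the pilot callable's signature, the beam_score helper, and the use of the remaining 0.3 weight for a non-LLM heuristic score are all hypothetical. Only the 0.7 Pilot weight comes from the commit message.

```python
from typing import Callable, Dict, Tuple

PILOT_WEIGHT = 0.7  # beam-search weight for Pilot, per the commit message


class PilotDecisionCache:
    """Memoizes Pilot scores so the same context is sent to the LLM once.

    The (query, node_id) key is an assumed definition of "same context".
    """

    def __init__(self, pilot: Callable[[str, str], float]) -> None:
        self._pilot = pilot
        self._scores: Dict[Tuple[str, str], float] = {}
        self.llm_calls = 0  # counts cache misses, i.e. real Pilot calls

    def score(self, query: str, node_id: str) -> float:
        key = (query, node_id)
        if key not in self._scores:
            self.llm_calls += 1
            self._scores[key] = self._pilot(query, node_id)
        return self._scores[key]


def beam_score(cache: PilotDecisionCache, query: str, node_id: str,
               heuristic_score: float) -> float:
    """Blend the cached Pilot score with a heuristic score.

    Giving the remaining weight (1 - 0.7) to the heuristic is an assumption.
    """
    pilot_score = cache.score(query, node_id)
    return PILOT_WEIGHT * pilot_score + (1.0 - PILOT_WEIGHT) * heuristic_score
```

Scoring the same node twice for the same query triggers a single Pilot call; the second lookup is served from the dictionary.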
BREAKING CHANGE: The config.example.toml file has been removed as it's
no longer needed.
feat(llm): rename summary client to index client for clarity
The summary client has been renamed to index client to better reflect
its purpose during document indexing operations. The old 'summary'
configuration still works as an alias for backward compatibility.
feat(retrieval): add search fallback chain configuration
Add configurable fallback chain for search algorithms that tries
different algorithms ("beam", "mcts", "pure_pilot") in order when
minimum score thresholds aren't met.
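
A rough sketch of how such a fallback chain could behave. The algorithm names come from the commit message; the run_fallback_chain helper, the search-function signature, and the choice to return the best-scoring attempt when no algorithm reaches the threshold are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical signature: a search returns (matched node ids, best score).
SearchFn = Callable[[str], Tuple[List[str], float]]


def run_fallback_chain(
    query: str,
    chain: List[str],
    algorithms: Dict[str, SearchFn],
    min_score: float,
) -> Optional[Tuple[str, List[str], float]]:
    """Try each algorithm in order; stop at the first that meets min_score.

    If none meets the threshold, the highest-scoring attempt is returned
    (an assumed policy). An empty chain yields None.
    """
    best: Optional[Tuple[str, List[str], float]] = None
    for name in chain:
        nodes, score = algorithms[name](query)
        if best is None or score > best[2]:
            best = (name, nodes, score)
        if score >= min_score:
            break
    return best
```

For example, with chain ["beam", "mcts", "pure_pilot"] and a threshold of 0.5, a beam result scoring 0.2 is passed over and the next algorithm in the list is tried.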
…iles

- Remove unnecessary blank lines and trailing spaces
- Consolidate multi-line variable declarations into single lines where
  appropriate
- Reorder imports to follow standard conventions

refactor(engine): improve code readability in engine implementation

- Format long method chains with proper indentation
- Break down complex expressions into readable blocks
- Clean up error message formatting

refactor(indexer): enhance code formatting in indexing components

- Standardize multi-line function calls and method chaining
- Improve readability of complex operations
- Consolidate redundant blank lines

refactor(retriever): clean up retriever and related modules

- Format long expressions and method calls consistently
- Remove unused imports and declarations
- Improve code organization in TOC processing modules

refactor(llm): streamline LLM executor and pool implementations

- Clean up error messages and string formatting
- Improve readability of conditional statements
- Standardize async method calls

refactor(search): restructure search algorithm implementations

- Format complex calculations and expressions clearly
- Remove unused imports and exports
- Clean up test cases and remove obsolete tests
- Create _graph.bin with document structure containing sample data
- Add nodes with document ID, title, format and top keywords
- Include keyword index mapping for efficient search
- Initialize empty meta.bin for future metadata storage
…tic fallback

Add a query complexity detection system that uses LLM classification
when available and falls back to heuristic rules otherwise. Supports
both English and Chinese queries, with improved word counting for CJK
characters.

The complexity detector now accepts an optional LLM client for accurate
classification while maintaining backward compatibility with rule-based
detection.

- Add LLM-based complexity detection using pilot's LLM client
- Implement heuristic fallback with enhanced keyword matching
- Support Chinese language complexity indicators
- Add proper CJK character word counting estimation
- Update analyze stage to use LLM-enhanced complexity detection
- Create new pilot complexity module with JSON response parsing
- Include comprehensive test coverage for both approaches
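
The LLM-first, heuristic-fallback flow can be sketched roughly as below. Every specific is an assumption for illustration rather than the project's code: the marker words, the word-count threshold, the two-characters-per-word CJK estimate, the {"complexity": ...} JSON shape, and the llm callable's signature.

```python
import json
from typing import Callable, Optional

LEVELS = ("simple", "medium", "complex")
# Assumed English and Chinese indicators of a complex query.
COMPLEX_MARKERS = ("compare", "why", "explain", "比较", "为什么")


def is_cjk(ch: str) -> bool:
    """True for characters in the CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def estimate_word_count(text: str) -> int:
    """Whitespace-split words plus an estimate for CJK runs.

    Treating two CJK characters as one word (rounded up) is an assumption.
    """
    cjk_chars = sum(1 for ch in text if is_cjk(ch))
    non_cjk = "".join(" " if is_cjk(ch) else ch for ch in text)
    return len(non_cjk.split()) + (cjk_chars + 1) // 2


def heuristic_complexity(query: str) -> str:
    """Rule-based fallback: marker words first, then query length."""
    lowered = query.lower()
    if any(marker in lowered for marker in COMPLEX_MARKERS):
        return "complex"
    return "medium" if estimate_word_count(query) > 8 else "simple"


def detect_complexity(query: str,
                      llm: Optional[Callable[[str], str]] = None) -> str:
    """Use the LLM's JSON answer when it parses; otherwise use the rules."""
    if llm is not None:
        try:
            label = json.loads(llm(query)).get("complexity")
            if label in LEVELS:
                return label
        except (ValueError, AttributeError, TypeError):
            pass  # malformed or unexpected reply: fall through to heuristics
    return heuristic_complexity(query)
```

Calling detect_complexity without an llm argument exercises only the rule-based path, which mirrors the backward-compatible behavior the commit describes.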
Removed several test functions from the ComplexityDetector that were no
longer needed:

- test_medium_queries
- test_estimate_word_count
- test_no_llm_is_ok

These tests covered query complexity detection and word counting
utilities that are no longer part of the current implementation.

refactor(data): clean up example workspace files

Removed the binary files _graph.bin and meta.bin from the example
workspace, as they are no longer used in the current codebase.
No description provided.