
Dev #61

Merged
zTgx merged 7 commits into main from dev
Apr 13, 2026

Conversation

@zTgx (Contributor)

@zTgx zTgx commented Apr 13, 2026

No description provided.

zTgx added 7 commits April 13, 2026 20:56
…gies

Add a StrategyPreference enum to control how the engine searches the
document tree, letting users choose among several retrieval
strategies: AUTO, KEYWORD, LLM, HYBRID, CROSS_DOCUMENT, and
PAGE_RANGE.

The new StrategyPreference can be used with QueryContext to
force specific retrieval behaviors:

- KEYWORD: Fastest option with no LLM calls during search
- LLM: Most accurate with deep reasoning capabilities
- HYBRID: BM25 + LLM refinement approach
- CROSS_DOCUMENT: Multi-document retrieval
- PAGE_RANGE: Filter by page range
- AUTO: Default behavior that auto-selects based on query complexity
…/MCTS

Add Pure Pilot search algorithm that uses LLM guidance to pick the best
child at each layer. Integrate Pilot scoring into beam search and MCTS
with caching mechanism to avoid redundant LLM calls.

- Add PurePilotSearch algorithm with 1.0 weight for Pilot scoring
- Rename GreedySearch to PurePilotSearch and update implementation
- Modify beam search to use Pilot as primary scorer with 0.7 weight
- Enhance MCTS with Pilot-provided priors in UCT formula and guided
  simulation phase
- Add PilotDecisionCache to prevent repeated LLM calls for same contexts
- Update SearchAlgorithm enum with PurePilot variant and rename others
- Add search_fallback_chain to PipelineContext for ordered algorithm
  execution

BREAKING CHANGE: GreedySearch renamed to PurePilotSearch
BREAKING CHANGE: The config.example.toml file has been removed as it's
no longer needed.
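Two of the mechanisms above, the 0.7-weight Pilot scoring in beam search and the decision cache that avoids repeated LLM calls, can be illustrated with a minimal sketch. The class and function names mirror the commit message, but the bodies are assumptions; the real `PilotDecisionCache` key and scoring interface may differ.

```python
PILOT_WEIGHT = 0.7  # per the commit: Pilot is the primary beam-search scorer

class PilotDecisionCache:
    """Toy cache keyed on (query, node_id) so the same context is only
    sent to the LLM once per search."""
    def __init__(self):
        self._scores: dict[tuple[str, str], float] = {}

    def get_or_score(self, query: str, node_id: str, score_fn) -> float:
        key = (query, node_id)
        if key not in self._scores:
            self._scores[key] = score_fn(query, node_id)  # one LLM call
        return self._scores[key]

def blended_score(pilot_score: float, lexical_score: float) -> float:
    """Beam-search ranking: Pilot dominates, lexical score breaks ties."""
    return PILOT_WEIGHT * pilot_score + (1 - PILOT_WEIGHT) * lexical_score
```

Caching by `(query, node_id)` assumes Pilot's judgment for a node is stable within one query, which is what makes skipping repeat calls safe.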

feat(llm): rename summary client to index client for clarity

The summary client has been renamed to index client to better reflect
its purpose during document indexing operations. The old 'summary'
configuration still works as an alias for backward compatibility.

feat(retrieval): add search fallback chain configuration

Add configurable fallback chain for search algorithms that tries
different algorithms ("beam", "mcts", "pure_pilot") in order when
minimum score thresholds aren't met.
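The fallback behavior amounts to an ordered loop with an early exit, sketched below under the assumption that each algorithm returns its results together with a best score. The function signature is illustrative, not the crate's actual `search_fallback_chain` API.

```python
def search_with_fallback(query, chain, min_score=0.5):
    """Run each search algorithm in `chain` (e.g. beam, mcts, pure_pilot)
    in order, returning the first result set whose best score meets
    min_score; if none do, return the last attempt's results."""
    last_results = []
    for run_algorithm in chain:
        results, best_score = run_algorithm(query)
        last_results = results
        if best_score >= min_score:  # threshold met: stop falling back
            return results
    return last_results  # no algorithm met the threshold
```

Returning the last attempt rather than nothing is one reasonable choice; the real implementation may handle total failure differently.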
…iles

- Remove unnecessary blank lines and trailing spaces
- Consolidate multi-line variable declarations into single lines where appropriate
- Reorder imports to follow standard conventions

refactor(engine): improve code readability in engine implementation

- Format long method chains with proper indentation
- Break down complex expressions into readable blocks
- Clean up error message formatting

refactor(indexer): enhance code formatting in indexing components

- Standardize multi-line function calls and method chaining
- Improve readability of complex operations
- Consolidate redundant blank lines

refactor(retriever): clean up retriever and related modules

- Format long expressions and method calls consistently
- Remove unused imports and declarations
- Improve code organization in TOC processing modules

refactor(llm): streamline LLM executor and pool implementations

- Clean up error messages and string formatting
- Improve readability of conditional statements
- Standardize async method calls

refactor(search): restructure search algorithm implementations

- Format complex calculations and expressions clearly
- Remove unused imports and exports
- Clean up test cases and remove obsolete tests

- Create _graph.bin with document structure containing sample data
- Add nodes with document ID, title, format and top keywords
- Include keyword index mapping for efficient search
- Initialize empty meta.bin for future metadata storage
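The node shape and keyword index described above can be sketched as follows. The field names and the inverted-index layout are assumptions; the actual binary layout of `_graph.bin` is not specified here.

```python
from dataclasses import dataclass

@dataclass
class Node:
    """Assumed node shape: document ID, title, format, top keywords."""
    doc_id: str
    title: str
    fmt: str
    top_keywords: list[str]

def build_keyword_index(nodes: list[Node]) -> dict[str, list[str]]:
    """Invert top_keywords into keyword -> document IDs so a keyword
    lookup during search is a single dict access."""
    index: dict[str, list[str]] = {}
    for node in nodes:
        for kw in node.top_keywords:
            index.setdefault(kw, []).append(node.doc_id)
    return index
```

Serializing such structures to a `.bin` file could be done with any binary format; the commit does not say which one the project uses.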
…tic fallback

Add comprehensive query complexity detection system that uses LLM
classification when available, falling back to heuristic rules.
Supports both English and Chinese queries with improved word counting
for CJK characters. The complexity detector now accepts an optional
LLM client for accurate classification while maintaining backward
compatibility with rule-based detection.

- Add LLM-based complexity detection using pilot's LLM client
- Implement heuristic fallback with enhanced keyword matching
- Support Chinese language complexity indicators
- Add proper CJK character word counting estimation
- Update analyze stage to use LLM-enhanced complexity detection
- Create new pilot complexity module with JSON response parsing
- Include comprehensive test coverage for both approaches
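The two fallback layers above, heuristic rules with CJK-aware word counting and an optional LLM client, can be sketched like this. The `llm_client.classify` interface, the marker words, and the 12-word threshold are all assumptions for illustration.

```python
def estimate_word_count(text: str) -> int:
    """CJK-aware word count: each CJK ideograph counts as one word,
    since Chinese text has no whitespace word boundaries; Latin-script
    tokens are counted by whitespace splitting as usual."""
    cjk_chars = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    latin_words = sum(1 for tok in text.split()
                      if any(c.isalpha() and c.isascii() for c in tok))
    return cjk_chars + latin_words

def detect_complexity(query: str, llm_client=None) -> str:
    """Use LLM classification when a client is supplied (hypothetical
    `classify` interface); otherwise fall back to heuristic rules."""
    if llm_client is not None:
        try:
            return llm_client.classify(query)
        except Exception:
            pass  # LLM unavailable: fall through to the heuristics
    markers = ("compare", "why", "analyze", "为什么", "比较")  # illustrative
    if estimate_word_count(query) > 12 or any(m in query.lower() for m in markers):
        return "complex"
    return "simple"
```

Accepting `llm_client=None` preserves backward compatibility with pure rule-based detection, as the commit message describes.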
Removed several test functions from the ComplexityDetector that were
no longer needed:
- test_medium_queries
- test_estimate_word_count
- test_no_llm_is_ok

These tests were related to query complexity detection functionality
and word counting utilities that are no longer part of the current
implementation.

refactor(data): clean up example workspace files

Removed binary files _graph.bin and meta.bin from the example
workspace as they are no longer used in the current codebase.
@vercel

vercel bot commented Apr 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project     Deployment  Actions           Updated (UTC)
vectorless  Ready       Preview, Comment  Apr 13, 2026 3:59pm

@zTgx zTgx merged commit 730ba92 into main Apr 13, 2026
2 checks passed