Dev #67

Merged
zTgx merged 5 commits into main from dev
Apr 14, 2026
zTgx (Contributor) commented Apr 14, 2026

No description provided.

zTgx added 5 commits April 14, 2026 17:16
Change the default SummaryStrategy from Selective with min_tokens: 100
and branch_only: true to Full with default configuration.

Add an enable_synonym_expansion configuration option to
ReasoningIndexConfig. When enabled, indexing calls the LLM to generate
synonym terms for keywords, which improves recall for queries worded
differently from the original documents.

The feature:
- Generates up to 5 synonyms per keyword using LLM completion
- Applies reduced weight (0.6x) to synonym matches to reflect indirectness
- Limits expansion to top-ranked keywords based on entry count
- Includes error handling and logging for LLM failures
- Adds synonym count to indexing metrics and metadata
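
The behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Entry` type, the `expand_synonyms` function, the `top_n` parameter, and the closure standing in for the LLM call are all hypothetical. Only the limit of 5 synonyms and the 0.6x weight come from the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical index entry: a node ID plus a match weight.
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub node_id: usize,
    pub weight: f32,
}

/// Values taken from the commit message.
pub const MAX_SYNONYMS: usize = 5;
pub const SYNONYM_WEIGHT: f32 = 0.6;

/// Expands the `top_n` keywords that have the most entries.
/// `generate` stands in for the LLM completion call (an assumption);
/// a failed call is logged and skipped rather than aborting indexing.
/// Returns the number of synonym terms added to the index.
pub fn expand_synonyms<F>(
    index: &mut HashMap<String, Vec<Entry>>,
    top_n: usize,
    mut generate: F,
) -> usize
where
    F: FnMut(&str) -> Result<Vec<String>, String>,
{
    // Rank keywords by entry count, descending; ties are broken
    // alphabetically so the result is deterministic.
    let mut ranked: Vec<(String, usize)> =
        index.iter().map(|(k, v)| (k.clone(), v.len())).collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));

    let mut added = 0;
    for (keyword, _) in ranked.into_iter().take(top_n) {
        let synonyms = match generate(&keyword) {
            Ok(list) => list,
            Err(err) => {
                eprintln!("synonym generation failed for {keyword:?}: {err}");
                continue;
            }
        };
        let source = index[&keyword].clone();
        for synonym in synonyms.into_iter().take(MAX_SYNONYMS) {
            // Never overwrite a term that is already indexed.
            if index.contains_key(&synonym) {
                continue;
            }
            // Synonym matches point at the same nodes with reduced weight.
            let scaled: Vec<Entry> = source
                .iter()
                .map(|e| Entry { node_id: e.node_id, weight: e.weight * SYNONYM_WEIGHT })
                .collect();
            index.insert(synonym, scaled);
            added += 1;
        }
    }
    added
}
```

Passing the generator as a closure keeps the sketch testable without a live LLM; the returned count corresponds to the synonym count that the commit adds to indexing metrics.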

Also update max_tokens_per_node from 8000 to 4000 in SplitConfig default.

feat(search): include cross-references in graph traversal

Replace tree.children() with new tree.children_with_refs() method that
includes resolved cross-reference targets in search traversal. This allows
search algorithms (Beam, Greedy, MCTS) to follow document cross-references
during exploration, improving navigation through linked content.

The children_with_refs method:
- Combines direct children with referenced target nodes
- Deduplicates node IDs to prevent duplicate visits
- Maintains backward compatibility for existing functionality
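
As an illustration of the merge-and-deduplicate behaviour described above, here is a small sketch. The `DocumentTree` below is a toy stand-in backed by hash maps, not the project's real type; the `add_child` helper, the field layout, and all signatures are assumptions. Only the method names `children`, `children_with_refs`, and `set_references` appear in the commit messages.

```rust
use std::collections::{HashMap, HashSet};

pub type NodeId = usize;

/// Toy stand-in for the document tree (layout is an assumption).
#[derive(Default)]
pub struct DocumentTree {
    children: HashMap<NodeId, Vec<NodeId>>,
    references: HashMap<NodeId, Vec<NodeId>>,
}

impl DocumentTree {
    /// Hypothetical helper for building the toy tree.
    pub fn add_child(&mut self, parent: NodeId, child: NodeId) {
        self.children.entry(parent).or_default().push(child);
    }

    /// Records the resolved cross-reference targets of `node`.
    pub fn set_references(&mut self, node: NodeId, targets: Vec<NodeId>) {
        self.references.insert(node, targets);
    }

    /// Direct children only; unchanged, so existing callers keep working.
    pub fn children(&self, node: NodeId) -> Vec<NodeId> {
        self.children.get(&node).cloned().unwrap_or_default()
    }

    /// Direct children followed by referenced targets, each ID at most once.
    pub fn children_with_refs(&self, node: NodeId) -> Vec<NodeId> {
        let mut seen = HashSet::new();
        let mut out = Vec::new();
        let direct = self.children.get(&node).into_iter().flatten();
        let refs = self.references.get(&node).into_iter().flatten();
        for &id in direct.chain(refs) {
            // `insert` returns false when the ID was already seen.
            if seen.insert(id) {
                out.push(id);
            }
        }
        out
    }
}
```

Listing direct children first means a search that previously called `children()` sees the same nodes in the same order, with reference targets appended afterwards.
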

- Add enable_synonym_expansion option to IndexOptions for expanding
  keywords with LLM-generated synonyms during indexing to improve
  recall for differently-worded queries
- Implement cross-reference resolution in enrich stage to extract
  and resolve in-document references like "see Section 2.1" or
  "Appendix G" to actual node IDs in the document tree
- Add DocumentTree::set_references method for managing node
  references and children_with_refs for including referenced nodes
- Include resolved reference count in indexing metrics
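
To make the cross-reference step more concrete, the sketch below extracts phrases such as "see Section 2.1" or "Appendix G" and maps them to node IDs. It is an assumption-laden illustration: the function names, the word-pair scanning approach, and the label table are hypothetical, and the enrich stage in this PR may work quite differently.

```rust
use std::collections::HashMap;

pub type NodeId = usize;

/// Returns normalized labels such as "section 2.1" or "appendix g".
/// Scans adjacent word pairs instead of using a regex, so the sketch
/// needs only the standard library.
pub fn extract_references(text: &str) -> Vec<String> {
    let words: Vec<&str> = text.split_whitespace().collect();
    let mut found = Vec::new();
    for pair in words.windows(2) {
        // Strip surrounding punctuation such as quotes, commas, or periods.
        let kind = pair[0]
            .trim_matches(|c: char| !c.is_alphanumeric())
            .to_lowercase();
        let label = pair[1].trim_matches(|c: char| !c.is_alphanumeric());
        let is_match = match kind.as_str() {
            // "Section" followed by a dotted number, e.g. 2.1
            "section" => {
                label.starts_with(|c: char| c.is_ascii_digit())
                    && label.chars().all(|c| c.is_ascii_digit() || c == '.')
            }
            // "Appendix" followed by a single capital letter, e.g. G
            "appendix" => label.len() == 1 && label.chars().all(|c| c.is_ascii_uppercase()),
            _ => false,
        };
        if is_match {
            found.push(format!("{kind} {}", label.to_lowercase()));
        }
    }
    found
}

/// Resolves extracted labels against a label-to-node table (hypothetical).
/// Unknown labels are dropped; each node ID is returned at most once.
pub fn resolve_references(text: &str, labels: &HashMap<String, NodeId>) -> Vec<NodeId> {
    let mut resolved = Vec::new();
    for label in extract_references(text) {
        if let Some(&id) = labels.get(&label) {
            if !resolved.contains(&id) {
                resolved.push(id);
            }
        }
    }
    resolved
}
```

The length of the list returned by `resolve_references` is the kind of figure that could feed the resolved reference count mentioned in the indexing metrics.
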
…ting

- Add CrossDocumentStrategy and CrossDocumentConfig to support
  cross-document retrieval with graph-based boosting capabilities
- Replace tree.children() with tree.children_with_refs() in
  ToCNavigator to include reference nodes in tree traversal
- Implement ForceCrossDocument strategy preference handling in
  SearchStage with proper graph attachment when available
- Export new cross-document types in strategy module
… features

- Add architecture documentation explaining the end-to-end pipeline with
  detailed indexing stages (Parse, Build, Validate, Split, Enhance, Enrich,
  Reasoning Index, Optimize) and retrieval phases (Analyze, Plan, Search,
  Evaluate)

- Add examples documentation covering quick query, multi-document
  retrieval, batch indexing, and cross-document graph usage in both
  Python and Rust

- Add feature documentation for cross-document graph with relationship
  building and score boosting, PDF support with page-level tracking,
  summary strategies (Full, Selective, Lazy), and synonym expansion
  with LLM-generated alternatives

- Add indexing documentation covering configuration options,
  incremental indexing with content fingerprinting, and pipeline
  overview with detailed stage descriptions

- Add retrieval documentation explaining cross-reference navigation
  for resolving internal document references and overview of search
  algorithms and strategies

- Update introduction to reflect enhanced capabilities including
  cross-reference navigation, synonym expansion, multi-algorithm
  search, and cross-document graph features

vercel bot commented Apr 14, 2026

The latest updates on your projects.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| vectorless | Ready | Preview, Comment | Apr 14, 2026 11:13pm |

@zTgx zTgx merged commit 727b849 into main Apr 14, 2026
2 checks passed