Change the default SummaryStrategy from Selective with min_tokens: 100 and branch_only: true to Full with default configuration.
Add enable_synonym_expansion configuration option to ReasoningIndexConfig that, when enabled, calls the LLM during indexing to generate synonym terms for keywords. This improves recall for queries using different wording than the original documents.

The feature:
- Generates up to 5 synonyms per keyword using LLM completion
- Applies reduced weight (0.6x) to synonym matches to reflect indirectness
- Limits expansion to top-ranked keywords based on entry count
- Includes error handling and logging for LLM failures
- Adds synonym count to indexing metrics and metadata

Also update max_tokens_per_node from 8000 to 4000 in SplitConfig default.

feat(search): include cross-references in graph traversal

Replace tree.children() with new tree.children_with_refs() method that includes resolved cross-reference targets in search traversal. This allows search algorithms (Beam, Greedy, MCTS) to follow document cross-references during exploration, improving navigation through linked content.

The children_with_refs method:
- Combines direct children with referenced target nodes
- Deduplicates node IDs to prevent duplicate visits
- Maintains backward compatibility for existing functionality
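The synonym expansion rules listed in this commit (at most 5 synonyms per keyword, a 0.6x weight on synonym matches, expansion limited to the keywords with the most entries, and tolerance of LLM failures) can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `KeywordIndex`, `expand_synonyms`, and the `generate` closure standing in for the LLM call are all hypothetical names, and skipping synonyms that already exist as keywords is a simplifying assumption.

```rust
use std::collections::HashMap;

/// Weight applied to synonym matches (0.6x, as stated in the commit message).
const SYNONYM_WEIGHT: f32 = 0.6;
/// Maximum number of synonyms kept per keyword (as stated in the commit message).
const MAX_SYNONYMS: usize = 5;

/// Hypothetical index shape: term -> list of (node id, weight) entries.
type KeywordIndex = HashMap<String, Vec<(usize, f32)>>;

/// Expands the `top_k` keywords that have the most entries.
/// `generate` is a stand-in for the LLM call; when it returns `Err`,
/// the failure is logged and that keyword is skipped.
/// Returns the number of synonym terms added to the index.
fn expand_synonyms<F>(index: &mut KeywordIndex, top_k: usize, mut generate: F) -> usize
where
    F: FnMut(&str) -> Result<Vec<String>, String>,
{
    // Rank keywords by entry count (descending), then by name for determinism.
    let mut ranked: Vec<(String, usize)> =
        index.iter().map(|(k, v)| (k.clone(), v.len())).collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));

    let mut added = 0;
    for (keyword, _) in ranked.into_iter().take(top_k) {
        let synonyms = match generate(&keyword) {
            Ok(list) => list,
            Err(err) => {
                eprintln!("synonym expansion failed for {keyword}: {err}");
                continue;
            }
        };
        let entries = index[&keyword].clone();
        for synonym in synonyms.into_iter().take(MAX_SYNONYMS) {
            // Assumption: never overwrite a term that is already indexed directly.
            if index.contains_key(&synonym) {
                continue;
            }
            // Copy the keyword's entries with the reduced synonym weight.
            let weighted: Vec<(usize, f32)> = entries
                .iter()
                .map(|&(node, weight)| (node, weight * SYNONYM_WEIGHT))
                .collect();
            index.insert(synonym, weighted);
            added += 1;
        }
    }
    added
}
```

The returned count corresponds to the "synonym count" the commit says is added to indexing metrics; how the real pipeline records it is not shown in the source.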
- Add enable_synonym_expansion option to IndexOptions for expanding keywords with LLM-generated synonyms during indexing to improve recall for differently-worded queries
- Implement cross-reference resolution in enrich stage to extract and resolve in-document references like "see Section 2.1" or "Appendix G" to actual node IDs in the document tree
- Add DocumentTree::set_references method for managing node references and children_with_refs for including referenced nodes
- Include resolved reference count in indexing metrics
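As an illustration of the cross-reference resolution described above, the following sketch pulls labels such as "Section 2.1" or "Appendix G" out of text and maps them to node IDs. It is only an assumption about how such a step could work: `extract_reference_labels`, `resolve_references`, and the label-to-node map are hypothetical, and the real enrich stage may recognize many more reference forms.

```rust
use std::collections::HashMap;

/// Extracts reference labels of the form "Section <digits and dots>" or
/// "Appendix <single uppercase letter>" from free text.
fn extract_reference_labels(text: &str) -> Vec<String> {
    let words: Vec<&str> = text.split_whitespace().collect();
    let mut labels = Vec::new();
    for pair in words.windows(2) {
        // Strip surrounding punctuation, e.g. "(Section" or "2.1,".
        let kind = pair[0].trim_matches(|c: char| !c.is_alphanumeric());
        let target = pair[1].trim_matches(|c: char| !c.is_alphanumeric());
        let is_reference = match kind {
            "Section" => {
                target.starts_with(|c: char| c.is_ascii_digit())
                    && target.chars().all(|c| c.is_ascii_digit() || c == '.')
            }
            "Appendix" => {
                target.len() == 1 && target.chars().all(|c| c.is_ascii_uppercase())
            }
            _ => false,
        };
        if is_reference {
            labels.push(format!("{kind} {target}"));
        }
    }
    labels
}

/// Resolves extracted labels to node IDs using a hypothetical label -> node map.
/// Labels with no matching node are dropped rather than treated as errors.
fn resolve_references(text: &str, label_to_node: &HashMap<String, usize>) -> Vec<usize> {
    extract_reference_labels(text)
        .iter()
        .filter_map(|label| label_to_node.get(label).copied())
        .collect()
}
```

The length of the vector returned by `resolve_references` would be a natural source for the "resolved reference count" metric, though the source does not say how that count is computed.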
…ting

- Add CrossDocumentStrategy and CrossDocumentConfig to support cross-document retrieval with graph-based boosting capabilities
- Replace tree.children() with tree.children_with_refs() in ToCNavigator to include reference nodes in tree traversal
- Implement ForceCrossDocument strategy preference handling in SearchStage with proper graph attachment when available
- Export new cross-document types in strategy module
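The difference between `children()` and `children_with_refs()` that the navigator change relies on can be sketched as below. The `DocumentTree` struct here is a hypothetical stand-in with assumed fields and signatures; only the method names and the described behavior (direct children combined with resolved reference targets, with duplicate node IDs removed) come from the commit messages.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical node identifier; the real type is not shown in the source.
type NodeId = usize;

/// Minimal stand-in for the document tree, holding only what the sketch needs.
#[derive(Default)]
struct DocumentTree {
    children: HashMap<NodeId, Vec<NodeId>>,
    references: HashMap<NodeId, Vec<NodeId>>,
}

impl DocumentTree {
    /// Direct children only; unchanged behavior for existing callers.
    fn children(&self, node: NodeId) -> Vec<NodeId> {
        self.children.get(&node).cloned().unwrap_or_default()
    }

    /// Stores the resolved cross-reference targets for a node.
    fn set_references(&mut self, node: NodeId, targets: Vec<NodeId>) {
        self.references.insert(node, targets);
    }

    /// Direct children followed by referenced targets, with duplicate IDs
    /// removed so a traversal does not visit the same node twice.
    fn children_with_refs(&self, node: NodeId) -> Vec<NodeId> {
        let mut seen = HashSet::new();
        let mut out = Vec::new();
        let refs = self.references.get(&node).into_iter().flatten().copied();
        for id in self.children(node).into_iter().chain(refs) {
            // `insert` returns false when the ID was already present.
            if seen.insert(id) {
                out.push(id);
            }
        }
        out
    }
}
```

A node with no stored references yields the same list from both methods, which is one way to read the backward-compatibility note in the earlier commit.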
… features

- Add architecture documentation explaining the end-to-end pipeline with detailed indexing stages (Parse, Build, Validate, Split, Enhance, Enrich, Reasoning Index, Optimize) and retrieval phases (Analyze, Plan, Search, Evaluate)
- Add examples documentation covering quick query, multi-document retrieval, batch indexing, and cross-document graph usage in both Python and Rust
- Add feature documentation for cross-document graph with relationship building and score boosting, PDF support with page-level tracking, summary strategies (Full, Selective, Lazy), and synonym expansion with LLM-generated alternatives
- Add indexing documentation covering configuration options, incremental indexing with content fingerprinting, and pipeline overview with detailed stage descriptions
- Add retrieval documentation explaining cross-reference navigation for resolving internal document references and overview of search algorithms and strategies
- Update introduction to reflect enhanced capabilities including cross-reference navigation, synonym expansion, multi-algorithm search, and cross-document graph features
No description provided.