
Add llms.txt #393

Merged

KaQuMiQ merged 1 commit into main from feature/llms on Aug 8, 2025
Conversation

@KaQuMiQ (Collaborator) commented Aug 8, 2025

No description provided.

@coderabbitai bot commented Aug 8, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

This change introduces a comprehensive reference document for the Draive framework, a Python 3.12+ functional programming framework for LLM applications. The document details architectural principles, strict typing rules, and essential imports. It distinguishes between state and data models, introduces protocols for service interfaces, and provides usage examples for dependency injection, provider setup, authentication, and multi-provider abstraction. The reference covers context management, the Stage API for AI workflows, evaluation systems, LLM generation, tool integration, multimodal content handling, conversation management, RAG patterns, resource management, utilities, advanced patterns, and error handling. It also includes complete application and testing examples, and enumerates best practices and rules. Additionally, a minor code cleanup was made by removing a generic type parameter from a state instantiation in the instruction refinement helper.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

  • Complexity label: Complex
  • Rationale: The document is extensive, introducing numerous public entities, protocols, service interfaces, utility functions, and best practices. It covers a wide range of framework features, patterns, and usage scenarios, requiring careful review for accuracy, consistency, and completeness. The scope spans architectural concepts, code examples, and advanced patterns, contributing to a higher estimated review effort. The minor code change in the instruction refinement helper is straightforward and does not add significant review complexity.

Possibly related PRs

  • Add instruction generation tool #392: Modifies the same file src/draive/helpers/instruction_refinement.py by removing generic type parameters from internal state classes related to instruction refinement, directly connected at the code level.
  • Update CLAUDE.md #347: Expands and reorganizes documentation for Draive with a focus on architecture and development guidelines, related to the comprehensive reference document added here.
  • Refeine instructions refinement #328: Extensively restructures instruction refinement logic in the same module by adding new state classes and stages, directly related to the minor code cleanup in instruction refinement within this PR.



📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe12906 and 5dfc2ab.

📒 Files selected for processing (3)
  • llms.txt (1 hunks)
  • pyproject.toml (1 hunks)
  • src/draive/helpers/instruction_refinement.py (1 hunks)


@coderabbitai bot left a comment

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81f49db and ff18979.

📒 Files selected for processing (1)
  • llms.txt (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: KaQuMiQ
PR: miquido/draive#338
File: src/draive/lmm/__init__.py:1-2
Timestamp: 2025-06-16T10:28:07.434Z
Learning: The draive project requires Python 3.12+ as specified in pyproject.toml with `requires-python = ">=3.12"` and uses Python 3.12+ specific features like PEP 695 type aliases and generic syntax extensively throughout the codebase.
Learnt from: KaQuMiQ
PR: miquido/draive#327
File: src/draive/helpers/instruction_preparation.py:28-34
Timestamp: 2025-05-28T17:41:57.460Z
Learning: The draive project uses and requires Python 3.12+, so PEP-695 generic syntax with square brackets (e.g., `def func[T: Type]()`) is valid and should be used instead of the older TypeVar approach.
🪛 LanguageTool
llms.txt

[style] ~41-~41: To elevate your writing, try using a synonym here.
Context: ...Reasoning**: LLM application quality is hard to ensure without systematic evaluation...

(HARD_TO)

@coderabbitai bot left a comment

Actionable comments posted: 8

♻️ Duplicate comments (5)
llms.txt (5)

876-913: Remove duplicated “Evaluation System → Built-in Evaluators” section

The block at Lines 1376–1407 duplicates earlier content (Lines 876–913). Keep a single authoritative section to avoid divergence.

-## Evaluation System
-
-### Built-in Evaluators
-```python
-... (lines 1376-1407 content)
-```
+<!-- Removed duplicated “Evaluation System → Built-in Evaluators” section; already covered earlier -->

Also applies to: 1376-1407


1427-1437: Invalid Python: inline class definition inside function arguments

Defining a class directly in the call to ModelGeneration.generate() is invalid. Define the DataModel above, then pass it.

-    evaluation = await ModelGeneration.generate(
-        class CodeEvaluation(DataModel):
-            readability_score: float = Field(description="Score 0-1 for readability")
-            efficiency_score: float = Field(description="Score 0-1 for efficiency")  
-            correctness_score: float = Field(description="Score 0-1 for correctness")
-            overall_score: float = Field(description="Overall score 0-1")
-            feedback: str = Field(description="Detailed feedback")
-        ,
-        instruction=f"Evaluate this {language} code based on: {', '.join(criteria)}",
-        input=code,
-    )
+    class CodeEvaluation(DataModel):
+        readability_score: float = Field(description="Score 0-1 for readability")
+        efficiency_score: float = Field(description="Score 0-1 for efficiency")
+        correctness_score: float = Field(description="Score 0-1 for correctness")
+        overall_score: float = Field(description="Overall score 0-1")
+        feedback: str = Field(description="Detailed feedback")
+
+    evaluation = await ModelGeneration.generate(
+        CodeEvaluation,
+        instruction=f"Evaluate this {language} code based on: {', '.join(criteria)}",
+        input=code,
+    )
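The underlying rule is simply that a `class` statement cannot appear inside an argument list; the corrected pattern can be demonstrated with a plain dataclass standing in for DataModel (a hedged sketch — `generate` below is a hypothetical stub, not the draive API):

```python
from dataclasses import dataclass


@dataclass
class CodeEvaluation:
    overall_score: float
    feedback: str


def generate(model_cls, instruction: str, input: str):
    # Hypothetical stub: a real generator would have the LLM fill the fields
    # of the supplied model class.
    return model_cls(overall_score=1.0, feedback=f"{instruction}: {input}")


# The class is defined first as a statement, then passed by name.
result = generate(CodeEvaluation, instruction="Evaluate", input="print('hi')")
```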

1848-1853: Replace unsafe eval in calculator tool with a safe evaluator

Example code should not normalize unsafe patterns.

 @tool  
 async def calculator(expression: str) -> float:
     """Calculate mathematical expressions."""
-    # Safe evaluation logic
-    return eval(expression)  # Use safe_eval in production
+    # Safe evaluation logic (avoid eval)
+    from simpleeval import simple_eval  # lightweight safe evaluator
+    return float(simple_eval(expression))

Optionally add a short warning paragraph above the snippet explaining risks of eval and safe alternatives.
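If adding the third-party simpleeval dependency is undesirable, a stdlib-only alternative is a small evaluator that whitelists arithmetic AST nodes and rejects everything else — a sketch under that assumption, not part of draive:

```python
import ast
import operator

# Whitelisted operators; any other node type raises ValueError.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def safe_eval(expression: str) -> float:
    """Evaluate a numeric arithmetic expression without using eval()."""

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attributes, subscripts, etc. are all rejected.
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    return _eval(ast.parse(expression, mode="eval").body)
```

Anything outside the whitelist — names, attribute access, function calls — fails closed, so `safe_eval("__import__('os')")` raises instead of executing.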


2099-2109: Invalid Python: inline DataModel in generate() call (ArticleSummaryData)

Same issue as above; define class separately and pass it to the call.

-    summary_result = await ModelGeneration.generate(
-        class ArticleSummaryData(DataModel):
-            summary: str = Field(description="Concise summary in 2-3 sentences")
-            key_points: Sequence[str] = Field(description="3-5 key points from the article")
-            reading_time: int = Field(description="Estimated reading time in minutes")
-        ,
-        instruction="Analyze this article and provide a comprehensive summary",
-        input=f"Title: {article.title}\n\nContent: {article.content}",
-    )
+    class ArticleSummaryData(DataModel):
+        summary: str = Field(description="Concise summary in 2-3 sentences")
+        key_points: Sequence[str] = Field(description="3-5 key points from the article")
+        reading_time: int = Field(description="Estimated reading time in minutes")
+
+    summary_result = await ModelGeneration.generate(
+        ArticleSummaryData,
+        instruction="Analyze this article and provide a comprehensive summary",
+        input=f"Title: {article.title}\n\nContent: {article.content}",
+    )

2430-2437: Consolidate repeated “Performance and Scalability Rules” (Section 15 duplicates Section 11)

Merge into a single authoritative list to avoid drift.

-### 15. Performance and Scalability Rules
-...
-**(duplicate of Section 11 content)**
+<!-- Removed duplicate. Consolidate any unique bullets into Section 11 above. -->

Also applies to: 2400-2405

Comment on lines +637 to +641
    Stage.completion(
        "Format as executive summary with action items",
        instruction="User markdown formatting for the result."
    )
)

🧹 Nitpick (assertive)

Typo in instruction text

“User markdown formatting” → “Use markdown formatting”.

-        instruction="User markdown formatting for the result."
+        instruction="Use markdown formatting for the result."
🤖 Prompt for AI Agents
In llms.txt around lines 637 to 641, there is a typo in the instruction text
where it says "User markdown formatting for the result." Change "User" to "Use"
so the instruction reads "Use markdown formatting for the result."

llms.txt Outdated
Comment on lines 1134 to 1149

```python
@tool
async def analyze_data(data: Sequence[float]) -> DataModel:
    """Analyze numerical data and return structured results."""
    class StatisticalAnalysis(DataModel):
        mean: float
        median: float
        std_dev: float
        count: int

    return StatisticalAnalysis(
        mean=sum(data) / len(data),
        median=sorted(data)[len(data) // 2],
        std_dev=calculate_std_dev(data),
        count=len(data),
    )
```

🛠️ Refactor suggestion

Improve analyze_data: correct median for even-length and remove undefined calculate_std_dev

Use statistics from stdlib; current median calc is wrong for even-length inputs and std dev helper is undefined.

 @tool
 async def analyze_data(data: Sequence[float]) -> DataModel:
     """Analyze numerical data and return structured results."""
     class StatisticalAnalysis(DataModel):
         mean: float
         median: float
         std_dev: float
         count: int
-    
-    return StatisticalAnalysis(
-        mean=sum(data) / len(data),
-        median=sorted(data)[len(data) // 2],
-        std_dev=calculate_std_dev(data),
-        count=len(data),
-    )
+    from statistics import mean as _mean, median as _median, pstdev as _pstdev
+    n = len(data)
+    return StatisticalAnalysis(
+        mean=_mean(data) if n else 0.0,
+        median=_median(data) if n else 0.0,
+        std_dev=_pstdev(data) if n > 1 else 0.0,
+        count=n,
+    )
🤖 Prompt for AI Agents
In llms.txt around lines 1134 to 1149, the median calculation is incorrect for
even-length data sequences and the function calculate_std_dev is undefined.
Replace the manual median calculation with the median function from the Python
statistics module to handle both even and odd lengths correctly. Also, import
and use statistics.stdev for standard deviation instead of the undefined
calculate_std_dev function. Ensure to add the necessary import statement for the
statistics module at the top of the file.
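The behavior the prompt describes is easy to verify directly with the stdlib: for even-length data, `statistics.median` averages the two middle values, while the naive index-based version picks only the upper-middle element.

```python
from statistics import mean, median, pstdev

data = [1.0, 2.0, 3.0, 4.0]

# Naive index-based median: selects the upper-middle element for even lengths.
naive_median = sorted(data)[len(data) // 2]
assert naive_median == 3.0

# statistics.median averages the two middle values instead.
assert median(data) == 2.5
assert mean(data) == 2.5

# Population std dev: variance = 5.0 / 4 = 1.25, so pstdev = sqrt(1.25).
assert abs(pstdev(data) - 1.25 ** 0.5) < 1e-12
```

Whether `pstdev` (population) or `stdev` (sample, n − 1 divisor) is the right replacement depends on what the original `calculate_std_dev` was intended to compute; the suggested diff assumes population standard deviation.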

@KaQuMiQ force-pushed the feature/llms branch 2 times, most recently from fe12906 to 1ce3733 on August 8, 2025 09:54
@KaQuMiQ merged commit 941b6df into main on Aug 8, 2025
4 checks passed
@KaQuMiQ deleted the feature/llms branch on August 8, 2025 10:00
This was referenced Oct 2, 2025
@coderabbitai bot mentioned this pull request Jan 30, 2026