
Add llms.txt #393

Merged

KaQuMiQ merged 1 commit into main from feature/llms on Aug 8, 2025
Conversation

@KaQuMiQ (Collaborator) commented Aug 8, 2025

No description provided.

@coderabbitai bot commented Aug 8, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

This change introduces a comprehensive reference document for the Draive framework, a Python 3.12+ functional programming framework for LLM applications. The document details architectural principles, strict typing rules, and essential imports. It distinguishes between state and data models, introduces protocols for service interfaces, and provides usage examples for dependency injection, provider setup, authentication, and multi-provider abstraction. The reference covers context management, the Stage API for AI workflows, evaluation systems, LLM generation, tool integration, multimodal content handling, conversation management, RAG patterns, resource management, utilities, advanced patterns, and error handling. It also includes complete application and testing examples, and enumerates best practices and rules. Additionally, a minor code cleanup was made by removing a generic type parameter from a state instantiation in the instruction refinement helper.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

  • Complexity label: Complex
  • Rationale: The document is extensive, introducing numerous public entities, protocols, service interfaces, utility functions, and best practices. It covers a wide range of framework features, patterns, and usage scenarios, requiring careful review for accuracy, consistency, and completeness. The scope spans architectural concepts, code examples, and advanced patterns, contributing to a higher estimated review effort. The minor code change in the instruction refinement helper is straightforward and does not add significant review complexity.

Possibly related PRs

  • Add instruction generation tool #392: Modifies the same file src/draive/helpers/instruction_refinement.py by removing generic type parameters from internal state classes related to instruction refinement, directly connected at the code level.
  • Update CLAUDE.md #347: Expands and reorganizes documentation for Draive with a focus on architecture and development guidelines, related to the comprehensive reference document added here.
  • Refeine instructions refinement #328: Extensively restructures instruction refinement logic in the same module by adding new state classes and stages, directly related to the minor code cleanup in instruction refinement within this PR.



📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe12906 and 5dfc2ab.

📒 Files selected for processing (3)
  • llms.txt (1 hunks)
  • pyproject.toml (1 hunks)
  • src/draive/helpers/instruction_refinement.py (1 hunks)


@coderabbitai bot left a comment

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81f49db and ff18979.

📒 Files selected for processing (1)
  • llms.txt (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: KaQuMiQ
PR: miquido/draive#338
File: src/draive/lmm/__init__.py:1-2
Timestamp: 2025-06-16T10:28:07.434Z
Learning: The draive project requires Python 3.12+ as specified in pyproject.toml with `requires-python = ">=3.12"` and uses Python 3.12+ specific features like PEP 695 type aliases and generic syntax extensively throughout the codebase.
Learnt from: KaQuMiQ
PR: miquido/draive#327
File: src/draive/helpers/instruction_preparation.py:28-34
Timestamp: 2025-05-28T17:41:57.460Z
Learning: The draive project uses and requires Python 3.12+, so PEP-695 generic syntax with square brackets (e.g., `def func[T: Type]()`) is valid and should be used instead of the older TypeVar approach.
🪛 LanguageTool
llms.txt

[style] ~41-~41: To elevate your writing, try using a synonym here.
Context: ...Reasoning**: LLM application quality is hard to ensure without systematic evaluation...

(HARD_TO)

@coderabbitai bot left a comment

Actionable comments posted: 8

♻️ Duplicate comments (5)
llms.txt (5)

876-913: Remove duplicated “Evaluation System → Built-in Evaluators” section

The block at Lines 1376–1407 duplicates earlier content (Lines 876–913). Keep a single authoritative section to avoid divergence.

-## Evaluation System
-
-### Built-in Evaluators
-```python
-... (lines 1376-1407 content)
-```
+<!-- Removed duplicated “Evaluation System → Built-in Evaluators” section; already covered earlier -->

Also applies to: 1376-1407


1427-1437: Invalid Python: inline class definition inside function arguments

Defining a class directly in the call to ModelGeneration.generate() is invalid. Define the DataModel above, then pass it.

-    evaluation = await ModelGeneration.generate(
-        class CodeEvaluation(DataModel):
-            readability_score: float = Field(description="Score 0-1 for readability")
-            efficiency_score: float = Field(description="Score 0-1 for efficiency")  
-            correctness_score: float = Field(description="Score 0-1 for correctness")
-            overall_score: float = Field(description="Overall score 0-1")
-            feedback: str = Field(description="Detailed feedback")
-        ,
-        instruction=f"Evaluate this {language} code based on: {', '.join(criteria)}",
-        input=code,
-    )
+    class CodeEvaluation(DataModel):
+        readability_score: float = Field(description="Score 0-1 for readability")
+        efficiency_score: float = Field(description="Score 0-1 for efficiency")
+        correctness_score: float = Field(description="Score 0-1 for correctness")
+        overall_score: float = Field(description="Overall score 0-1")
+        feedback: str = Field(description="Detailed feedback")
+
+    evaluation = await ModelGeneration.generate(
+        CodeEvaluation,
+        instruction=f"Evaluate this {language} code based on: {', '.join(criteria)}",
+        input=code,
+    )
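The underlying rule is simply that a `class` statement cannot appear inside an argument list; the corrected pattern can be demonstrated with a plain dataclass standing in for DataModel (a hedged sketch — `generate` below is a hypothetical stub, not the draive API):

```python
from dataclasses import dataclass


@dataclass
class CodeEvaluation:
    overall_score: float
    feedback: str


def generate(model_cls, instruction: str, input: str):
    # Hypothetical stub: a real generator would have the LLM fill the fields
    # of the supplied model class.
    return model_cls(overall_score=1.0, feedback=f"{instruction}: {input}")


# The class is defined first as a statement, then passed by name.
result = generate(CodeEvaluation, instruction="Evaluate", input="print('hi')")
```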

1848-1853: Replace unsafe eval in calculator tool with a safe evaluator

Example code should not normalize unsafe patterns.

 @tool  
 async def calculator(expression: str) -> float:
     """Calculate mathematical expressions."""
-    # Safe evaluation logic
-    return eval(expression)  # Use safe_eval in production
+    # Safe evaluation logic (avoid eval)
+    from simpleeval import simple_eval  # lightweight safe evaluator
+    return float(simple_eval(expression))

Optionally add a short warning paragraph above the snippet explaining risks of eval and safe alternatives.
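If adding the third-party simpleeval dependency is undesirable, a stdlib-only alternative is a small evaluator that whitelists arithmetic AST nodes and rejects everything else — a sketch under that assumption, not part of draive:

```python
import ast
import operator

# Whitelisted operators; any other node type raises ValueError.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def safe_eval(expression: str) -> float:
    """Evaluate a numeric arithmetic expression without using eval()."""

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attributes, subscripts, etc. are all rejected.
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    return _eval(ast.parse(expression, mode="eval").body)
```

Anything outside the whitelist — names, attribute access, function calls — fails closed, so `safe_eval("__import__('os')")` raises instead of executing.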


2099-2109: Invalid Python: inline DataModel in generate() call (ArticleSummaryData)

Same issue as above; define class separately and pass it to the call.

-    summary_result = await ModelGeneration.generate(
-        class ArticleSummaryData(DataModel):
-            summary: str = Field(description="Concise summary in 2-3 sentences")
-            key_points: Sequence[str] = Field(description="3-5 key points from the article")
-            reading_time: int = Field(description="Estimated reading time in minutes")
-        ,
-        instruction="Analyze this article and provide a comprehensive summary",
-        input=f"Title: {article.title}\n\nContent: {article.content}",
-    )
+    class ArticleSummaryData(DataModel):
+        summary: str = Field(description="Concise summary in 2-3 sentences")
+        key_points: Sequence[str] = Field(description="3-5 key points from the article")
+        reading_time: int = Field(description="Estimated reading time in minutes")
+
+    summary_result = await ModelGeneration.generate(
+        ArticleSummaryData,
+        instruction="Analyze this article and provide a comprehensive summary",
+        input=f"Title: {article.title}\n\nContent: {article.content}",
+    )

2430-2437: Consolidate repeated “Performance and Scalability Rules” (Section 15 duplicates Section 11)

Merge into a single authoritative list to avoid drift.

-### 15. Performance and Scalability Rules
-...
-**(duplicate of Section 11 content)**
+<!-- Removed duplicate. Consolidate any unique bullets into Section 11 above. -->

Also applies to: 2400-2405

Comment on lines +637 to +641
    Stage.completion(
        "Format as executive summary with action items",
        instruction="User markdown formatting for the result."
    )
)

🧹 Nitpick (assertive)

Typo in instruction text

“User markdown formatting” → “Use markdown formatting”.

-        instruction="User markdown formatting for the result."
+        instruction="Use markdown formatting for the result."
🤖 Prompt for AI Agents
In llms.txt around lines 637 to 641, there is a typo in the instruction text
where it says "User markdown formatting for the result." Change "User" to "Use"
so the instruction reads "Use markdown formatting for the result."

llms.txt Outdated
Comment on lines 1134 to 1149

```python
@tool
async def analyze_data(data: Sequence[float]) -> DataModel:
    """Analyze numerical data and return structured results."""
    class StatisticalAnalysis(DataModel):
        mean: float
        median: float
        std_dev: float
        count: int

    return StatisticalAnalysis(
        mean=sum(data) / len(data),
        median=sorted(data)[len(data) // 2],
        std_dev=calculate_std_dev(data),
        count=len(data),
    )
```

🛠️ Refactor suggestion

Improve analyze_data: correct median for even-length and remove undefined calculate_std_dev

Use statistics from stdlib; current median calc is wrong for even-length inputs and std dev helper is undefined.

 @tool
 async def analyze_data(data: Sequence[float]) -> DataModel:
     """Analyze numerical data and return structured results."""
     class StatisticalAnalysis(DataModel):
         mean: float
         median: float
         std_dev: float
         count: int
-    
-    return StatisticalAnalysis(
-        mean=sum(data) / len(data),
-        median=sorted(data)[len(data) // 2],
-        std_dev=calculate_std_dev(data),
-        count=len(data),
-    )
+    from statistics import mean as _mean, median as _median, pstdev as _pstdev
+    n = len(data)
+    return StatisticalAnalysis(
+        mean=_mean(data) if n else 0.0,
+        median=_median(data) if n else 0.0,
+        std_dev=_pstdev(data) if n > 1 else 0.0,
+        count=n,
+    )
🤖 Prompt for AI Agents
In llms.txt around lines 1134 to 1149, the median calculation is incorrect for
even-length data sequences and the function calculate_std_dev is undefined.
Replace the manual median calculation with the median function from the Python
statistics module to handle both even and odd lengths correctly. Also, import
and use statistics.stdev for standard deviation instead of the undefined
calculate_std_dev function. Ensure to add the necessary import statement for the
statistics module at the top of the file.
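The behavior the prompt describes is easy to verify directly with the stdlib: for even-length data, `statistics.median` averages the two middle values, while the naive index-based version picks only the upper-middle element.

```python
from statistics import mean, median, pstdev

data = [1.0, 2.0, 3.0, 4.0]

# Naive index-based median: selects the upper-middle element for even lengths.
naive_median = sorted(data)[len(data) // 2]
assert naive_median == 3.0

# statistics.median averages the two middle values instead.
assert median(data) == 2.5
assert mean(data) == 2.5

# Population std dev: variance = 5.0 / 4 = 1.25, so pstdev = sqrt(1.25).
assert abs(pstdev(data) - 1.25 ** 0.5) < 1e-12
```

Whether `pstdev` (population) or `stdev` (sample, n − 1 divisor) is the right replacement depends on what the original `calculate_std_dev` was intended to compute; the suggested diff assumes population standard deviation.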

@KaQuMiQ force-pushed the feature/llms branch 2 times, most recently from fe12906 to 1ce3733 on August 8, 2025 09:54
@KaQuMiQ merged commit 941b6df into main on Aug 8, 2025
4 checks passed
@KaQuMiQ deleted the feature/llms branch on August 8, 2025 10:00
This was referenced Oct 2, 2025
@coderabbitai bot mentioned this pull request Jan 30, 2026