Update telemetry and error handling #460

Merged
KaQuMiQ merged 1 commit into main from feature/metrics
Nov 4, 2025
Conversation

@KaQuMiQ
Collaborator

@KaQuMiQ KaQuMiQ commented Nov 4, 2025

No description provided.

@coderabbitai
coderabbitai bot commented Nov 4, 2025

Warning

Rate limit exceeded

@KaQuMiQ has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 13 minutes and 42 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between e086ee2 and 8b65ab5.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (31)
  • pyproject.toml (2 hunks)
  • src/draive/cohere/embedding.py (4 hunks)
  • src/draive/conversation/completion/default.py (2 hunks)
  • src/draive/conversation/completion/state.py (2 hunks)
  • src/draive/conversation/realtime/state.py (2 hunks)
  • src/draive/evaluation/scenario.py (2 hunks)
  • src/draive/gemini/embedding.py (2 hunks)
  • src/draive/generation/audio/default.py (1 hunks)
  • src/draive/generation/audio/state.py (2 hunks)
  • src/draive/generation/image/default.py (1 hunks)
  • src/draive/generation/image/state.py (2 hunks)
  • src/draive/generation/model/default.py (1 hunks)
  • src/draive/generation/model/state.py (2 hunks)
  • src/draive/generation/text/default.py (1 hunks)
  • src/draive/generation/text/state.py (2 hunks)
  • src/draive/helpers/volatile_vector_index.py (4 hunks)
  • src/draive/mistral/completions.py (2 hunks)
  • src/draive/mistral/embedding.py (2 hunks)
  • src/draive/models/tools/function.py (4 hunks)
  • src/draive/ollama/embedding.py (2 hunks)
  • src/draive/openai/embedding.py (2 hunks)
  • src/draive/openai/realtime.py (1 hunks)
  • src/draive/postgres/configuration.py (5 hunks)
  • src/draive/postgres/memory.py (4 hunks)
  • src/draive/postgres/templates.py (5 hunks)
  • src/draive/postgres/vector_index.py (7 hunks)
  • src/draive/stages/stage.py (6 hunks)
  • src/draive/utils/memory.py (4 hunks)
  • src/draive/utils/vector_index.py (7 hunks)
  • src/draive/vllm/embedding.py (2 hunks)
  • src/draive/vllm/messages.py (1 hunks)

Walkthrough

This PR increments project version to 0.91.2 and updates haiway to 0.37.6 in pyproject.toml. It introduces runtime observability (ctx.scope and ctx.record with ObservabilityLevel) across many modules (conversation, realtime, generation, stages, evaluation, memory, vector helpers, tools, embeddings, providers). Conversation completion/streaming were refactored to inline memory recall/context construction and centralize remember error handling. Several vector index APIs and callers changed type hints from Iterable to Collection and removed as_tuple usage. Embedding paths add batching metrics and early-empty-input guards. Minor serialization/logging tweaks were applied in multiple places.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas that may need extra attention:
    • Conversation completion and streaming refactor: src/draive/conversation/completion/*
    • Realtime preparation and memory branching: src/draive/conversation/realtime/state.py
    • Observability additions and ctx.scope usage across generation/state, stages, model/state and embeddings
    • Vector index signature/type changes and removal of as_tuple: helpers/postgres/utils vector index files
    • Tool-call error handling changes: src/draive/models/tools/function.py

Possibly related PRs

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
| Check name | Status | Explanation | Resolution |
| Description check | ❓ Inconclusive | No description was provided by the author, making it impossible to assess whether a description relates to the changeset. | Add a pull request description explaining the telemetry and error handling updates, their purpose, and scope of changes across the codebase. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| Title check | ✅ Passed | The title 'Update telemetry and error handling' accurately reflects the primary changes across the pull request, which focus on adding observability instrumentation, metrics logging, and error handling improvements. |

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/draive/evaluation/scenario.py (1)

402-412: Move ctx.record inside the scope block to match the pattern in evaluator.py.

The metric recording is currently placed outside the ctx.scope block (lines 402-412), inconsistent with evaluator.py where ctx.record is indented inside the scope block. Move the entire ctx.record call inside the scope before the return result statement to maintain consistent scoping and ensure metrics are recorded within the active context.

async with ctx.scope(f"evaluator.scenario.{self.name}", *self._state):
    result: EvaluatorScenarioResult = await self._evaluate(...)
    
    ctx.record(
        ObservabilityLevel.INFO,
        metric=f"evaluator.scenario.{result.scenario}.performance",
        value=result.performance,
        unit="%",
        kind="histogram",
        attributes={
            "passed": result.passed,
            "evaluators": [item.evaluator for item in result.results],
        },
    )
    return result
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fa217c9 and fe72609.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (21)
  • pyproject.toml (2 hunks)
  • src/draive/conversation/completion/default.py (2 hunks)
  • src/draive/conversation/completion/state.py (1 hunks)
  • src/draive/conversation/realtime/state.py (1 hunks)
  • src/draive/evaluation/scenario.py (2 hunks)
  • src/draive/generation/audio/default.py (1 hunks)
  • src/draive/generation/audio/state.py (2 hunks)
  • src/draive/generation/image/default.py (1 hunks)
  • src/draive/generation/image/state.py (2 hunks)
  • src/draive/generation/model/default.py (1 hunks)
  • src/draive/generation/model/state.py (2 hunks)
  • src/draive/generation/text/default.py (1 hunks)
  • src/draive/generation/text/state.py (2 hunks)
  • src/draive/helpers/volatile_vector_index.py (2 hunks)
  • src/draive/models/tools/function.py (4 hunks)
  • src/draive/ollama/embedding.py (1 hunks)
  • src/draive/openai/realtime.py (1 hunks)
  • src/draive/postgres/vector_index.py (5 hunks)
  • src/draive/stages/stage.py (5 hunks)
  • src/draive/utils/memory.py (4 hunks)
  • src/draive/utils/vector_index.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use Python 3.12+ features and syntax across the codebase
Format code exclusively with Ruff (make format); do not use other formatters
Skip module-level docstrings

Files:

  • src/draive/utils/memory.py
  • src/draive/generation/audio/state.py
  • src/draive/ollama/embedding.py
  • src/draive/models/tools/function.py
  • src/draive/generation/text/state.py
  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/generation/image/state.py
  • src/draive/stages/stage.py
  • src/draive/openai/realtime.py
  • src/draive/generation/image/default.py
  • src/draive/generation/model/state.py
  • src/draive/utils/vector_index.py
  • src/draive/postgres/vector_index.py
  • src/draive/evaluation/scenario.py
  • src/draive/generation/audio/default.py
  • src/draive/generation/model/default.py
  • src/draive/helpers/volatile_vector_index.py
  • src/draive/generation/text/default.py
  • src/draive/conversation/completion/default.py
src/draive/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/draive/**/*.py: Import Haiway symbols directly (from haiway import State, ctx)
Use ctx.scope(...) to bind scoped Disposables and active State; avoid global state
Route all logs through ctx.log_debug/info/warn/error; do not use print
Use latest, most strict typing syntax (Python 3.12+), with strict typing only for public APIs
Avoid loose Any except at explicit third‑party boundaries
Prefer explicit attribute access with static types; avoid dynamic getattr except at narrow boundaries
Prefer Mapping/Sequence/Iterable in public types over dict/list/set
Use final where applicable; avoid inheritance and prefer composition
Use precise unions (|) and narrow with match/isinstance; avoid cast unless provably safe and localized
Model immutable data/config and facades with haiway.State; provide ergonomic classmethods like .of(...)
Avoid in-place mutation; use State.updated(...) or functional builders to produce new instances
Access active state via haiway.ctx inside async scopes (ctx.scope(...))
Use @statemethod for public state methods that dispatch on the active instance
Log around generation calls, tool dispatch, and provider requests/responses without leaking secrets; prefer structured/concise messages
Add metrics via ctx.record where applicable
All I/O is async; keep boundaries async and use ctx.spawn for detached tasks
Use structured concurrency and valid coroutine usage; rely on haiway/asyncio; avoid custom threading
Construct multimodal content with MultimodalContent.of(...) and compose blocks explicitly
Use ResourceContent/ResourceReference for media/data blobs
Wrap custom types/data within ArtifactContent; use hidden when needed
Add NumPy-style docstrings for public symbols with Parameters/Returns/Raises and rationale when non-obvious
Avoid docstrings on internal helpers; keep names self-explanatory
Keep docstrings high-quality; mkdocstrings pulls them into API reference
Never log secrets or full request bodies containing keys/tokens

Files:

  • src/draive/utils/memory.py
  • src/draive/generation/audio/state.py
  • src/draive/ollama/embedding.py
  • src/draive/models/tools/function.py
  • src/draive/generation/text/state.py
  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/generation/image/state.py
  • src/draive/stages/stage.py
  • src/draive/openai/realtime.py
  • src/draive/generation/image/default.py
  • src/draive/generation/model/state.py
  • src/draive/utils/vector_index.py
  • src/draive/postgres/vector_index.py
  • src/draive/evaluation/scenario.py
  • src/draive/generation/audio/default.py
  • src/draive/generation/model/default.py
  • src/draive/helpers/volatile_vector_index.py
  • src/draive/generation/text/default.py
  • src/draive/conversation/completion/default.py
src/draive/utils/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Keep utilities (e.g., Memory, VectorIndex) under draive/utils/

Files:

  • src/draive/utils/memory.py
  • src/draive/utils/vector_index.py
src/draive/{openai,anthropic,mistral,gemini,vllm,ollama,bedrock,cohere}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/draive/{openai,anthropic,mistral,gemini,vllm,ollama,bedrock,cohere}/**/*.py: Provider-specific feature modules live under their respective provider directories
Translate provider/SDK errors into typed exceptions; do not raise bare Exception and preserve context
Use environment variables for credentials and resolve via helper functions like getenv_str

Files:

  • src/draive/ollama/embedding.py
  • src/draive/openai/realtime.py
src/draive/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Keep core abstractions (GenerativeModel, tools, instructions) under draive/models/

Files:

  • src/draive/models/tools/function.py
src/draive/conversation/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Implement higher-level chat/realtime conversations under draive/conversation/

Files:

  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/conversation/completion/default.py
src/draive/stages/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Implement pipeline stage abstractions and helpers under draive/stages/

Files:

  • src/draive/stages/stage.py
{pyproject.toml,pyrightconfig.json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Ruff, Bandit, and Pyright (strict) via make lint

Files:

  • pyproject.toml
src/draive/{httpx,mcp,postgres,opentelemetry}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place integrations under draive/httpx, draive/mcp, draive/postgres, draive/opentelemetry

Files:

  • src/draive/postgres/vector_index.py
🧠 Learnings (18)
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Add metrics via ctx.record where applicable

Applied to files:

  • src/draive/utils/memory.py
  • src/draive/ollama/embedding.py
  • src/draive/generation/text/state.py
  • src/draive/openai/realtime.py
  • src/draive/generation/model/state.py
  • src/draive/evaluation/scenario.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Access active state via haiway.ctx inside async scopes (ctx.scope(...))

Applied to files:

  • src/draive/utils/memory.py
  • src/draive/generation/audio/state.py
  • src/draive/models/tools/function.py
  • src/draive/generation/text/state.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/image/default.py
  • src/draive/generation/model/state.py
  • src/draive/generation/audio/default.py
  • src/draive/generation/text/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Import Haiway symbols directly (from haiway import State, ctx)

Applied to files:

  • src/draive/utils/memory.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/image/default.py
  • src/draive/generation/model/state.py
  • src/draive/generation/audio/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use ctx.scope(...) to bind scoped Disposables and active State; avoid global state

Applied to files:

  • src/draive/utils/memory.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/text/state.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Model immutable data/config and facades with haiway.State; provide ergonomic classmethods like .of(...)

Applied to files:

  • src/draive/utils/memory.py
  • src/draive/generation/audio/state.py
  • src/draive/models/tools/function.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to tests/**/*.py : Prefer scoping with ctx.scope(...) in async tests and bind required State instances explicitly

Applied to files:

  • src/draive/generation/audio/state.py
  • src/draive/generation/text/state.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use @statemethod for public state methods that dispatch on the active instance

Applied to files:

  • src/draive/generation/audio/state.py
  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Log around generation calls, tool dispatch, and provider requests/responses without leaking secrets; prefer structured/concise messages

Applied to files:

  • src/draive/models/tools/function.py
  • src/draive/generation/text/state.py
  • src/draive/generation/model/state.py
  • src/draive/generation/model/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use structured concurrency and valid coroutine usage; rely on haiway/asyncio; avoid custom threading

Applied to files:

  • src/draive/models/tools/function.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/generation/**/*.{py} : Place typed generation facades and wiring (state.py, types.py, default.py) under draive/generation/

Applied to files:

  • src/draive/generation/text/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/conversation/**/*.py : Implement higher-level chat/realtime conversations under draive/conversation/

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/conversation/completion/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : All I/O is async; keep boundaries async and use ctx.spawn for detached tasks

Applied to files:

  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/stages/**/*.py : Implement pipeline stage abstractions and helpers under draive/stages/

Applied to files:

  • src/draive/stages/stage.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Prefer Mapping/Sequence/Iterable in public types over dict/list/set

Applied to files:

  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/embedding/**/*.py : Keep vector ops, similarity, indexing, and typed embedding states under draive/embedding/

Applied to files:

  • src/draive/utils/vector_index.py
📚 Learning: 2025-06-16T10:28:07.434Z
Learnt from: KaQuMiQ
Repo: miquido/draive PR: 338
File: src/draive/lmm/__init__.py:1-2
Timestamp: 2025-06-16T10:28:07.434Z
Learning: The draive project requires Python 3.12+ as specified in pyproject.toml with "requires-python = ">=3.12"" and uses Python 3.12+ specific features like PEP 695 type aliases and generic syntax extensively throughout the codebase.

Applied to files:

  • pyproject.toml
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/models/**/*.py : Keep core abstractions (GenerativeModel, tools, instructions) under draive/models/

Applied to files:

  • src/draive/generation/model/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/__init__.py : Update src/draive/__init__.py exports when API surface changes

Applied to files:

  • src/draive/generation/text/default.py
🧬 Code graph analysis (13)
src/draive/generation/audio/state.py (2)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/multimodal/templates/repository.py (6)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/ollama/embedding.py (1)
src/draive/parameters/model.py (1)
  • to_mapping (462-478)
src/draive/generation/text/state.py (3)
src/draive/multimodal/templates/types.py (3)
  • Template (39-139)
  • of (56-84)
  • of (162-194)
src/draive/multimodal/templates/repository.py (6)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/models/tools/toolbox.py (1)
  • Toolbox (20-467)
src/draive/conversation/completion/state.py (5)
src/draive/conversation/types.py (5)
  • ConversationMessage (139-200)
  • user (157-172)
  • of (33-52)
  • of (78-89)
  • of (117-130)
src/draive/multimodal/templates/repository.py (6)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
src/draive/utils/memory.py (5)
  • constant (54-66)
  • Memory (52-136)
  • recall (70-73)
  • recall (76-79)
  • recall (82-87)
src/draive/models/types.py (3)
  • ModelMemoryRecall (719-766)
  • ModelInput (429-501)
  • ModelOutput (573-672)
src/draive/multimodal/templates/types.py (2)
  • of (56-84)
  • Template (39-139)
src/draive/conversation/realtime/state.py (5)
src/draive/utils/memory.py (5)
  • Memory (52-136)
  • recall (70-73)
  • recall (76-79)
  • recall (82-87)
  • constant (54-66)
src/draive/models/types.py (3)
  • ModelMemoryRecall (719-766)
  • ModelInput (429-501)
  • ModelOutput (573-672)
src/draive/multimodal/templates/types.py (2)
  • of (56-84)
  • Template (39-139)
src/draive/multimodal/templates/repository.py (3)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
src/draive/models/tools/toolbox.py (1)
  • Toolbox (20-467)
src/draive/generation/image/state.py (2)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/multimodal/templates/repository.py (6)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/stages/stage.py (1)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/generation/image/default.py (3)
src/draive/models/types.py (4)
  • ModelOutput (573-672)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
src/draive/models/generative.py (5)
  • completion (62-71)
  • completion (74-83)
  • completion (87-96)
  • completion (99-108)
  • completion (111-182)
src/draive/multimodal/content.py (3)
  • of (41-65)
  • of (618-646)
  • images (143-158)
src/draive/generation/model/state.py (5)
src/draive/parameters/model.py (2)
  • json_schema (362-370)
  • simplified_schema (352-359)
src/draive/parameters/schema.py (1)
  • simplified_schema (9-26)
src/draive/multimodal/templates/types.py (3)
  • Template (39-139)
  • of (56-84)
  • of (162-194)
src/draive/multimodal/templates/repository.py (7)
  • TemplatesRepository (61-435)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/multimodal/content.py (3)
  • of (41-65)
  • of (618-646)
  • MultimodalContent (24-591)
src/draive/generation/audio/default.py (3)
src/draive/models/types.py (4)
  • ModelOutput (573-672)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
src/draive/models/generative.py (5)
  • completion (62-71)
  • completion (74-83)
  • completion (87-96)
  • completion (99-108)
  • completion (111-182)
src/draive/multimodal/content.py (3)
  • of (41-65)
  • of (618-646)
  • audio (160-175)
src/draive/generation/model/default.py (4)
src/draive/models/types.py (4)
  • ModelOutput (573-672)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
src/draive/models/generative.py (5)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/multimodal/content.py (6)
  • of (41-65)
  • of (618-646)
  • MultimodalContent (24-591)
  • artifacts (207-212)
  • artifacts (215-221)
  • artifacts (223-272)
src/draive/parameters/model.py (1)
  • from_json (373-388)
src/draive/generation/text/default.py (2)
src/draive/models/generative.py (6)
  • GenerativeModel (45-517)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/stages/stage.py (1)
  • loop (1075-1161)
src/draive/conversation/completion/default.py (6)
src/draive/models/types.py (8)
  • ModelMemoryRecall (719-766)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
  • ModelOutput (573-672)
  • content_with_reasoning (625-646)
  • ModelReasoning (518-562)
  • ModelToolRequest (311-353)
src/draive/utils/memory.py (6)
  • recall (70-73)
  • recall (76-79)
  • recall (82-87)
  • remember (91-95)
  • remember (98-102)
  • remember (105-111)
src/draive/postgres/memory.py (2)
  • recall (68-89)
  • remember (91-145)
src/draive/conversation/types.py (6)
  • of (33-52)
  • of (78-89)
  • of (117-130)
  • ConversationMessage (139-200)
  • model (175-190)
  • ConversationOutputChunk (64-97)
src/draive/models/generative.py (5)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/multimodal/artifact.py (1)
  • ArtifactContent (11-96)
🔇 Additional comments (32)
src/draive/evaluation/scenario.py (1)

11-11: LGTM: Import addition supports enhanced telemetry.

The ObservabilityLevel import is correctly added to support granular telemetry control in the ctx.record call below, aligning with the PR's observability enhancements.

src/draive/ollama/embedding.py (1)

39-42: LGTM! Enhanced observability for embedding configuration.

The DEBUG-level recording of the embedding configuration provides valuable telemetry for troubleshooting without cluttering production logs. This follows the established pattern of INFO for high-level attributes and DEBUG for detailed config.

src/draive/generation/audio/state.py (1)

42-55: LGTM! Template traceability added to audio generation.

The scoped context with template identifier recording enables tracing template usage throughout the generation flow. This is consistent with similar changes in image and text generation modules.

src/draive/openai/realtime.py (1)

109-117: LGTM! Explicit observability level for realtime session.

Adding ObservabilityLevel.INFO ensures consistent telemetry granularity for session lifecycle events.

pyproject.toml (2)

8-8: Version bump aligns with observability enhancements.

The patch version increment from 0.91.1 to 0.91.2 appropriately reflects the backward-compatible telemetry improvements in this PR.


27-27: Haiway dependency update enables new observability features.

The upgrade to haiway 0.37.5 provides the ObservabilityLevel enhancements used throughout this PR.

src/draive/generation/image/state.py (1)

42-55: LGTM! Consistent template traceability for image generation.

The implementation mirrors the audio/text generation patterns, ensuring uniform observability across all generation types.

src/draive/models/tools/function.py (3)

276-284: LGTM! Well-structured observability for tool calls.

The tiered approach, with INFO for call identity and DEBUG for detailed arguments, provides the right balance between visibility and verbosity. The __debug__ guard means the detailed recording is skipped entirely when Python runs with the -O flag.
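
A minimal sketch of the tiered recording pattern, assuming a simplified `record` helper in place of ctx.record. The helper, the `records` list, and the attribute names are hypothetical and not taken from function.py.

```python
from typing import Any

records: list[tuple[str, dict[str, Any]]] = []


def record(level: str, attributes: dict[str, Any]) -> None:
    # Hypothetical stand-in for ctx.record; it only stores what it receives.
    records.append((level, attributes))


def call_tool(name: str, arguments: dict[str, Any]) -> None:
    # Call identity is always recorded at INFO.
    record("INFO", {"tool.name": name})
    # __debug__ is True unless the interpreter runs with -O or -OO,
    # so the detailed arguments are recorded only in non-optimized runs.
    if __debug__:
        record("DEBUG", {"tool.arguments": arguments})


call_tool("search", {"query": "example"})
```

Because `__debug__` is a compile-time constant, the guarded block is dropped from the bytecode under -O rather than merely skipped at runtime.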


289-292: LGTM! Result logging aids debugging.

DEBUG-level result recording under __debug__ provides valuable insight during development without impacting production performance.


301-308: LGTM! Consistent error telemetry.

Both exception handlers now record with ObservabilityLevel.ERROR, ensuring errors are captured at the appropriate severity for alerting and monitoring.

Also applies to: 322-329
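
The two-handler shape described here can be sketched as follows. The actual handlers in function.py are not shown in this review, so `ToolError`, `run_tool`, `record`, and the event name are hypothetical stand-ins.

```python
import asyncio
from collections.abc import Awaitable, Callable

events: list[tuple[str, str]] = []


def record(level: str, event: str) -> None:
    # Hypothetical stand-in for ctx.record.
    events.append((level, event))


class ToolError(Exception):
    """Hypothetical typed error that wraps the original failure."""


async def run_tool(name: str, call: Callable[[], Awaitable[str]]) -> str:
    try:
        return await call()
    except ToolError:
        # Already typed: record at ERROR and propagate unchanged.
        record("ERROR", f"tool.{name}.failed")
        raise
    except Exception as exc:
        # Unexpected failure: record at ERROR, wrap, and keep the cause chained.
        record("ERROR", f"tool.{name}.failed")
        raise ToolError(f"Tool {name} failed") from exc


async def broken() -> str:
    # Hypothetical tool body that always fails.
    raise ValueError("boom")
```

Both branches record at the same severity, so alerting rules only need to match one level regardless of which handler fired.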

src/draive/helpers/volatile_vector_index.py (1)

32-32: LGTM! Type refinement improves API clarity.

Changing from Iterable[Model] to Collection[Model] tightens the API by requiring a sized, iterable container, while remaining compatible with the existing implementation that uses as_tuple(values).

src/draive/generation/audio/default.py (1)

21-32: LGTM! Scope management moved to state layer.

The removal of the ctx.scope wrapper aligns with the PR's objective to migrate context scoping from default implementations to state-layer functions. The generation logic, result processing, and error handling remain functionally equivalent.
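
The scope migration can be pictured with the sketch below. It uses a synchronous, hypothetical `scope` context manager in place of ctx.scope (which is used with async with in the snippet earlier in this review); the function names are illustrative only.

```python
from collections.abc import Iterator
from contextlib import contextmanager

active_scopes: list[str] = []
trace: list[tuple[str, ...]] = []


@contextmanager
def scope(name: str) -> Iterator[None]:
    # Hypothetical stand-in for ctx.scope: tracks the active scope names.
    active_scopes.append(name)
    try:
        yield
    finally:
        active_scopes.pop()


def default_generate_audio(prompt: str) -> str:
    # Default implementation: opens no scope of its own, runs in the caller's.
    trace.append(tuple(active_scopes))
    return f"audio:{prompt}"


def generate_audio(prompt: str) -> str:
    # State-layer entry point: opens the scope, then delegates.
    with scope("generate_audio"):
        return default_generate_audio(prompt)
```

Keeping the scope in the state-layer function means a replacement default implementation is still traced under the same scope name.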

src/draive/generation/text/default.py (1)

24-43: LGTM! Consistent with scope migration pattern.

The removal of the surrounding ctx.scope and direct invocation of GenerativeModel.loop follows the same refactoring pattern seen across generation defaults. Context construction and result extraction logic are preserved.

src/draive/generation/text/state.py (1)

48-63: LGTM! Proper observability instrumentation at state layer.

The introduction of ctx.scope("generate_text") and template identifier recording via ctx.record correctly implements the PR's objective to centralize observability at the state layer. Template resolution occurs within the scope before delegating to the underlying generation method.

Based on coding guidelines.

src/draive/generation/image/default.py (1)

21-32: LGTM! Scope removal consistent with refactoring pattern.

The removal of ctx.scope wrapper and direct invocation of GenerativeModel.completion aligns with the PR's systematic migration of observability to state-layer functions. Result processing logic is preserved.

src/draive/utils/vector_index.py (4)

88-100: LGTM! Proper telemetry instrumentation.

The ctx.record call correctly logs the indexing operation with model name and value count. The event name and attributes follow a consistent naming pattern.

Based on coding guidelines.


164-179: LGTM! Search telemetry properly records query presence.

The recording of boolean flags for query and requirements presence (rather than the actual values) is appropriate for observability without leaking potentially sensitive data.

Based on coding guidelines.


211-222: LGTM! Delete operation properly instrumented.

The telemetry correctly logs the model and whether requirements are specified, maintaining consistency with the search operation pattern.

Based on coding guidelines.


1-23: Confirm intentional breaking change and review adopter impact.

The change from Iterable[Model] to Collection[Model] is confirmed as a breaking change. The len(values) call on line 91 justifies the stricter type, since Iterable does not guarantee __len__ support.

Findings:

  • Both concrete implementations (PostgresVectorIndex, VolatileVectorIndex) consistently use Collection[Model]
  • VolatileVectorIndex internally converts values to Sequence via as_tuple() (line 35)
  • No callers of .index() found within the repository
  • VectorIndex is exported as public API (__all__)

Action items:

  1. Confirm this breaking change is intentional for the version being released
  2. Document migration guidance for external adopters (generators/iterators must be wrapped as lists/tuples)
  3. Consider if a deprecation period or major version bump is warranted given this is a public API
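
The migration guidance in item 2 can be sketched with the standard library alone. `count_values` is a hypothetical stand-in for any method that calls len() on its argument; it is not draive's `index()` signature.

```python
from collections.abc import Collection, Iterable


def count_values(values: Collection[int]) -> int:
    # Hypothetical stand-in for a method that needs len(); not draive's API.
    return len(values)


def squares(limit: int) -> Iterable[int]:
    # A generator: iterable, but it has no __len__ and is not a Collection.
    return (n * n for n in range(limit))


generator = squares(3)
is_collection = isinstance(generator, Collection)  # generators are not Collections

try:
    # A strict type checker would reject this call; at runtime len() fails.
    count_values(generator)
    generator_accepted = True
except TypeError:
    generator_accepted = False

# Migration for callers: materialize the generator before passing it.
wrapped_count = count_values(tuple(squares(3)))
```

Callers that currently pass a generator only need to wrap it in `tuple(...)` or `list(...)` at the call site.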

src/draive/generation/model/default.py (1)

31-68: LGTM! Scope removal with preserved logic.

The removal of the ctx.scope wrapper aligns with the PR's migration pattern. The decoding logic (custom decoder → direct artifact → fallback JSON) and error handling are fully preserved, including all logging statements.

src/draive/conversation/completion/state.py (1)

91-138: LGTM! Consolidated memory and template handling.

The addition of ctx.scope("conversation_completion") centralizes observability, and the memory processing logic correctly handles all three cases (None, Memory instance, iterable). Template identifier recording and variable-based instruction resolution are properly implemented.

Based on coding guidelines.

src/draive/stages/stage.py (5)

349-353: LGTM! Template identifier recording for observability.

The conditional recording of template identifiers for both instructions and input when they are Template instances correctly implements the PR's observability enhancement pattern.

Based on coding guidelines.


452-453: LGTM! Consistent template tracking.

Template identifier recording for instructions follows the same pattern as Stage.completion.

Based on coding guidelines.


547-548: LGTM! Loopback completion instrumented.

Template recording is consistent with other completion stages.

Based on coding guidelines.


616-617: LGTM! Result completion instrumented.

Template identifier recording completes the consistent instrumentation across all completion stage variants.

Based on coding guidelines.


2088-2092: LGTM! Direct instruction passing is correct.

At line 2089, instructions is already a string (constructed at lines 2068-2074), so passing it directly to GenerativeModel.loop is correct. No template resolution is needed.

src/draive/generation/model/state.py (1)

61-100: Excellent observability instrumentation!

The context scoping and attribute recording provide comprehensive telemetry for model generation flows. The implementation correctly:

  • Records the generated model's qualified name for tracking
  • Instruments each schema injection path ("full", "simplified", "skip")
  • Captures template identifiers before resolution for traceability
  • Resolves templates with proper argument merging

The scoped execution aligns with the codebase's observability patterns.

Based on learnings and coding guidelines.

src/draive/conversation/completion/default.py (2)

89-123: Clean refactoring with clear control flow.

The linearized flow improves readability while maintaining correct memory handling. The error handling around remember() ensures memory failures don't go unnoticed.

One observation: when remember() fails at lines 113-120, the exception is re-raised after response_message has been constructed but before it is returned. The caller therefore never receives the message even though generation succeeded. Verify that this is the intended behavior, i.e. that memory persistence failures should fail the entire operation rather than return the message with a warning.

Based on coding guidelines.
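
The two behaviors being contrasted can be sketched as follows. This is a standard-library illustration only: `failing_remember`, `complete_strict`, and `complete_lenient` are hypothetical names, and the stdlib logger stands in for whatever logging the project routes through its context.

```python
import asyncio
import logging

logger = logging.getLogger("memory_policy_sketch")


async def failing_remember(message: str) -> None:
    # Hypothetical memory backend that always fails.
    raise RuntimeError("memory backend unavailable")


async def complete_strict(message: str) -> str:
    # Policy A: a persistence failure fails the whole operation,
    # so the caller never sees the generated message.
    await failing_remember(message)
    return message


async def complete_lenient(message: str) -> str:
    # Policy B: log the failure and still return the generated message.
    try:
        await failing_remember(message)
    except Exception as exc:
        logger.warning("remember() failed: %s", exc)
    return message


async def main() -> tuple[bool, str]:
    try:
        await complete_strict("hello")
        strict_raised = False
    except RuntimeError:
        strict_raised = True
    return strict_raised, await complete_lenient("hello")


strict_raised, lenient_result = asyncio.run(main())
```

Policy A matches the behavior described in the comment above; Policy B is the alternative the comment asks the author to rule in or out.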


146-158: Proper handling of ModelReasoning in streaming.

Wrapping ModelReasoning chunks as hidden ArtifactContent with category "reasoning" is consistent with the pattern in ModelOutput.content_with_reasoning and allows reasoning data to flow through the stream without being rendered directly.

src/draive/conversation/realtime/state.py (1)

62-104: Well-structured scoped preparation with comprehensive observability.

The refactored memory handling is clearer:

  • Direct use of Memory instances when provided
  • Clean conversion of ConversationMessage iterables to model context elements
  • Proper recording of instruction template identifiers for traceability

The template resolution with memory_variables as arguments (lines 94-98) correctly handles the case where variables need string conversion for template substitution.

Based on learnings and coding guidelines.

src/draive/postgres/vector_index.py (2)

77-77: Better type precision with Collection.

Changing from Iterable[Model] to Collection[Model] is a good refinement. Collection is more appropriate here because:

  • The values are used in multiple iterations and zip operations requiring size matching
  • Removes the need for as_tuple conversion
  • Provides better type guarantees (sized, iterable, container)

As per coding guidelines preferring precise types.


92-140: Clean removal of unnecessary tuple conversion.

Direct iteration over values (lines 92, 120, 137) eliminates the overhead of the previous as_tuple conversion while maintaining correctness. The Collection type ensures the values can be iterated multiple times safely for both embedding and zip operations.

@KaQuMiQ KaQuMiQ force-pushed the feature/metrics branch 2 times, most recently from 23b7e45 to 4ab339c November 4, 2025 14:08

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/draive/conversation/completion/state.py (1)

143-161: Consider consolidating the stream branching.

The if stream and else branches are nearly identical, differing only in the stream parameter value. This duplication can be eliminated by passing the stream variable directly.

Apply this diff:

-            if stream:
-                return await ctx.state(cls).completing(
-                    instructions=model_instructions,
-                    toolbox=Toolbox.of(tools),
-                    memory=conversation_memory,
-                    input=conversation_message,
-                    stream=True,
-                    **extra,
-                )
-
-            else:
-                return await ctx.state(cls).completing(
-                    instructions=model_instructions,
-                    toolbox=Toolbox.of(tools),
-                    memory=conversation_memory,
-                    input=conversation_message,
-                    stream=False,
-                    **extra,
-                )
+            return await ctx.state(cls).completing(
+                instructions=model_instructions,
+                toolbox=Toolbox.of(tools),
+                memory=conversation_memory,
+                input=conversation_message,
+                stream=stream,
+                **extra,
+            )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fe72609 and 4ab339c.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (21)
  • pyproject.toml (2 hunks)
  • src/draive/conversation/completion/default.py (2 hunks)
  • src/draive/conversation/completion/state.py (2 hunks)
  • src/draive/conversation/realtime/state.py (2 hunks)
  • src/draive/evaluation/scenario.py (2 hunks)
  • src/draive/generation/audio/default.py (1 hunks)
  • src/draive/generation/audio/state.py (2 hunks)
  • src/draive/generation/image/default.py (1 hunks)
  • src/draive/generation/image/state.py (2 hunks)
  • src/draive/generation/model/default.py (1 hunks)
  • src/draive/generation/model/state.py (2 hunks)
  • src/draive/generation/text/default.py (1 hunks)
  • src/draive/generation/text/state.py (2 hunks)
  • src/draive/helpers/volatile_vector_index.py (4 hunks)
  • src/draive/models/tools/function.py (4 hunks)
  • src/draive/ollama/embedding.py (1 hunks)
  • src/draive/openai/realtime.py (1 hunks)
  • src/draive/postgres/vector_index.py (5 hunks)
  • src/draive/stages/stage.py (6 hunks)
  • src/draive/utils/memory.py (4 hunks)
  • src/draive/utils/vector_index.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use Python 3.12+ features and syntax across the codebase
Format code exclusively with Ruff (make format); do not use other formatters
Skip module-level docstrings

Files:

  • src/draive/conversation/completion/default.py
  • src/draive/evaluation/scenario.py
  • src/draive/generation/image/default.py
  • src/draive/postgres/vector_index.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/model/state.py
  • src/draive/utils/memory.py
  • src/draive/models/tools/function.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/audio/default.py
  • src/draive/conversation/realtime/state.py
  • src/draive/ollama/embedding.py
  • src/draive/generation/text/default.py
  • src/draive/stages/stage.py
  • src/draive/generation/text/state.py
  • src/draive/generation/model/default.py
  • src/draive/helpers/volatile_vector_index.py
  • src/draive/openai/realtime.py
  • src/draive/generation/image/state.py
  • src/draive/utils/vector_index.py
src/draive/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/draive/**/*.py: Import Haiway symbols directly (from haiway import State, ctx)
Use ctx.scope(...) to bind scoped Disposables and active State; avoid global state
Route all logs through ctx.log_debug/info/warn/error; do not use print
Use latest, most strict typing syntax (Python 3.12+), with strict typing only for public APIs
Avoid loose Any except at explicit third‑party boundaries
Prefer explicit attribute access with static types; avoid dynamic getattr except at narrow boundaries
Prefer Mapping/Sequence/Iterable in public types over dict/list/set
Use final where applicable; avoid inheritance and prefer composition
Use precise unions (|) and narrow with match/isinstance; avoid cast unless provably safe and localized
Model immutable data/config and facades with haiway.State; provide ergonomic classmethods like .of(...)
Avoid in-place mutation; use State.updated(...) or functional builders to produce new instances
Access active state via haiway.ctx inside async scopes (ctx.scope(...))
Use @statemethod for public state methods that dispatch on the active instance
Log around generation calls, tool dispatch, and provider requests/responses without leaking secrets; prefer structured/concise messages
Add metrics via ctx.record where applicable
All I/O is async; keep boundaries async and use ctx.spawn for detached tasks
Use structured concurrency and valid coroutine usage; rely on haiway/asyncio; avoid custom threading
Construct multimodal content with MultimodalContent.of(...) and compose blocks explicitly
Use ResourceContent/ResourceReference for media/data blobs
Wrap custom types/data within ArtifactContent; use hidden when needed
Add NumPy-style docstrings for public symbols with Parameters/Returns/Raises and rationale when non-obvious
Avoid docstrings on internal helpers; keep names self-explanatory
Keep docstrings high-quality; mkdocstrings pulls them into API reference
Never log secrets or full request bodies containing keys/tokens

Files:

  • src/draive/conversation/completion/default.py
  • src/draive/evaluation/scenario.py
  • src/draive/generation/image/default.py
  • src/draive/postgres/vector_index.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/model/state.py
  • src/draive/utils/memory.py
  • src/draive/models/tools/function.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/audio/default.py
  • src/draive/conversation/realtime/state.py
  • src/draive/ollama/embedding.py
  • src/draive/generation/text/default.py
  • src/draive/stages/stage.py
  • src/draive/generation/text/state.py
  • src/draive/generation/model/default.py
  • src/draive/helpers/volatile_vector_index.py
  • src/draive/openai/realtime.py
  • src/draive/generation/image/state.py
  • src/draive/utils/vector_index.py
src/draive/conversation/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Implement higher-level chat/realtime conversations under draive/conversation/

Files:

  • src/draive/conversation/completion/default.py
  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
src/draive/{httpx,mcp,postgres,opentelemetry}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place integrations under draive/httpx, draive/mcp, draive/postgres, draive/opentelemetry

Files:

  • src/draive/postgres/vector_index.py
src/draive/utils/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Keep utilities (e.g., Memory, VectorIndex) under draive/utils/

Files:

  • src/draive/utils/memory.py
  • src/draive/utils/vector_index.py
src/draive/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Keep core abstractions (GenerativeModel, tools, instructions) under draive/models/

Files:

  • src/draive/models/tools/function.py
src/draive/{openai,anthropic,mistral,gemini,vllm,ollama,bedrock,cohere}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/draive/{openai,anthropic,mistral,gemini,vllm,ollama,bedrock,cohere}/**/*.py: Provider-specific feature modules live under their respective provider directories
Translate provider/SDK errors into typed exceptions; do not raise bare Exception and preserve context
Use environment variables for credentials and resolve via helper functions like getenv_str

Files:

  • src/draive/ollama/embedding.py
  • src/draive/openai/realtime.py
src/draive/stages/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Implement pipeline stage abstractions and helpers under draive/stages/

Files:

  • src/draive/stages/stage.py
{pyproject.toml,pyrightconfig.json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Ruff, Bandit, and Pyright (strict) via make lint

Files:

  • pyproject.toml
🧠 Learnings (17)
📓 Common learnings
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Access active state via haiway.ctx inside async scopes (ctx.scope(...))
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/conversation/**/*.py : Implement higher-level chat/realtime conversations under draive/conversation/

Applied to files:

  • src/draive/conversation/completion/default.py
  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Add metrics via ctx.record where applicable

Applied to files:

  • src/draive/evaluation/scenario.py
  • src/draive/utils/memory.py
  • src/draive/ollama/embedding.py
  • src/draive/generation/text/state.py
  • src/draive/openai/realtime.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Access active state via haiway.ctx inside async scopes (ctx.scope(...))

Applied to files:

  • src/draive/generation/image/default.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/model/state.py
  • src/draive/utils/memory.py
  • src/draive/models/tools/function.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/audio/default.py
  • src/draive/generation/text/default.py
  • src/draive/generation/text/state.py
  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Import Haiway symbols directly (from haiway import State, ctx)

Applied to files:

  • src/draive/generation/image/default.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/model/state.py
  • src/draive/utils/memory.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/audio/default.py
  • src/draive/generation/text/state.py
  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/embedding/**/*.py : Keep vector ops, similarity, indexing, and typed embedding states under draive/embedding/

Applied to files:

  • src/draive/postgres/vector_index.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use ctx.scope(...) to bind scoped Disposables and active State; avoid global state

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/generation/model/state.py
  • src/draive/generation/text/state.py
  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to tests/**/*.py : Prefer scoping with ctx.scope(...) in async tests and bind required State instances explicitly

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use structured concurrency and valid coroutine usage; rely on haiway/asyncio; avoid custom threading

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/models/tools/function.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Log around generation calls, tool dispatch, and provider requests/responses without leaking secrets; prefer structured/concise messages

Applied to files:

  • src/draive/generation/model/state.py
  • src/draive/models/tools/function.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/text/state.py
  • src/draive/generation/model/default.py
  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Model immutable data/config and facades with haiway.State; provide ergonomic classmethods like .of(...)

Applied to files:

  • src/draive/generation/model/state.py
  • src/draive/utils/memory.py
  • src/draive/models/tools/function.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/text/state.py
  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/generation/**/*.{py} : Place typed generation facades and wiring (state.py, types.py, default.py) under draive/generation/

Applied to files:

  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Prefer Mapping/Sequence/Iterable in public types over dict/list/set

Applied to files:

  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use statemethod for public state methods that dispatch on the active instance

Applied to files:

  • src/draive/generation/audio/state.py
  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/stages/**/*.py : Implement pipeline stage abstractions and helpers under draive/stages/

Applied to files:

  • src/draive/stages/stage.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/models/**/*.py : Keep core abstractions (GenerativeModel, tools, instructions) under draive/models/

Applied to files:

  • src/draive/generation/model/default.py
📚 Learning: 2025-06-16T10:28:07.434Z
Learnt from: KaQuMiQ
Repo: miquido/draive PR: 338
File: src/draive/lmm/__init__.py:1-2
Timestamp: 2025-06-16T10:28:07.434Z
Learning: The draive project requires Python 3.12+ as specified in pyproject.toml with "requires-python = ">=3.12"" and uses Python 3.12+ specific features like PEP 695 type aliases and generic syntax extensively throughout the codebase.

Applied to files:

  • pyproject.toml
🧬 Code graph analysis (13)
src/draive/conversation/completion/default.py (6)
src/draive/models/types.py (8)
  • ModelMemoryRecall (719-766)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
  • ModelOutput (573-672)
  • content_with_reasoning (625-646)
  • ModelReasoning (518-562)
  • ModelToolRequest (311-353)
src/draive/utils/memory.py (6)
  • recall (70-73)
  • recall (76-79)
  • recall (82-90)
  • remember (94-98)
  • remember (101-105)
  • remember (108-117)
src/draive/postgres/memory.py (2)
  • recall (68-89)
  • remember (91-145)
src/draive/conversation/types.py (6)
  • of (33-52)
  • of (78-89)
  • of (117-130)
  • ConversationMessage (139-200)
  • model (175-190)
  • ConversationOutputChunk (64-97)
src/draive/models/generative.py (6)
  • GenerativeModel (45-517)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/multimodal/artifact.py (1)
  • ArtifactContent (11-96)
src/draive/generation/image/default.py (3)
src/draive/models/types.py (4)
  • ModelOutput (573-672)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
src/draive/models/generative.py (5)
  • completion (62-71)
  • completion (74-83)
  • completion (87-96)
  • completion (99-108)
  • completion (111-182)
src/draive/multimodal/content.py (3)
  • of (41-65)
  • of (618-646)
  • images (143-158)
src/draive/conversation/completion/state.py (3)
src/draive/multimodal/templates/repository.py (7)
  • TemplatesRepository (61-435)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
src/draive/utils/memory.py (5)
  • constant (54-66)
  • Memory (52-145)
  • recall (70-73)
  • recall (76-79)
  • recall (82-90)
src/draive/stages/stage.py (1)
  • memory_recall (192-246)
src/draive/generation/model/state.py (3)
src/draive/multimodal/templates/types.py (3)
  • Template (39-139)
  • of (56-84)
  • of (162-194)
src/draive/multimodal/templates/repository.py (7)
  • TemplatesRepository (61-435)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/models/tools/toolbox.py (1)
  • Toolbox (20-467)
src/draive/generation/audio/state.py (1)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/generation/audio/default.py (3)
src/draive/models/types.py (4)
  • ModelOutput (573-672)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
src/draive/models/generative.py (6)
  • GenerativeModel (45-517)
  • completion (62-71)
  • completion (74-83)
  • completion (87-96)
  • completion (99-108)
  • completion (111-182)
src/draive/multimodal/content.py (3)
  • of (41-65)
  • of (618-646)
  • audio (160-175)
src/draive/conversation/realtime/state.py (5)
src/draive/utils/memory.py (5)
  • Memory (52-145)
  • recall (70-73)
  • recall (76-79)
  • recall (82-90)
  • constant (54-66)
src/draive/models/types.py (1)
  • ModelMemoryRecall (719-766)
src/draive/multimodal/templates/types.py (2)
  • of (56-84)
  • Template (39-139)
src/draive/multimodal/templates/repository.py (3)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
src/draive/models/tools/toolbox.py (1)
  • Toolbox (20-467)
src/draive/ollama/embedding.py (1)
src/draive/parameters/model.py (1)
  • to_mapping (462-478)
src/draive/generation/text/default.py (3)
src/draive/models/types.py (2)
  • ModelOutput (573-672)
  • ModelInput (429-501)
src/draive/models/generative.py (5)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/multimodal/content.py (5)
  • of (41-65)
  • of (618-646)
  • MultimodalContent (24-591)
  • to_str (287-296)
  • to_str (652-665)
src/draive/stages/stage.py (1)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/generation/text/state.py (2)
src/draive/multimodal/templates/types.py (3)
  • Template (39-139)
  • of (56-84)
  • of (162-194)
src/draive/multimodal/templates/repository.py (6)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/generation/model/default.py (4)
src/draive/models/types.py (4)
  • ModelOutput (573-672)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
src/draive/models/generative.py (5)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/multimodal/content.py (6)
  • of (41-65)
  • of (618-646)
  • MultimodalContent (24-591)
  • artifacts (207-212)
  • artifacts (215-221)
  • artifacts (223-272)
src/draive/parameters/model.py (1)
  • to_json (420-436)
src/draive/generation/image/state.py (2)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/multimodal/templates/repository.py (6)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
🔇 Additional comments (32)
src/draive/models/tools/function.py (3)

4-4: LGTM: Import updated correctly.

The addition of ObservabilityLevel is necessary for the enhanced observability instrumentation below.


276-279: LGTM: Appropriate observability record.

Recording the tool invocation at INFO level with call_id provides good traceability without exposing sensitive data.


301-334: Verify that exception messages don't expose sensitive data.

The ERROR-level records include exception messages in their attributes (lines 306 and 327). While important for debugging, exception messages can sometimes contain sensitive information (connection strings, API keys, file paths, PII).

Ensure that:

  1. Tools are implemented to avoid including secrets in exception messages
  2. Upstream code doesn't pass sensitive data that could appear in errors
  3. The formatted error content (line 317) properly redacts sensitive information via format_error

Note: The exception handling structure correctly distinguishes between regular exceptions (wrapped in ToolError) and system exceptions like KeyboardInterrupt (re-raised), which is good practice.
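
The wrap-versus-re-raise split mentioned in the note can be illustrated with a small standard-library sketch. `WrappedToolError` and `call_tool` are hypothetical names for this example and do not reproduce draive's ToolError or its tool-call code.

```python
from collections.abc import Callable


class WrappedToolError(Exception):
    """Hypothetical wrapper type used only in this sketch."""


def call_tool(tool: Callable[[], str]) -> str:
    try:
        return tool()
    except Exception as exc:
        # Regular failures are wrapped with a generic message; the original
        # exception stays reachable through __cause__.
        raise WrappedToolError("tool failed") from exc
    except BaseException:
        # KeyboardInterrupt, SystemExit and similar propagate unchanged.
        raise


def broken_tool() -> str:
    raise ValueError("bad input")


def interrupted_tool() -> str:
    raise KeyboardInterrupt


try:
    call_tool(broken_tool)
    wrapped = None
except WrappedToolError as exc:
    wrapped = exc

try:
    call_tool(interrupted_tool)
    interrupted = False
except KeyboardInterrupt:
    interrupted = True
```

Using a generic message on the wrapper is one way to keep the original exception text out of anything that is recorded by default, while `__cause__` preserves it for debugging.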

src/draive/ollama/embedding.py (1)

39-42: The logging pattern is secure; no sensitive fields are exposed.

The OllamaEmbeddingConfig class contains only non-sensitive configuration parameters (model name, concurrent flag, batch_size). The to_mapping() method serializes exactly these three fields—all safe to log. Ollama credentials are managed separately via base URL and headers, not stored in embedding configuration. The DEBUG-level logging properly follows the two-tier pattern without exposing secrets.

src/draive/generation/audio/state.py (1)

42-61: Scoped telemetry coverage looks great.

The new ctx.scope block and template recordings capture the right identifiers without disturbing the generation flow.

src/draive/openai/realtime.py (1)

110-118: Explicit observability level is spot on.

Passing ObservabilityLevel.INFO keeps the realtime session telemetry consistent with the revised ctx.record API.

src/draive/evaluation/scenario.py (1)

403-412: Scenario telemetry addition looks solid.

Recording performance as a histogram with pass status and evaluator list gives actionable metrics while respecting the new observability contract.

src/draive/generation/audio/default.py (1)

21-32: Helper stays clean and aligned with state scope.

Relying on the state-layer scope while keeping the completion call untouched maintains behavior and avoids redundant nesting.

src/draive/generation/image/state.py (1)

42-61: Consistent observability for image generation.

The scoped execution and template recordings bring image generation in line with the broader telemetry upgrades.

src/draive/generation/text/state.py (1)

48-69: LGTM! Clean observability instrumentation.

The addition of the ctx.scope("generate_text") wrapper with template identifier recording provides valuable telemetry without affecting the generation logic. The pattern of recording template identifiers before resolution is consistent with the broader observability enhancements in this PR.

src/draive/generation/model/default.py (1)

31-68: LGTM! Scope removal aligns with refactoring pattern.

The removal of the ctx.scope wrapper is consistent with the broader refactoring in this PR, where observability scopes are moved from low-level default.py implementations to higher-level state.py facades. The generation logic, context assembly, and error handling remain functionally identical.

src/draive/generation/text/default.py (1)

24-43: LGTM! Consistent scope removal.

The removal of the ctx.scope wrapper follows the same refactoring pattern observed across generation modules. The generation logic and context assembly remain unchanged, with scoping now handled at the state layer.

src/draive/utils/vector_index.py (2)

95-102: LGTM! Observability instrumentation enhances vector operations telemetry.

The ctx.record calls provide valuable metrics for vector index operations, capturing model types, value counts, and query/requirements presence at appropriate granularity. The use of ObservabilityLevel.INFO is consistent with other instrumentation in this PR.

Also applies to: 172-180, 220-227


1-1: Collection constraint is justified and does not break existing callers in the codebase.

Both PostgresVectorIndex and VolatileVectorIndex implementations already require Collection semantics: they iterate over values in a for loop, then pass values again to zip(..., strict=True), which requires the input to support repeated iteration. Generators cannot satisfy this pattern: they are single-use iterables that would fail at runtime in the existing code if passed.

No callers in the codebase pass generators or pure iterators to the .index() method. The type signature change from Iterable to Collection makes an implicit runtime requirement explicit, improving type safety without introducing a breaking change to actual usage.

src/draive/generation/image/default.py (1)

21-32: LGTM! Scope removal follows established pattern.

The removal of the ctx.scope wrapper is consistent with the refactoring pattern applied across generation modules. The image generation and error handling logic remain functionally equivalent.

pyproject.toml (1)

8-8: LGTM! Version bumps support observability enhancements.

The version bump to 0.91.2 and the haiway dependency update to 0.37.6 align with the observability instrumentation introduced throughout this PR. The minor version increment is appropriate for the additive changes.

Also applies to: 27-27

src/draive/utils/memory.py (1)

86-89: LGTM! Observability instrumentation properly integrated.

The ctx.record calls with ObservabilityLevel.INFO provide consistent telemetry for memory lifecycle operations. The implementation addresses the feedback from previous review rounds and follows the same observability pattern used throughout this PR.

Also applies to: 113-116, 137-140

src/draive/generation/model/state.py (1)

61-118: LGTM! Comprehensive observability instrumentation.

The scoped context with multi-attribute recording provides excellent visibility into model generation flows, capturing the generated model type, schema injection strategy, and template identifiers. The implementation follows the established pattern of adding observability at the state layer.

src/draive/postgres/vector_index.py (2)

3-3: LGTM: Import updates support Collection-based refactoring.

The import changes correctly align with the parameter type update and removal of the as_tuple conversion helper.

Also applies to: 7-7


77-77: Verify external callers can provide Collection inputs for this breaking API change.

The type change from Iterable[Model] to Collection[Model] is semantically necessary—the implementation iterates values twice (line 92 for selection, then lines 120/137 for embedding pairing). However, this is a breaking change for any external callers passing generators or single-pass iterables. No call sites were found in the codebase; verify that all downstream consumers can safely pass collections (lists, tuples, sets) rather than generators.

src/draive/helpers/volatile_vector_index.py (1)

3-3: Verify that callers provide Collection-compatible inputs.

The parameter type changed from Iterable[Model] to Collection[Model], mirroring the change in postgres/vector_index.py. This is necessary because the implementation iterates over values multiple times (line 49 for selection, line 86 for embedding pairing). Ensure all call sites pass collections rather than generators or single-pass iterables.

As per coding guidelines: prefer Mapping/Sequence/Iterable in public types, but Collection is appropriate here due to the multiple-iteration requirement.

Also applies to: 6-6, 32-32, 49-49, 86-86

src/draive/conversation/completion/state.py (2)

4-4: LGTM: Observability instrumentation aligns with project patterns.

The addition of ObservabilityLevel and wrapping the completion flow in ctx.scope("conversation_completion") is consistent with the project's observability strategy and coding guidelines.

Based on learnings and coding guidelines.

Also applies to: 91-91


127-131: LGTM: Template identifier recording enhances observability.

Recording the template identifier when instructions is a Template provides useful observability without affecting behavior. This pattern is consistently applied across the codebase.

src/draive/stages/stage.py (2)

19-19: LGTM: Consistent template observability across completion methods.

The template identifier recording is consistently applied across all completion methods (completion, prompting_completion, loopback_completion, result_completion). Recording both instructions.template and input.template where applicable provides comprehensive observability without affecting behavior.

Also applies to: 350-360, 459-463, 557-561, 629-633


2075-2108: LGTM: Simplified _model_routing instructions handling.

The change at line 2105 correctly passes the instructions string directly to GenerativeModel.loop without unnecessary re-resolution. The instructions variable is already a plain string constructed from options_text, so the previous await TemplatesRepository.resolve_str(instructions) was redundant.

src/draive/conversation/completion/default.py (2)

89-123: LGTM: Refactored completion flow is clear and explicit.

The refactoring inlines memory recall and context preparation, improving readability by removing nested scopes. The explicit error handling for memory.remember() (lines 112-120) logs the error and re-raises, ensuring proper error propagation while maintaining observability.


133-171: LGTM: Streaming refactoring maintains consistency and proper artifact handling.

The streaming path refactoring mirrors the non-streaming path with inline memory recall and context preparation. The handling of ModelReasoning chunks (lines 146-154) correctly wraps them as hidden ArtifactContent, consistent with the ModelOutput.content_with_reasoning pattern. The remember() call after streaming completes ensures context persistence with proper error handling.

src/draive/conversation/realtime/state.py (5)

4-4: LGTM: Import addition supports observability.

The ObservabilityLevel import is correctly added to support the ctx.record call introduced later in the file.


62-63: LGTM: Proper scope usage for observability.

The ctx.scope("conversation_realtime") correctly wraps the preparation logic and reads the active state, following the coding guidelines for haiway state management.


74-85: LGTM: Memory context conversion logic is correct.

The generator function properly converts conversation messages to model context elements (user messages → ModelInput, others → ModelOutput), and the constant memory construction using ModelMemory.constant(ModelMemoryRecall.of(*model_context_elements())) follows the expected pattern.


87-91: LGTM: Appropriate observability instrumentation.

The ctx.record call correctly tracks template usage with the identifier, following the coding guidelines to add metrics via ctx.record where applicable. No sensitive data is logged.


93-107: LGTM: Template resolution and preparation call are correct.

The template resolution with TemplatesRepository.resolve_str properly handles both Template and plain string instructions. The conditional expression for arguments (lines 96-101) safely converts memory variables to strings and correctly handles None or empty dictionaries. All parameters are appropriately passed to the preparing call.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/draive/evaluation/scenario.py (1)

402-412: Move telemetry recording inside the scope block.

The ctx.record call at lines 402-412 is outside the ctx.scope block (which ends at line 400). This is inconsistent with codebase patterns where ctx.record is always called within the active scope. Moving the metric recording inside the scope ensures proper scope context and attribute binding.

The .evaluator attribute on line 410 is confirmed to exist on EvaluatorResult instances—no changes needed there.

Suggested fix:

    async def __call__(
        self,
        value: Value,
        /,
        *args: Args.args,
        **kwargs: Args.kwargs,
    ) -> EvaluatorScenarioResult:
        async with ctx.scope(f"evaluator.scenario.{self.name}", *self._state):
            result: EvaluatorScenarioResult = await self._evaluate(
                value,
                *args,
                **kwargs,
            )
-
-        ctx.record(
-            ObservabilityLevel.INFO,
-            metric=f"evaluator.scenario.{result.scenario}.performance",
-            value=result.performance,
-            unit="%",
-            kind="histogram",
-            attributes={
-                "passed": result.passed,
-                "evaluators": [result.evaluator for result in result.results],
-            },
-        )
+            ctx.record(
+                ObservabilityLevel.INFO,
+                metric=f"evaluator.scenario.{result.scenario}.performance",
+                value=result.performance,
+                unit="%",
+                kind="histogram",
+                attributes={
+                    "passed": result.passed,
+                    "evaluators": [result.evaluator for result in result.results],
+                },
+            )
         return result
♻️ Duplicate comments (2)
src/draive/models/tools/function.py (1)

280-292: Previous security concerns remain unaddressed.

The concerns raised in the previous review regarding unfiltered logging of arguments and results have not been resolved:

  1. Line 283 still contains unnecessary dict unpacking ({**{...}}).
  2. Raw argument values (line 283) and results (line 291) continue to be logged in debug mode without redaction, risking exposure of secrets or PII in development and staging environments.

As per coding guidelines: "Never log secrets or full request bodies containing keys/tokens"

Please address the previous review comment by implementing one of the suggested solutions before merging.
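
One possible shape for such a fix, as a minimal sketch only: the previously suggested solutions are not reproduced in this comment, so `redact_arguments`, `SENSITIVE_KEY_PARTS`, and the name-based heuristic below are all hypothetical and would need adapting to the project's conventions.

```python
from collections.abc import Mapping

# Hypothetical substrings that mark an argument name as sensitive.
SENSITIVE_KEY_PARTS: tuple[str, ...] = ("key", "token", "secret", "password")


def redact_arguments(
    arguments: Mapping[str, object],
    max_length: int = 64,
) -> dict[str, str]:
    redacted: dict[str, str] = {}
    for name, value in arguments.items():
        # Mask the whole value when the argument name looks sensitive.
        if any(part in name.lower() for part in SENSITIVE_KEY_PARTS):
            redacted[name] = "<redacted>"
            continue
        text = repr(value)
        # Truncate long values so full payloads never reach the logs.
        redacted[name] = text if len(text) <= max_length else text[:max_length] + "..."
    return redacted
```

A heuristic like this only inspects argument names; secrets embedded inside otherwise innocuous values would still pass through, so it should be treated as a baseline rather than a complete safeguard.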

src/draive/conversation/completion/state.py (1)

143-161: Consider consolidating the stream branching.

The stream/non-stream branches are nearly identical except for the stream parameter. While explicit, this creates duplication.

Apply this diff to reduce duplication:

-            if stream:
-                return await ctx.state(cls).completing(
-                    instructions=model_instructions,
-                    toolbox=Toolbox.of(tools),
-                    memory=conversation_memory,
-                    input=conversation_message,
-                    stream=True,
-                    **extra,
-                )
-
-            else:
-                return await ctx.state(cls).completing(
-                    instructions=model_instructions,
-                    toolbox=Toolbox.of(tools),
-                    memory=conversation_memory,
-                    input=conversation_message,
-                    stream=False,
-                    **extra,
-                )
+            return await ctx.state(cls).completing(
+                instructions=model_instructions,
+                toolbox=Toolbox.of(tools),
+                memory=conversation_memory,
+                input=conversation_message,
+                stream=stream,
+                **extra,
+            )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4ab339c and d26b525.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (27)
  • pyproject.toml (2 hunks)
  • src/draive/cohere/embedding.py (4 hunks)
  • src/draive/conversation/completion/default.py (2 hunks)
  • src/draive/conversation/completion/state.py (2 hunks)
  • src/draive/conversation/realtime/state.py (2 hunks)
  • src/draive/evaluation/scenario.py (2 hunks)
  • src/draive/gemini/embedding.py (2 hunks)
  • src/draive/generation/audio/default.py (1 hunks)
  • src/draive/generation/audio/state.py (2 hunks)
  • src/draive/generation/image/default.py (1 hunks)
  • src/draive/generation/image/state.py (2 hunks)
  • src/draive/generation/model/default.py (1 hunks)
  • src/draive/generation/model/state.py (2 hunks)
  • src/draive/generation/text/default.py (1 hunks)
  • src/draive/generation/text/state.py (2 hunks)
  • src/draive/helpers/volatile_vector_index.py (4 hunks)
  • src/draive/mistral/completions.py (1 hunks)
  • src/draive/mistral/embedding.py (2 hunks)
  • src/draive/models/tools/function.py (4 hunks)
  • src/draive/ollama/embedding.py (2 hunks)
  • src/draive/openai/embedding.py (2 hunks)
  • src/draive/openai/realtime.py (1 hunks)
  • src/draive/postgres/vector_index.py (5 hunks)
  • src/draive/stages/stage.py (6 hunks)
  • src/draive/utils/memory.py (4 hunks)
  • src/draive/utils/vector_index.py (7 hunks)
  • src/draive/vllm/embedding.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use Python 3.12+ features and syntax across the codebase
Format code exclusively with Ruff (make format); do not use other formatters
Skip module-level docstrings

Files:

  • src/draive/conversation/completion/default.py
  • src/draive/models/tools/function.py
  • src/draive/vllm/embedding.py
  • src/draive/generation/model/default.py
  • src/draive/helpers/volatile_vector_index.py
  • src/draive/generation/text/default.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/image/default.py
  • src/draive/ollama/embedding.py
  • src/draive/generation/audio/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/generation/audio/default.py
  • src/draive/utils/vector_index.py
  • src/draive/utils/memory.py
  • src/draive/mistral/embedding.py
  • src/draive/evaluation/scenario.py
  • src/draive/generation/text/state.py
  • src/draive/openai/realtime.py
  • src/draive/openai/embedding.py
  • src/draive/postgres/vector_index.py
  • src/draive/generation/image/state.py
  • src/draive/stages/stage.py
  • src/draive/generation/model/state.py
  • src/draive/gemini/embedding.py
  • src/draive/mistral/completions.py
  • src/draive/cohere/embedding.py
src/draive/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/draive/**/*.py: Import Haiway symbols directly (from haiway import State, ctx)
Use ctx.scope(...) to bind scoped Disposables and active State; avoid global state
Route all logs through ctx.log_debug/info/warn/error; do not use print
Use latest, most strict typing syntax (Python 3.12+), with strict typing only for public APIs
Avoid loose Any except at explicit third‑party boundaries
Prefer explicit attribute access with static types; avoid dynamic getattr except at narrow boundaries
Prefer Mapping/Sequence/Iterable in public types over dict/list/set
Use final where applicable; avoid inheritance and prefer composition
Use precise unions (|) and narrow with match/isinstance; avoid cast unless provably safe and localized
Model immutable data/config and facades with haiway.State; provide ergonomic classmethods like .of(...)
Avoid in-place mutation; use State.updated(...) or functional builders to produce new instances
Access active state via haiway.ctx inside async scopes (ctx.scope(...))
Use @statemethod for public state methods that dispatch on the active instance
Log around generation calls, tool dispatch, and provider requests/responses without leaking secrets; prefer structured/concise messages
Add metrics via ctx.record where applicable
All I/O is async; keep boundaries async and use ctx.spawn for detached tasks
Use structured concurrency and valid coroutine usage; rely on haiway/asyncio; avoid custom threading
Construct multimodal content with MultimodalContent.of(...) and compose blocks explicitly
Use ResourceContent/ResourceReference for media/data blobs
Wrap custom types/data within ArtifactContent; use hidden when needed
Add NumPy-style docstrings for public symbols with Parameters/Returns/Raises and rationale when non-obvious
Avoid docstrings on internal helpers; keep names self-explanatory
Keep docstrings high-quality; mkdocstrings pulls them into API reference
Never log secrets or full request bodies containing keys/tokens

Files:

  • src/draive/conversation/completion/default.py
  • src/draive/models/tools/function.py
  • src/draive/vllm/embedding.py
  • src/draive/generation/model/default.py
  • src/draive/helpers/volatile_vector_index.py
  • src/draive/generation/text/default.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/image/default.py
  • src/draive/ollama/embedding.py
  • src/draive/generation/audio/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/generation/audio/default.py
  • src/draive/utils/vector_index.py
  • src/draive/utils/memory.py
  • src/draive/mistral/embedding.py
  • src/draive/evaluation/scenario.py
  • src/draive/generation/text/state.py
  • src/draive/openai/realtime.py
  • src/draive/openai/embedding.py
  • src/draive/postgres/vector_index.py
  • src/draive/generation/image/state.py
  • src/draive/stages/stage.py
  • src/draive/generation/model/state.py
  • src/draive/gemini/embedding.py
  • src/draive/mistral/completions.py
  • src/draive/cohere/embedding.py
src/draive/conversation/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Implement higher-level chat/realtime conversations under draive/conversation/

Files:

  • src/draive/conversation/completion/default.py
  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
src/draive/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Keep core abstractions (GenerativeModel, tools, instructions) under draive/models/

Files:

  • src/draive/models/tools/function.py
src/draive/{openai,anthropic,mistral,gemini,vllm,ollama,bedrock,cohere}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/draive/{openai,anthropic,mistral,gemini,vllm,ollama,bedrock,cohere}/**/*.py: Provider-specific feature modules live under their respective provider directories
Translate provider/SDK errors into typed exceptions; do not raise bare Exception and preserve context
Use environment variables for credentials and resolve via helper functions like getenv_str

Files:

  • src/draive/vllm/embedding.py
  • src/draive/ollama/embedding.py
  • src/draive/mistral/embedding.py
  • src/draive/openai/realtime.py
  • src/draive/openai/embedding.py
  • src/draive/gemini/embedding.py
  • src/draive/mistral/completions.py
  • src/draive/cohere/embedding.py
src/draive/utils/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Keep utilities (e.g., Memory, VectorIndex) under draive/utils/

Files:

  • src/draive/utils/vector_index.py
  • src/draive/utils/memory.py
src/draive/{httpx,mcp,postgres,opentelemetry}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place integrations under draive/httpx, draive/mcp, draive/postgres, draive/opentelemetry

Files:

  • src/draive/postgres/vector_index.py
src/draive/stages/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Implement pipeline stage abstractions and helpers under draive/stages/

Files:

  • src/draive/stages/stage.py
{pyproject.toml,pyrightconfig.json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Ruff, Bandit, and Pyright (strict) via make lint

Files:

  • pyproject.toml
🧠 Learnings (18)
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Log around generation calls, tool dispatch, and provider requests/responses without leaking secrets; prefer structured/concise messages

Applied to files:

  • src/draive/models/tools/function.py
  • src/draive/generation/model/default.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/text/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Never log secrets or full request bodies containing keys/tokens

Applied to files:

  • src/draive/models/tools/function.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use structured concurrency and valid coroutine usage; rely on haiway/asyncio; avoid custom threading

Applied to files:

  • src/draive/models/tools/function.py
  • src/draive/conversation/completion/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Access active state via haiway.ctx inside async scopes (ctx.scope(...))

Applied to files:

  • src/draive/models/tools/function.py
  • src/draive/generation/text/default.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/image/default.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/audio/default.py
  • src/draive/utils/memory.py
  • src/draive/generation/text/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Model immutable data/config and facades with haiway.State; provide ergonomic classmethods like .of(...)

Applied to files:

  • src/draive/models/tools/function.py
  • src/draive/generation/audio/state.py
  • src/draive/utils/memory.py
  • src/draive/generation/text/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/embedding/**/*.py : Keep vector ops, similarity, indexing, and typed embedding states under draive/embedding/

Applied to files:

  • src/draive/vllm/embedding.py
  • src/draive/ollama/embedding.py
  • src/draive/utils/vector_index.py
  • src/draive/openai/embedding.py
  • src/draive/gemini/embedding.py
  • src/draive/cohere/embedding.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/models/**/*.py : Keep core abstractions (GenerativeModel, tools, instructions) under draive/models/

Applied to files:

  • src/draive/generation/model/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use ctx.scope(...) to bind scoped Disposables and active State; avoid global state

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/generation/text/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/conversation/**/*.py : Implement higher-level chat/realtime conversations under draive/conversation/

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to tests/**/*.py : Prefer scoping with ctx.scope(...) in async tests and bind required State instances explicitly

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/generation/text/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Import Haiway symbols directly (from haiway import State, ctx)

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/generation/image/default.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/audio/default.py
  • src/draive/utils/memory.py
  • src/draive/generation/text/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use statemethod for public state methods that dispatch on the active instance

Applied to files:

  • src/draive/generation/audio/state.py
  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/utils/**/*.py : Keep utilities (e.g., Memory, VectorIndex) under draive/utils/

Applied to files:

  • src/draive/utils/vector_index.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Add metrics via ctx.record where applicable

Applied to files:

  • src/draive/utils/memory.py
  • src/draive/evaluation/scenario.py
  • src/draive/openai/realtime.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/stages/**/*.py : Implement pipeline stage abstractions and helpers under draive/stages/

Applied to files:

  • src/draive/stages/stage.py
📚 Learning: 2025-06-16T10:28:07.434Z
Learnt from: KaQuMiQ
Repo: miquido/draive PR: 338
File: src/draive/lmm/__init__.py:1-2
Timestamp: 2025-06-16T10:28:07.434Z
Learning: The draive project requires Python 3.12+ as specified in pyproject.toml with "requires-python = ">=3.12"" and uses Python 3.12+ specific features like PEP 695 type aliases and generic syntax extensively throughout the codebase.

Applied to files:

  • pyproject.toml
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Avoid in-place mutation; use State.updated(...) or functional builders to produce new instances

Applied to files:

  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Prefer Mapping/Sequence/Iterable in public types over dict/list/set

Applied to files:

  • src/draive/generation/model/state.py
🧬 Code graph analysis (13)
src/draive/conversation/completion/default.py (6)
src/draive/models/types.py (8)
  • ModelMemoryRecall (719-766)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
  • ModelOutput (573-672)
  • content_with_reasoning (625-646)
  • ModelReasoning (518-562)
  • ModelToolRequest (311-353)
src/draive/utils/memory.py (6)
  • recall (70-73)
  • recall (76-79)
  • recall (82-90)
  • remember (94-98)
  • remember (101-105)
  • remember (108-117)
src/draive/postgres/memory.py (2)
  • recall (68-89)
  • remember (91-145)
src/draive/conversation/types.py (6)
  • of (33-52)
  • of (78-89)
  • of (117-130)
  • ConversationMessage (139-200)
  • model (175-190)
  • ConversationOutputChunk (64-97)
src/draive/models/generative.py (6)
  • GenerativeModel (45-517)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/multimodal/artifact.py (1)
  • ArtifactContent (11-96)
src/draive/generation/model/default.py (3)
src/draive/models/types.py (4)
  • ModelOutput (573-672)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
src/draive/models/generative.py (5)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/multimodal/content.py (6)
  • of (41-65)
  • of (618-646)
  • MultimodalContent (24-591)
  • artifacts (207-212)
  • artifacts (215-221)
  • artifacts (223-272)
src/draive/generation/text/default.py (3)
src/draive/models/types.py (1)
  • ModelOutput (573-672)
src/draive/models/generative.py (5)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/multimodal/content.py (5)
  • of (41-65)
  • of (618-646)
  • MultimodalContent (24-591)
  • to_str (287-296)
  • to_str (652-665)
src/draive/conversation/completion/state.py (3)
src/draive/multimodal/templates/repository.py (6)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
src/draive/utils/memory.py (3)
  • recall (70-73)
  • recall (76-79)
  • recall (82-90)
src/draive/multimodal/templates/types.py (2)
  • of (56-84)
  • Template (39-139)
src/draive/generation/image/default.py (3)
src/draive/models/types.py (4)
  • ModelOutput (573-672)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
src/draive/models/generative.py (5)
  • completion (62-71)
  • completion (74-83)
  • completion (87-96)
  • completion (99-108)
  • completion (111-182)
src/draive/multimodal/content.py (3)
  • of (41-65)
  • of (618-646)
  • images (143-158)
src/draive/ollama/embedding.py (1)
src/draive/parameters/model.py (1)
  • to_mapping (462-478)
src/draive/generation/audio/state.py (2)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/multimodal/templates/repository.py (6)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/conversation/realtime/state.py (3)
src/draive/utils/memory.py (5)
  • Memory (52-145)
  • recall (70-73)
  • recall (76-79)
  • recall (82-90)
  • constant (54-66)
src/draive/models/types.py (3)
  • ModelMemoryRecall (719-766)
  • ModelInput (429-501)
  • ModelOutput (573-672)
src/draive/multimodal/templates/repository.py (3)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
src/draive/generation/audio/default.py (3)
src/draive/models/types.py (3)
  • ModelOutput (573-672)
  • content (474-478)
  • content (618-622)
src/draive/models/generative.py (5)
  • completion (62-71)
  • completion (74-83)
  • completion (87-96)
  • completion (99-108)
  • completion (111-182)
src/draive/multimodal/content.py (3)
  • of (41-65)
  • of (618-646)
  • audio (160-175)
src/draive/generation/text/state.py (2)
src/draive/multimodal/templates/types.py (3)
  • Template (39-139)
  • of (56-84)
  • of (162-194)
src/draive/multimodal/templates/repository.py (6)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/generation/image/state.py (2)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/multimodal/templates/repository.py (7)
  • TemplatesRepository (61-435)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/stages/stage.py (1)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/generation/model/state.py (3)
src/draive/multimodal/templates/types.py (3)
  • Template (39-139)
  • of (56-84)
  • of (162-194)
src/draive/multimodal/templates/repository.py (6)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/models/tools/toolbox.py (1)
  • Toolbox (20-467)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: test (macos-latest, 3.13)
🔇 Additional comments (38)
src/draive/openai/realtime.py (1)

109-118: LGTM! Consistent observability instrumentation.

The addition of ObservabilityLevel.INFO makes this telemetry call consistent with other ctx.record() calls in the file (lines 223-248) and properly instruments session startup with appropriate metadata.

Based on learnings.

src/draive/mistral/completions.py (1)

184-203: LGTM: Improved telemetry consistency.

The metric renaming from lmm.input_tokens/lmm.output_tokens to model.input_tokens/model.output_tokens and the attribute restructuring from lmm.model to model.provider/model.name improve observability consistency across providers. The provider/name split enables better aggregation and filtering in telemetry systems.
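
As a rough illustration of the aggregation benefit, the sketch below groups token counts by provider. The record dictionaries are hypothetical sample data shaped after the metric and attribute names mentioned above, with placeholder model names; they are not output captured from haiway or draive.

```python
from collections import Counter

# Hypothetical sample records using the renamed metric and split attributes.
records = [
    {
        "metric": "model.input_tokens",
        "value": 120,
        "attributes": {"model.provider": "mistral", "model.name": "model-a"},
    },
    {
        "metric": "model.input_tokens",
        "value": 80,
        "attributes": {"model.provider": "mistral", "model.name": "model-b"},
    },
    {
        "metric": "model.input_tokens",
        "value": 50,
        "attributes": {"model.provider": "openai", "model.name": "model-c"},
    },
]

# With provider and name split, grouping by provider needs no string parsing.
tokens_by_provider: Counter[str] = Counter()
for record in records:
    if record["metric"] == "model.input_tokens":
        tokens_by_provider[record["attributes"]["model.provider"]] += record["value"]
```

With a single combined attribute, the same grouping would require parsing the provider out of the model string first.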

src/draive/models/tools/function.py (2)

276-279: LGTM: INFO record for tool invocation tracking.

The INFO-level record with call_id provides appropriate observability for tracking tool invocations without exposing sensitive data.


300-334: LGTM: Appropriate ERROR records for exception tracking.

The ERROR-level records for both Exception and BaseException cases provide necessary observability for debugging tool failures. The pattern of logging exception type and message is standard practice and appropriately balances observability with security concerns.

src/draive/evaluation/scenario.py (1)

11-11: LGTM: Import addition supports telemetry feature.

The ObservabilityLevel import from haiway is correct and necessary for the telemetry recording added below.

src/draive/generation/audio/state.py (1)

42-61: LGTM! Observability pattern correctly implemented.

The addition of ctx.scope and template identifier logging aligns with the PR's observability strategy. The flow correctly resolves templates within the instrumented scope and maintains the existing generation logic.

src/draive/generation/text/state.py (1)

48-69: LGTM! Observability pattern correctly implemented.

The instrumentation follows the same pattern as other generation modules. Template resolution, toolbox construction, and example transformation are correctly executed within the observed scope.

pyproject.toml (2)

8-8: Version bump aligns with the observability enhancements.

The patch version increment from 0.91.1 to 0.91.2 appropriately reflects the addition of telemetry and error handling improvements throughout the codebase.


27-27: haiway 0.37.6 verified as available and compatible.

Version 0.37.6 is available on PyPI, and no security vulnerabilities were found. The codebase extensively uses both ObservabilityLevel and ctx from haiway across 50+ files for observability and logging, confirming these features are core dependencies that will function with this version.

src/draive/utils/memory.py (3)

86-89: LGTM! Observability level now explicit.

The ctx.record call now includes ObservabilityLevel.INFO, addressing the previous review feedback and ensuring consistency with the PR's observability patterns.


113-116: LGTM! Observability level now explicit.

The ctx.record call for memory.remember correctly includes ObservabilityLevel.INFO, consistent with the recall method.


137-140: LGTM! Observability level now explicit.

The ctx.record call for memory.maintenance correctly includes ObservabilityLevel.INFO, completing the consistent instrumentation across all memory lifecycle methods.

src/draive/generation/model/state.py (1)

61-118: LGTM! Comprehensive observability instrumentation added.

The generation flow now includes:

  • Scope wrapping for the entire model generation
  • Model name tracking via generated.__qualname__
  • Schema injection mode recording
  • Template identifier logging when applicable

The instrumentation aligns with the PR's observability strategy and maintains the existing generation logic.

src/draive/helpers/volatile_vector_index.py (2)

49-49: LGTM! Direct iteration over Collection is correct.

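The removal of as_tuple and direct use of the values Collection in the loop (line 49) and zip (line 86) is correct and more efficient, as Collection already supports iteration and len(). Collection itself does not promise a meaningful order (a set is also a Collection), but built-in collections iterate in a stable order while unmodified, which is what the zip on line 86 needs.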

Also applies to: 86-86


32-32: Collection type aligns with protocol contract and wrapper requirements—no breaking change.

The concern in the review is based on incomplete analysis. The VectorIndexing Protocol defines the contract with values: Collection[Model], and critically, VectorIndex.index() calls len(values) for observability metrics, which requires Collection. Both VolatileVectorIndex and PostgresVectorIndex implement with values: Collection[Model], making this change consistent across all implementations and with the protocol contract. Generators and arbitrary iterables were never valid per the defined protocol—this is not a breaking change relative to the interface specification, but rather correct alignment with the established contract.

src/draive/vllm/embedding.py (2)

39-39: LGTM! Batch size added to initial context.

Recording embedding.batch_size in the initial context provides useful configuration visibility for debugging and monitoring.


51-79: LGTM! Metrics and early exit correctly implemented.

The additions include:

  • embedding.items metric tracking the count of attributes to embed
  • Early exit optimization when there are no attributes
  • embedding.batches metric with correct ceiling division calculation

The metrics provide valuable observability for embedding operations without altering the core logic.
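
The ceiling division and early exit described above can be sketched as follows. This is a standalone illustration under assumed names (`batch_count` and `split_batches` are hypothetical helpers), not the code from the embedding modules.

```python
def batch_count(items: int, batch_size: int) -> int:
    """Number of batches needed for `items` elements (ceiling division)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Integer-only ceiling division: avoids float rounding from math.ceil(a / b).
    return (items + batch_size - 1) // batch_size


def split_batches(values: list, batch_size: int) -> list[list]:
    """Split values into consecutive batches, skipping all work when empty."""
    if not values:
        # Early exit: nothing to embed, so no provider request is needed.
        return []
    return [values[i : i + batch_size] for i in range(0, len(values), batch_size)]
```

The count computed up front matches the number of slices actually produced, so a metric recorded before the requests stays consistent with the work performed.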

src/draive/openai/embedding.py (2)

39-39: LGTM! Batch size added to initial context.

Recording embedding.batch_size provides configuration visibility consistent with the VLLM embedding implementation.


52-80: LGTM! Metrics and early exit correctly implemented.

The instrumentation mirrors the VLLM implementation with:

  • embedding.items metric for tracking attribute count
  • Early exit when no attributes need embedding
  • embedding.batches metric with correct ceiling division

This consistent pattern across embedding providers enhances observability uniformly.

src/draive/mistral/embedding.py (1)

37-37: LGTM! Excellent observability instrumentation.

The addition of batch_size context, items/batches metrics, and the early-exit guard for empty inputs improves both observability and performance. The ceiling division for batch calculation is correct, and the metric attributes (provider, model, type) provide good context for monitoring.

Also applies to: 49-77

src/draive/generation/image/state.py (1)

42-61: LGTM! Well-structured observability scope.

The addition of ctx.scope("generate_image") and conditional template identifier logging follows the coding guidelines and provides good observability. Template resolution is correctly performed within the scope, and the logging avoids leaking sensitive content.

Based on learnings.

src/draive/gemini/embedding.py (1)

39-39: LGTM! Consistent observability pattern.

The observability instrumentation mirrors the pattern used across other embedding providers. The batch_size context, items/batches metrics, and early-exit optimization are all correctly implemented and provide consistent monitoring across providers.

Also applies to: 51-79

src/draive/generation/text/default.py (1)

24-43: LGTM! Clean refactoring with preserved behavior.

The removal of the ctx.scope wrapper aligns with the broader refactoring pattern where observability scoping is moved to state modules. The generation logic remains functionally equivalent, with the same context construction and return value.

src/draive/generation/audio/default.py (1)

21-32: LGTM! Consistent refactoring pattern.

The scope removal follows the same pattern as other generation default modules. The audio generation logic, context construction, and error handling remain functionally identical.

src/draive/generation/image/default.py (1)

21-32: LGTM! Refactoring maintains functional equivalence.

The scope removal is consistent with the refactoring pattern applied across generation modules. The image generation call, result processing, and error handling remain unchanged in behavior.

src/draive/generation/model/default.py (1)

31-68: LGTM! Scope refactoring with preserved semantics.

The elevation of GenerativeModel.loop to top-level and the deindentation of the try/except block maintain functional equivalence. The context construction, decoder selection logic (custom → artifact → JSON fallback), and error handling all remain correct.

src/draive/cohere/embedding.py (2)

38-38: LGTM! Text embedding observability enhancements.

The observability instrumentation for text embeddings is consistent with other providers. The batch_size context, items/batches metrics, and early-exit guard are all correctly implemented.

Also applies to: 51-79


133-133: LGTM! Image embedding observability enhancements.

The same observability pattern is correctly applied to the image embedding path, maintaining consistency across both text and image embeddings within the Cohere provider.

Also applies to: 145-173

src/draive/ollama/embedding.py (1)

67-68: LGTM! Early exit for empty input.

The early exit correctly prevents unnecessary API calls when no attributes are provided. Metrics are already recorded before this check, preserving observability.

src/draive/conversation/completion/default.py (1)

89-123: LGTM! Clean inlining of memory lifecycle.

The refactored flow clearly separates memory recall, model invocation, finalization, and remember. Error handling around memory.remember appropriately logs failures before re-raising.

src/draive/stages/stage.py (2)

350-360: LGTM! Consistent template observability instrumentation.

The template observability recording is consistently applied across all completion methods (completion, prompting_completion, loopback_completion, result_completion). The pattern correctly checks for Template instances and records their identifiers.

Also applies to: 459-463, 557-561, 629-633


2104-2108: LGTM! Correct handling of string instructions.

The change to pass instructions directly is appropriate since it's already resolved to a string by this point (line 2084). No need to re-resolve a string value.

src/draive/postgres/vector_index.py (1)

77-77: LGTM! Simplified by removing as_tuple conversion.

The removal of the intermediate as_tuple(values) conversion simplifies the code and aligns with the Collection[Model] type constraint. Direct iteration over values is more efficient and equally correct.

Also applies to: 92-92, 120-120, 137-137

src/draive/utils/vector_index.py (1)

1-1: Collection constraint is necessary and consistently applied throughout the VectorIndex API.

The type change from Iterable[Model] to Collection[Model] is required for the len(values) call in telemetry (line 100). This constraint is:

  • Already enforced in the VectorIndexing Protocol (line 28)
  • Already enforced in both PostgresVectorIndex and VolatileVectorIndex implementations
  • Consistently applied across all overloads of the index method

No internal callers of .index() were found in the codebase, indicating no internal breaking changes. This is a deliberate design decision to support structured observability metrics. External callers passing non-Collection iterables (generators, iterators) will need to convert to a Collection type (e.g., list, tuple), but this appears to be an acceptable constraint for the public API.
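
A short sketch of why len() forces the Collection constraint and how an external caller might adapt a generator. The helper names are hypothetical and the functions are simplified stand-ins, not the VectorIndex API.

```python
from collections.abc import Collection, Iterable


def items_for_telemetry(values: Collection[str]) -> int:
    # len() is part of the Collection contract (Sized + Iterable + Container).
    return len(values)


def as_collection(values: Iterable[str]) -> Collection[str]:
    """Materialize one-shot iterables; pass real collections through untouched."""
    if isinstance(values, Collection):
        return values
    # Generators and other iterators have no len(), so consume them once.
    return tuple(values)
```

Passing lists and tuples through unchanged avoids an extra copy, while generators are consumed exactly once at the boundary.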

src/draive/conversation/realtime/state.py (4)

4-4: LGTM: Import additions support observability instrumentation.

The added BasicValue and ObservabilityLevel imports are correctly used for memory variables typing (line 66) and observability recording (line 89).


62-85: Well-structured scoping and memory handling.

The async context scope and memory branching logic are correctly implemented:

  • ctx.scope("conversation_realtime") properly scopes the preparation flow per coding guidelines.
  • The Memory branch (lines 67-70) correctly recalls and extracts variables.
  • The Iterable branch (lines 72-85) correctly converts ConversationMessage items to ModelInput/ModelOutput elements.
  • The generator function appropriately yields ModelInput.of(message.content) for user messages and ModelOutput.of(message.content) for others.

Regarding the past review comment on type hints (line 42 vs line 67):

The current type hint memory: ModelMemory | Iterable[ConversationMessage] is correct for the public API contract. The runtime check isinstance(memory, Memory) is appropriate because:

  1. ModelMemory is a type alias for Memory[ModelMemoryRecall, ModelContextElement], which is erased at runtime.
  2. Python's runtime type system cannot check generic type parameters.
  3. The type hint correctly specifies what callers should pass; the isinstance check discriminates between Memory and Iterable at runtime.
  4. Line 69 expects ModelMemoryRecall from recall(), confirming the intent is ModelMemory.

Broadening the type hint to Memory would weaken the API contract without providing additional safety.
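
The runtime erasure argument can be demonstrated with a small standalone example. `Box` and `IntBox` are hypothetical stand-ins for Memory and its parameterized alias; the sketch only shows general Python behavior and does not reproduce how ModelMemory is declared in draive.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Box(Generic[T]):
    """Hypothetical generic container standing in for Memory."""

    def __init__(self, value: T) -> None:
        self.value = value


# Parameterized alias, analogous in spirit to an alias of Memory[...].
IntBox = Box[int]


def is_box(candidate: object) -> bool:
    # Checking against the unparameterized class works, but it cannot
    # distinguish Box[int] from Box[str]: instances do not store the parameter.
    return isinstance(candidate, Box)


def alias_check_raises(candidate: object) -> bool:
    # isinstance against a subscripted generic is rejected with TypeError.
    try:
        isinstance(candidate, IntBox)
    except TypeError:
        return True
    return False
```

This is why a runtime check can only discriminate on the base class, while the parameterized form remains useful purely as a static annotation.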


87-91: LGTM: Observability instrumentation correctly implemented.

The template identifier recording is well-placed and uses the appropriate observability level. This aligns with the coding guidelines for adding metrics via ctx.record.


93-107: LGTM: Template resolution and preparation call correctly structured.

The inline return with template resolution is well-implemented:

  • TemplatesRepository.resolve_str is called with properly constructed arguments.
  • The conditional expression on lines 96-101 safely handles None memory_variables: the dict comprehension is only evaluated when memory_variables is truthy.
  • String conversion (lines 97-98) appropriately handles BasicValue types (int, float, etc.) for template formatting.
  • The Toolbox.of(tools) construction and parameter forwarding are correct.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (5)
src/draive/models/tools/function.py (1)

281-292: [Duplicate] Risk: Unfiltered logging of arguments and results may leak secrets or PII.

This concern was previously raised: The debug records unconditionally log all arguments (line 283) and results (line 291) as strings, which risks exposing secrets or PII in development/staging environments. Additionally, line 283 has unnecessary dict unpacking ({**{...}}).

Refer to the previous review comment for suggested mitigations (filtering capability, documentation warning, or removing value content from debug logs).

As per coding guidelines: "Never log secrets or full request bodies containing keys/tokens"
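
One possible shape for the filtering mitigation is sketched below. The set of sensitive key names and the `redacted` helper are assumptions chosen for illustration; a real implementation would need a policy agreed with the maintainers.

```python
from collections.abc import Mapping

# Hypothetical deny-list; real projects would make this configurable.
SENSITIVE_KEYS = frozenset({"api_key", "token", "password", "authorization"})


def redacted(
    arguments: Mapping[str, object],
    sensitive: frozenset[str] = SENSITIVE_KEYS,
) -> dict[str, str]:
    """Build a log-safe copy of tool arguments with sensitive values masked."""
    # A plain comprehension already creates a new dict, so wrapping it
    # in {**{...}} would add nothing.
    return {
        key: "<redacted>" if key.lower() in sensitive else str(value)
        for key, value in arguments.items()
    }
```

Key-based masking does not catch secrets embedded inside free-form values, so dropping value content from debug records entirely remains the more conservative option.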

src/draive/conversation/completion/state.py (1)

143-161: Consider consolidating the stream branching.

The stream/non-stream branches duplicate the call to ctx.state(cls).completing with only the stream parameter differing.

Apply this diff to reduce duplication:

-            if stream:
-                return await ctx.state(cls).completing(
-                    instructions=model_instructions,
-                    toolbox=Toolbox.of(tools),
-                    memory=conversation_memory,
-                    input=conversation_message,
-                    stream=True,
-                    **extra,
-                )
-
-            else:
-                return await ctx.state(cls).completing(
-                    instructions=model_instructions,
-                    toolbox=Toolbox.of(tools),
-                    memory=conversation_memory,
-                    input=conversation_message,
-                    stream=False,
-                    **extra,
-                )
+            return await ctx.state(cls).completing(
+                instructions=model_instructions,
+                toolbox=Toolbox.of(tools),
+                memory=conversation_memory,
+                input=conversation_message,
+                stream=stream,
+                **extra,
+            )
src/draive/conversation/completion/default.py (1)

111-120: Consider extracting the memory.remember error handling.

The error handling pattern around memory.remember is duplicated between the streaming and non-streaming paths.

Example refactor:

async def _remember_with_logging(memory: ModelMemory, *context: ModelContextElement) -> None:
    try:
        await memory.remember(*context)
    except Exception as exc:
        ctx.log_error(
            "Failed to remember conversation context",
            exception=exc,
        )
        raise  # bare raise re-raises the active exception unchanged

Then use:

await _remember_with_logging(memory, *context[len(memory_recall.context):])

Also applies to: 160-169

src/draive/conversation/realtime/state.py (1)

42-42: Unresolved: Type hint still inconsistent with runtime check.

The parameter type hint memory: ModelMemory | Iterable[ConversationMessage] (line 42) remains inconsistent with the runtime check isinstance(memory, Memory) (line 67), which accepts any Memory instance. This issue was flagged as critical in a previous review but has not been addressed.

Either:

  1. Broaden the type hint to memory: Memory | Iterable[ConversationMessage], or
  2. Narrow the runtime check to validate only ModelMemory instances

The current mismatch undermines type safety at this public API boundary.

As per coding guidelines on strict typing for public APIs.

Also applies to: 67-70

src/draive/generation/model/state.py (1)

67-91: Consider simplifying schema_injection recording.

The repeated ctx.record calls in each match branch were already flagged in a previous review. Consolidating to a single record after the match would reduce duplication.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d26b525 and beb2e5d.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (29)
  • pyproject.toml (2 hunks)
  • src/draive/cohere/embedding.py (4 hunks)
  • src/draive/conversation/completion/default.py (2 hunks)
  • src/draive/conversation/completion/state.py (2 hunks)
  • src/draive/conversation/realtime/state.py (2 hunks)
  • src/draive/evaluation/scenario.py (2 hunks)
  • src/draive/gemini/embedding.py (2 hunks)
  • src/draive/generation/audio/default.py (1 hunks)
  • src/draive/generation/audio/state.py (2 hunks)
  • src/draive/generation/image/default.py (1 hunks)
  • src/draive/generation/image/state.py (2 hunks)
  • src/draive/generation/model/default.py (1 hunks)
  • src/draive/generation/model/state.py (2 hunks)
  • src/draive/generation/text/default.py (1 hunks)
  • src/draive/generation/text/state.py (2 hunks)
  • src/draive/helpers/volatile_vector_index.py (4 hunks)
  • src/draive/mistral/completions.py (2 hunks)
  • src/draive/mistral/embedding.py (2 hunks)
  • src/draive/models/tools/function.py (4 hunks)
  • src/draive/ollama/embedding.py (2 hunks)
  • src/draive/openai/embedding.py (2 hunks)
  • src/draive/openai/realtime.py (1 hunks)
  • src/draive/postgres/templates.py (1 hunks)
  • src/draive/postgres/vector_index.py (5 hunks)
  • src/draive/stages/stage.py (6 hunks)
  • src/draive/utils/memory.py (4 hunks)
  • src/draive/utils/vector_index.py (7 hunks)
  • src/draive/vllm/embedding.py (2 hunks)
  • src/draive/vllm/messages.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use Python 3.12+ features and syntax across the codebase
Format code exclusively with Ruff (make format); do not use other formatters
Skip module-level docstrings

Files:

  • src/draive/cohere/embedding.py
  • src/draive/evaluation/scenario.py
  • src/draive/openai/realtime.py
  • src/draive/generation/image/default.py
  • src/draive/openai/embedding.py
  • src/draive/mistral/embedding.py
  • src/draive/gemini/embedding.py
  • src/draive/stages/stage.py
  • src/draive/vllm/embedding.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/text/state.py
  • src/draive/generation/audio/default.py
  • src/draive/generation/text/default.py
  • src/draive/generation/audio/state.py
  • src/draive/helpers/volatile_vector_index.py
  • src/draive/utils/memory.py
  • src/draive/generation/model/default.py
  • src/draive/postgres/vector_index.py
  • src/draive/vllm/messages.py
  • src/draive/ollama/embedding.py
  • src/draive/utils/vector_index.py
  • src/draive/conversation/realtime/state.py
  • src/draive/generation/image/state.py
  • src/draive/postgres/templates.py
  • src/draive/models/tools/function.py
  • src/draive/generation/model/state.py
  • src/draive/mistral/completions.py
  • src/draive/conversation/completion/default.py
src/draive/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/draive/**/*.py: Import Haiway symbols directly (from haiway import State, ctx)
Use ctx.scope(...) to bind scoped Disposables and active State; avoid global state
Route all logs through ctx.log_debug/info/warn/error; do not use print
Use latest, most strict typing syntax (Python 3.12+), with strict typing only for public APIs
Avoid loose Any except at explicit third‑party boundaries
Prefer explicit attribute access with static types; avoid dynamic getattr except at narrow boundaries
Prefer Mapping/Sequence/Iterable in public types over dict/list/set
Use final where applicable; avoid inheritance and prefer composition
Use precise unions (|) and narrow with match/isinstance; avoid cast unless provably safe and localized
Model immutable data/config and facades with haiway.State; provide ergonomic classmethods like .of(...)
Avoid in-place mutation; use State.updated(...) or functional builders to produce new instances
Access active state via haiway.ctx inside async scopes (ctx.scope(...))
Use @statemethod for public state methods that dispatch on the active instance
Log around generation calls, tool dispatch, and provider requests/responses without leaking secrets; prefer structured/concise messages
Add metrics via ctx.record where applicable
All I/O is async; keep boundaries async and use ctx.spawn for detached tasks
Use structured concurrency and valid coroutine usage; rely on haiway/asyncio; avoid custom threading
Construct multimodal content with MultimodalContent.of(...) and compose blocks explicitly
Use ResourceContent/ResourceReference for media/data blobs
Wrap custom types/data within ArtifactContent; use hidden when needed
Add NumPy-style docstrings for public symbols with Parameters/Returns/Raises and rationale when non-obvious
Avoid docstrings on internal helpers; keep names self-explanatory
Keep docstrings high-quality; mkdocstrings pulls them into API reference
Never log secrets or full request bodies containing keys/tokens

Files:

  • src/draive/cohere/embedding.py
  • src/draive/evaluation/scenario.py
  • src/draive/openai/realtime.py
  • src/draive/generation/image/default.py
  • src/draive/openai/embedding.py
  • src/draive/mistral/embedding.py
  • src/draive/gemini/embedding.py
  • src/draive/stages/stage.py
  • src/draive/vllm/embedding.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/text/state.py
  • src/draive/generation/audio/default.py
  • src/draive/generation/text/default.py
  • src/draive/generation/audio/state.py
  • src/draive/helpers/volatile_vector_index.py
  • src/draive/utils/memory.py
  • src/draive/generation/model/default.py
  • src/draive/postgres/vector_index.py
  • src/draive/vllm/messages.py
  • src/draive/ollama/embedding.py
  • src/draive/utils/vector_index.py
  • src/draive/conversation/realtime/state.py
  • src/draive/generation/image/state.py
  • src/draive/postgres/templates.py
  • src/draive/models/tools/function.py
  • src/draive/generation/model/state.py
  • src/draive/mistral/completions.py
  • src/draive/conversation/completion/default.py
src/draive/{openai,anthropic,mistral,gemini,vllm,ollama,bedrock,cohere}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/draive/{openai,anthropic,mistral,gemini,vllm,ollama,bedrock,cohere}/**/*.py: Provider-specific feature modules live under their respective provider directories
Translate provider/SDK errors into typed exceptions; do not raise bare Exception and preserve context
Use environment variables for credentials and resolve via helper functions like getenv_str

Files:

  • src/draive/cohere/embedding.py
  • src/draive/openai/realtime.py
  • src/draive/openai/embedding.py
  • src/draive/mistral/embedding.py
  • src/draive/gemini/embedding.py
  • src/draive/vllm/embedding.py
  • src/draive/vllm/messages.py
  • src/draive/ollama/embedding.py
  • src/draive/mistral/completions.py
src/draive/stages/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Implement pipeline stage abstractions and helpers under draive/stages/

Files:

  • src/draive/stages/stage.py
src/draive/conversation/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Implement higher-level chat/realtime conversations under draive/conversation/

Files:

  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/conversation/completion/default.py
{pyproject.toml,pyrightconfig.json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Ruff, Bandit, and Pyright (strict) via make lint

Files:

  • pyproject.toml
src/draive/utils/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Keep utilities (e.g., Memory, VectorIndex) under draive/utils/

Files:

  • src/draive/utils/memory.py
  • src/draive/utils/vector_index.py
src/draive/{httpx,mcp,postgres,opentelemetry}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place integrations under draive/httpx, draive/mcp, draive/postgres, draive/opentelemetry

Files:

  • src/draive/postgres/vector_index.py
  • src/draive/postgres/templates.py
src/draive/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Keep core abstractions (GenerativeModel, tools, instructions) under draive/models/

Files:

  • src/draive/models/tools/function.py
🧠 Learnings (20)
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/embedding/**/*.py : Keep vector ops, similarity, indexing, and typed embedding states under draive/embedding/

Applied to files:

  • src/draive/cohere/embedding.py
  • src/draive/openai/embedding.py
  • src/draive/mistral/embedding.py
  • src/draive/gemini/embedding.py
  • src/draive/vllm/embedding.py
  • src/draive/utils/vector_index.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Add metrics via ctx.record where applicable

Applied to files:

  • src/draive/evaluation/scenario.py
  • src/draive/openai/realtime.py
  • src/draive/utils/memory.py
  • src/draive/ollama/embedding.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/__init__.py : Update src/draive/__init__.py exports when API surface changes

Applied to files:

  • src/draive/generation/image/default.py
  • src/draive/utils/vector_index.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Import Haiway symbols directly (from haiway import State, ctx)

Applied to files:

  • src/draive/generation/image/default.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/text/state.py
  • src/draive/generation/audio/default.py
  • src/draive/generation/audio/state.py
  • src/draive/utils/memory.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Access active state via haiway.ctx inside async scopes (ctx.scope(...))

Applied to files:

  • src/draive/generation/image/default.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/text/state.py
  • src/draive/generation/audio/default.py
  • src/draive/generation/audio/state.py
  • src/draive/utils/memory.py
  • src/draive/generation/image/state.py
  • src/draive/models/tools/function.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/stages/**/*.py : Implement pipeline stage abstractions and helpers under draive/stages/

Applied to files:

  • src/draive/stages/stage.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/conversation/**/*.py : Implement higher-level chat/realtime conversations under draive/conversation/

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/conversation/completion/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use structured concurrency and valid coroutine usage; rely on haiway/asyncio; avoid custom threading

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/models/tools/function.py
📚 Learning: 2025-06-16T10:28:07.434Z
Learnt from: KaQuMiQ
Repo: miquido/draive PR: 338
File: src/draive/lmm/__init__.py:1-2
Timestamp: 2025-06-16T10:28:07.434Z
Learning: The draive project requires Python 3.12+ as specified in pyproject.toml with "requires-python = ">=3.12"" and uses Python 3.12+ specific features like PEP 695 type aliases and generic syntax extensively throughout the codebase.

Applied to files:

  • pyproject.toml
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Log around generation calls, tool dispatch, and provider requests/responses without leaking secrets; prefer structured/concise messages

Applied to files:

  • src/draive/generation/text/state.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/model/default.py
  • src/draive/vllm/messages.py
  • src/draive/generation/image/state.py
  • src/draive/models/tools/function.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use ctx.scope(...) to bind scoped Disposables and active State; avoid global state

Applied to files:

  • src/draive/generation/text/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Model immutable data/config and facades with haiway.State; provide ergonomic classmethods like .of(...)

Applied to files:

  • src/draive/generation/text/state.py
  • src/draive/generation/audio/state.py
  • src/draive/utils/memory.py
  • src/draive/generation/image/state.py
  • src/draive/models/tools/function.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use statemethod for public state methods that dispatch on the active instance

Applied to files:

  • src/draive/generation/audio/state.py
  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/models/**/*.py : Keep core abstractions (GenerativeModel, tools, instructions) under draive/models/

Applied to files:

  • src/draive/generation/model/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Construct multimodal content with MultimodalContent.of(...) and compose blocks explicitly

Applied to files:

  • src/draive/vllm/messages.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/utils/**/*.py : Keep utilities (e.g., Memory, VectorIndex) under draive/utils/

Applied to files:

  • src/draive/utils/vector_index.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to tests/**/*.py : Prefer scoping with ctx.scope(...) in async tests and bind required State instances explicitly

Applied to files:

  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Never log secrets or full request bodies containing keys/tokens

Applied to files:

  • src/draive/models/tools/function.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Avoid in-place mutation; use State.updated(...) or functional builders to produce new instances

Applied to files:

  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Prefer Mapping/Sequence/Iterable in public types over dict/list/set

Applied to files:

  • src/draive/generation/model/state.py
🔇 Additional comments (45)
src/draive/evaluation/scenario.py (2)

11-11: LGTM: Import addition for observability enhancement.

The ObservabilityLevel import is correctly added and necessary for the telemetry enhancement below.


402-412: Verify EvaluatorResult.evaluator attribute exists.

Line 410 accesses the evaluator attribute on EvaluatorResult objects within a list comprehension. This will raise an AttributeError at runtime if EvaluatorResult does not have an evaluator attribute.

Additionally, the ctx.record call is placed outside the ctx.scope block (which ends at line 400). While this may be intentional to record metrics at the parent scope level, verify this is the desired behavior.

src/draive/vllm/messages.py (2)

528-528: LGTM! Removed unnecessary dict() wrapper.

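The json.dumps() function serializes dict instances (and dict subclasses) directly, so the dict() wrapper is redundant whenever the value is already a dict at runtime. Note that a Mapping implementation that is not a dict would still need the conversion. With that caveat, the simplification is correct and improves code clarity.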
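A small standard-library check, independent of draive, of how json.dumps treats a dict versus a Mapping that is not a dict. The variable names are illustrative, and MappingProxyType merely stands in for any non-dict Mapping; whether such a type ever reaches this call site is not established here.

```python
import json
from types import MappingProxyType

# Illustrative payload; not taken from the reviewed code.
arguments = {"city": "Warsaw", "units": "metric"}

# A plain dict serializes without any wrapper.
print(json.dumps(arguments))

# A read-only Mapping view over the same data is not a dict,
# so the default JSON encoder rejects it.
read_only = MappingProxyType(arguments)
try:
    json.dumps(read_only)
except TypeError as exc:
    print(f"TypeError: {exc}")

# Converting back to a dict restores serializability.
print(json.dumps(dict(read_only)))
```

If the annotated type is a general Mapping, keeping dict(...) at the serialization boundary is the conservative choice.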


536-591: Verify that all MultimodalContentPart subtypes are handled.

The function now handles TextContent, ResourceReference, ResourceContent, and ArtifactContent explicitly. With no fallback case, any unmatched content types will be silently skipped. Please verify that these four cases cover all possible MultimodalContentPart subtypes, or confirm that silently skipping unknown types is the intended behavior.

Run the following script to find all MultimodalContentPart subtypes in the codebase:

src/draive/mistral/completions.py (2)

184-203: LGTM! Improved metric naming and observability granularity.

The changes standardize metric names from lmm.* to model.* and split the model attribute into separate model.provider and model.name attributes. This provides better consistency across the codebase and improves observability by allowing filtering/aggregation by provider and model independently.


442-442: Verify that block.arguments is always JSON-serializable.

The removal of the dict() wrapper simplifies the code. However, ensure that ModelToolRequest.arguments is always a plain dict or a type that json.dumps() can serialize directly. If it could be a custom Mapping type, the dict() conversion may have been necessary.

Run the following script to check the type definition of ModelToolRequest.arguments:

src/draive/openai/realtime.py (1)

109-118: LGTM: Observability level added to telemetry record.

Adding ObservabilityLevel.INFO makes the logging level explicit and aligns with the broader observability instrumentation pattern across the codebase.

Based on learnings

src/draive/generation/audio/state.py (1)

42-61: LGTM: Observability scope and template tracking added.

The scope wrapper and template identifier logging provide valuable telemetry for audio generation flows. This pattern is consistent with similar changes in text and image generation modules.

Based on learnings

src/draive/generation/text/state.py (1)

48-69: LGTM: Observability scope and template tracking added.

The scope wrapper and template identifier logging provide valuable telemetry for text generation flows, mirroring the pattern in audio and image generation modules.

Based on learnings

src/draive/vllm/embedding.py (2)

39-39: LGTM: Batch size added to telemetry context.

Including embedding.batch_size in the initial observability record provides useful context for understanding embedding behavior.


51-79: LGTM: Embedding metrics and early-exit guard added.

The new metrics (embedding.items and embedding.batches) provide valuable observability into embedding workload. The early exit for empty attributes (lines 64-65) prevents unnecessary processing and API calls. This pattern is consistent across all embedding implementations in this PR.

Based on learnings
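The batching pattern described above can be sketched with the standard library alone. The helper name plan_batches is hypothetical, and a plain dict stands in for the metrics that the reviewed code records through ctx.record. Whether the real implementation records metrics before or after the empty-input guard is not shown in this review, so the zero counts below are an assumption.

```python
from collections.abc import Sequence


def plan_batches(
    items: Sequence[str],
    batch_size: int,
) -> tuple[list[Sequence[str]], dict[str, int]]:
    if batch_size < 1:
        raise ValueError("batch_size must be positive")

    # Early exit: nothing to embed, so no batches and no provider call.
    if not items:
        return [], {"embedding.items": 0, "embedding.batches": 0}

    # Slice the input into consecutive chunks of at most batch_size items.
    batches = [
        items[start : start + batch_size]
        for start in range(0, len(items), batch_size)
    ]
    metrics = {
        "embedding.items": len(items),
        "embedding.batches": len(batches),
    }
    return batches, metrics


batches, metrics = plan_batches(["a", "b", "c", "d", "e"], batch_size=2)
print(batches, metrics)
```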

src/draive/openai/embedding.py (2)

39-42: LGTM: Batch size added to telemetry context.

Including embedding.batch_size in the initial observability record provides useful context for understanding embedding behavior, consistent with other providers.


52-80: LGTM: Embedding metrics and early-exit guard added.

The new metrics (embedding.items and embedding.batches) provide valuable observability into embedding workload. The early exit for empty attributes (lines 65-66) prevents unnecessary processing and API calls. This pattern is consistent across all embedding implementations in this PR.

Based on learnings

src/draive/generation/audio/default.py (1)

21-32: LGTM: Scope management moved to state layer.

The removal of the scope wrapper here is complementary to the scope addition in src/draive/generation/audio/state.py (lines 42-61). This refactoring centralizes observability instrumentation at the state layer, consistent with similar changes in image and text generation modules.

src/draive/mistral/embedding.py (2)

37-39: LGTM: Batch size added to telemetry context.

Including embedding.batch_size in the initial observability record provides useful context for understanding embedding behavior, maintaining consistency across all embedding providers.


49-77: LGTM: Embedding metrics and early-exit guard added.

The new metrics (embedding.items and embedding.batches) provide valuable observability into embedding workload. The early exit for empty attributes (lines 62-63) prevents unnecessary processing and API calls. This pattern is consistent across all embedding implementations in this PR.

Based on learnings

src/draive/generation/text/default.py (1)

24-43: Verify: Scope removal contradicts PR's observability improvements.

This change removes the ctx.scope("generate_text") wrapper, which reduces observability and traceability for this function. This contradicts the broader PR pattern where most modules add ObservabilityLevel instrumentation and scope-based logging (e.g., src/draive/gemini/embedding.py lines 32-41, src/draive/stages/stage.py lines 349-361, src/draive/utils/memory.py lines 86-89).

Was this removal intentional? If observability isn't needed here, please clarify the rationale. Otherwise, consider restoring the scope wrapper to maintain consistency with the PR's instrumentation goals.

src/draive/gemini/embedding.py (1)

39-79: LGTM! Observability instrumentation follows consistent patterns.

The additions of embedding.batch_size, embedding.items, and embedding.batches metrics provide valuable telemetry. The early-exit guard (lines 64-65) is a good optimization that avoids unnecessary API calls when the input is empty. This aligns with the project-wide observability enhancements.

pyproject.toml (1)

8-8: LGTM! Version bumps support new observability features.

The project version bump to 0.91.2 and the haiway dependency upgrade to 0.37.6 enable the ObservabilityLevel and ctx.record enhancements introduced throughout this PR.

Also applies to: 27-27

src/draive/cohere/embedding.py (2)

38-79: LGTM! Text embedding instrumentation is consistent.

The observability additions for text embeddings mirror the pattern used in other providers (e.g., src/draive/gemini/embedding.py). The metrics, early-exit guard, and attribute structure are appropriate.


133-173: LGTM! Image embedding instrumentation mirrors text embedding.

The observability additions for image embeddings follow the same pattern as text embeddings, with appropriate type="image" attributes. The implementation is consistent across both embedding types.

src/draive/utils/memory.py (1)

86-89: LGTM! Memory lifecycle instrumentation is complete and consistent.

The ObservabilityLevel.INFO instrumentation for memory.recall, memory.remember, and memory.maintenance provides appropriate telemetry for memory operations. This aligns with the PR's observability enhancements.

Also applies to: 113-116, 137-140

src/draive/ollama/embedding.py (1)

37-82: LGTM! Ollama embedding instrumentation provides appropriate detail.

The observability additions include both INFO-level key metrics (batch_size, concurrent) and a DEBUG-level full configuration dump via to_mapping(). The separation of logging levels is appropriate, and the pattern is consistent with other embedding providers while providing Ollama-specific details.

src/draive/stages/stage.py (5)

350-361: LGTM! Template instrumentation improves traceability.

The observability additions for instructions.template and input.template provide valuable traceability for template usage in the completion stage. The pattern is consistent across all completion methods.


459-463: LGTM! Consistent template instrumentation in prompting_completion.

The template logging follows the same pattern as the completion method, maintaining consistency across the codebase.


557-561: LGTM! Consistent template instrumentation in loopback_completion.

The template logging follows the same pattern, maintaining consistency.


629-633: LGTM! Consistent template instrumentation in result_completion.

The template logging follows the same pattern, maintaining consistency.


2105-2105: LGTM! Simplified routing by removing unnecessary template resolution.

Since instructions is constructed as a plain string (lines 2084-2090), the removal of await TemplatesRepository.resolve_str() is appropriate. Passing instructions directly to GenerativeModel.loop simplifies the code without changing functionality.

src/draive/helpers/volatile_vector_index.py (2)

49-49: LGTM!

The removal of the intermediate as_tuple conversion and direct iteration over values is cleaner and more efficient. Since Collection is already materialized, this change is functionally equivalent and improves readability.

Also applies to: 86-86


3-3: Verify Collection constraint doesn't break existing callers.

The parameter type changed from Iterable[Model] to Collection[Model], which is more restrictive (requires __len__ and __contains__). This means callers can no longer pass generators or lazy iterators.

Run the following script to verify all callers can satisfy the Collection constraint:

Also applies to: 6-6, 32-32
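To illustrate why the narrower annotation matters for callers, the following standard-library check compares a generator with a materialized tuple. The function name values is illustrative only.

```python
from collections.abc import Collection, Iterable


def values():
    yield from ("a", "b", "c")


generator = values()
materialized = tuple(values())

# A generator can be iterated but has no __len__ or __contains__.
print(isinstance(generator, Iterable))
print(isinstance(generator, Collection))

# A tuple is sized, iterable, and supports membership tests.
print(isinstance(materialized, Collection))
```

Callers that currently pass a generator would need to wrap it, for example with tuple(...), to satisfy a Collection parameter under strict type checking.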

src/draive/utils/vector_index.py (1)

95-102: LGTM!

The observability instrumentation is well-implemented with consistent patterns across all VectorIndex operations. The telemetry attributes are informative (model name, batch size, query/requirements presence) and use appropriate observability levels.

Also applies to: 172-180, 220-227

src/draive/postgres/vector_index.py (1)

3-3: LGTM!

The changes consistently apply the Collection-based pattern and remove as_tuple usage, aligning with the other VectorIndex implementations. The direct iteration over values is cleaner and more efficient.

Also applies to: 7-7, 77-77, 92-92, 120-120, 137-137

src/draive/generation/image/state.py (1)

42-61: LGTM!

The observability scope integration is well-implemented. Template tracking with ctx.record provides valuable telemetry for debugging, and the scope ensures all generation operations are properly traced.

src/draive/generation/image/default.py (1)

21-32: LGTM!

The removal of the local scope is correct—observability is now handled at the state layer (src/draive/generation/image/state.py line 42), providing better separation of concerns.

src/draive/conversation/completion/state.py (1)

91-141: LGTM!

The observability scope integration is well-implemented, with proper template tracking and memory normalization. The scope ensures all conversation completion operations are properly traced.

src/draive/generation/model/default.py (1)

31-68: LGTM!

The removal of the local scope follows the established pattern of handling observability at the state layer. The context construction and error handling logic remain functionally equivalent.

src/draive/conversation/completion/default.py (2)

89-123: LGTM!

The inlined memory recall and context construction is more direct and easier to follow. The error handling around memory.remember correctly propagates exceptions after logging.


133-171: LGTM!

The streaming path correctly inlines memory recall and context construction. The reasoning artifact wrapping (lines 146-154) properly differentiates reasoning chunks from regular content.

src/draive/generation/model/state.py (3)

4-4: LGTM: Proper haiway imports and scoping.

The direct imports from haiway and the use of ctx.scope("generate_model") align with the coding guidelines for managing state and observability.

Based on learnings and coding guidelines.

Also applies to: 61-61


62-65: LGTM: Well-structured observability records.

The observability records capture relevant metadata (model name, template identifiers) at the appropriate INFO level without leaking sensitive information.

Based on learnings.

Also applies to: 93-103


105-118: LGTM: Correct template resolution and delegation.

The template resolution is properly awaited, and the delegation to self.generating with the appropriate transformations (Toolbox.of, MultimodalContent.of, generator expression) is correct.

src/draive/conversation/realtime/state.py (4)

4-4: LGTM: Proper observability import and state scoping.

The ObservabilityLevel import and the use of ctx.scope with ctx.state follow the coding guidelines for managing scoped state.

Based on learnings and coding guidelines.

Also applies to: 62-63


65-86: LGTM: Memory handling logic is sound.

The dual-path memory handling correctly processes both Memory instances (via recall) and message iterables (via context element generation). The role-based message splitting and the use of ModelMemory.constant are appropriate.


87-91: LGTM: Template observability is consistent.

Recording the template identifier at INFO level provides useful tracing without exposing sensitive data.


93-107: LGTM: Correct template resolution with proper None handling.

The template resolution correctly handles the conditional memory_variables: when None, the entire dict comprehension is skipped via the if memory_variables else None guard, avoiding an AttributeError. The string conversion for non-string values is appropriate for template arguments.

@KaQuMiQ KaQuMiQ force-pushed the feature/metrics branch 2 times, most recently from 46218ed to e086ee2 on November 4, 2025 15:10
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (5)
src/draive/postgres/templates.py (1)

160-160: LGTM: Clean removal of redundant conversion.

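The direct use of json.dumps(variables) is correct as long as variables is a dict at runtime: json.dumps() serializes dict instances natively, whereas other Mapping[str, str] implementations would still require a conversion. This change was already reviewed and approved in a previous commit.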

src/draive/models/tools/function.py (1)

281-292: Avoid logging raw tool arguments/results.

Stringifying every argument and the returned payload leaks secrets/PII in debug deployments and reintroduces the exact risk previously flagged. Please only emit sanitized metadata (e.g., argument names/counts, result type) or drop these records entirely. As per coding guidelines.

-            if __debug__:
-                ctx.record(
-                    ObservabilityLevel.DEBUG,
-                    attributes={**{key: f"{arg}" for key, arg in arguments.items()}},
-                )
+            if __debug__ and arguments:
+                ctx.record(
+                    ObservabilityLevel.DEBUG,
+                    attributes={
+                        "arguments.count": len(arguments),
+                        "arguments.names": ",".join(arguments.keys()),
+                    },
+                )
@@
-                if __debug__:
-                    ctx.record(
-                        ObservabilityLevel.DEBUG,
-                        attributes={"result": format_str(result)},
-                    )
+                if __debug__:
+                    ctx.record(
+                        ObservabilityLevel.DEBUG,
+                        attributes={"result.type": type(result).__qualname__},
+                    )
src/draive/conversation/completion/state.py (1)

143-161: Consider the past review suggestion about consolidating stream branching.

A previous review noted that the stream/non-stream branches duplicate the call to ctx.state(cls).completing with only the stream parameter differing. While the current implementation is correct, consolidating these branches would reduce duplication.

src/draive/conversation/completion/default.py (1)

112-120: Past review suggestion about extracting memory.remember error handling remains valid.

The error handling pattern around memory.remember is duplicated between streaming and non-streaming paths. While localized, extracting this into a helper function could reduce duplication as previously suggested.

Also applies to: 160-169
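A possible shape for such a helper is sketched below. The name remember_or_raise is hypothetical, and stdlib logging stands in for ctx.log_error so the example runs without haiway; the log-then-propagate behaviour follows the description of the existing error handling.

```python
import asyncio
import logging
from collections.abc import Awaitable, Callable

logger = logging.getLogger("conversation.sketch")


async def remember_or_raise(
    remember: Callable[..., Awaitable[None]],
    *elements: object,
) -> None:
    try:
        await remember(*elements)
    except Exception as exc:
        # Log once here, then propagate so callers still see the failure.
        logger.error("Failed to remember conversation elements: %s", exc)
        raise


async def main() -> list[object]:
    stored: list[object] = []

    async def remember(*elements: object) -> None:
        stored.extend(elements)

    await remember_or_raise(remember, "input", "response")
    return stored


print(asyncio.run(main()))
```

Both the streaming and non-streaming paths could then call the same helper instead of repeating the try/except block.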

src/draive/conversation/realtime/state.py (1)

42-42: Type hint narrower than runtime check.

The parameter type hint declares memory: ModelMemory | Iterable[ConversationMessage] (line 42), but the runtime check uses isinstance(memory, Memory) (line 67), which accepts any Memory instance. Since ModelMemory is a type alias for Memory[ModelMemoryRecall, ModelContextElement], the type hint should be Memory | Iterable[ConversationMessage] to accurately reflect the runtime behavior.

Apply this diff to align the type hint with the runtime check:

-        memory: ModelMemory | Iterable[ConversationMessage] = (),
+        memory: Memory | Iterable[ConversationMessage] = (),

Also applies to: 67-67
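The mismatch exists because a runtime isinstance check can only target the unparametrized class. The sketch below demonstrates this with a hypothetical generic Memory class built on typing.Generic; it is a stand-in and does not reproduce the real haiway or draive definitions.

```python
from typing import Generic, TypeVar

Recall = TypeVar("Recall")
Element = TypeVar("Element")


class Memory(Generic[Recall, Element]):
    """Hypothetical stand-in for a generic memory container."""


# A parametrized alias, analogous in shape to ModelMemory in the comment above.
ModelMemory = Memory[int, str]

# An instance created with different type parameters.
other_memory = Memory[bytes, float]()

# The runtime check sees only the unparametrized class, so it passes.
print(isinstance(other_memory, Memory))

# Checking against the parametrized alias is rejected by the typing module.
try:
    isinstance(other_memory, ModelMemory)
except TypeError as exc:
    print(f"TypeError: {exc}")
```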

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between beb2e5d and e086ee2.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (29)
  • pyproject.toml (2 hunks)
  • src/draive/cohere/embedding.py (4 hunks)
  • src/draive/conversation/completion/default.py (2 hunks)
  • src/draive/conversation/completion/state.py (2 hunks)
  • src/draive/conversation/realtime/state.py (2 hunks)
  • src/draive/evaluation/scenario.py (2 hunks)
  • src/draive/gemini/embedding.py (2 hunks)
  • src/draive/generation/audio/default.py (1 hunks)
  • src/draive/generation/audio/state.py (2 hunks)
  • src/draive/generation/image/default.py (1 hunks)
  • src/draive/generation/image/state.py (2 hunks)
  • src/draive/generation/model/default.py (1 hunks)
  • src/draive/generation/model/state.py (2 hunks)
  • src/draive/generation/text/default.py (1 hunks)
  • src/draive/generation/text/state.py (2 hunks)
  • src/draive/helpers/volatile_vector_index.py (4 hunks)
  • src/draive/mistral/completions.py (2 hunks)
  • src/draive/mistral/embedding.py (2 hunks)
  • src/draive/models/tools/function.py (4 hunks)
  • src/draive/ollama/embedding.py (2 hunks)
  • src/draive/openai/embedding.py (2 hunks)
  • src/draive/openai/realtime.py (1 hunks)
  • src/draive/postgres/templates.py (1 hunks)
  • src/draive/postgres/vector_index.py (5 hunks)
  • src/draive/stages/stage.py (6 hunks)
  • src/draive/utils/memory.py (4 hunks)
  • src/draive/utils/vector_index.py (7 hunks)
  • src/draive/vllm/embedding.py (2 hunks)
  • src/draive/vllm/messages.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use Python 3.12+ features and syntax across the codebase
Format code exclusively with Ruff (make format); do not use other formatters
Skip module-level docstrings

Files:

  • src/draive/stages/stage.py
  • src/draive/generation/model/default.py
  • src/draive/generation/audio/state.py
  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/generation/image/default.py
  • src/draive/generation/image/state.py
  • src/draive/openai/embedding.py
  • src/draive/vllm/messages.py
  • src/draive/cohere/embedding.py
  • src/draive/vllm/embedding.py
  • src/draive/utils/vector_index.py
  • src/draive/openai/realtime.py
  • src/draive/mistral/completions.py
  • src/draive/gemini/embedding.py
  • src/draive/generation/text/default.py
  • src/draive/conversation/completion/default.py
  • src/draive/generation/model/state.py
  • src/draive/generation/text/state.py
  • src/draive/helpers/volatile_vector_index.py
  • src/draive/postgres/templates.py
  • src/draive/utils/memory.py
  • src/draive/postgres/vector_index.py
  • src/draive/models/tools/function.py
  • src/draive/ollama/embedding.py
  • src/draive/mistral/embedding.py
  • src/draive/evaluation/scenario.py
  • src/draive/generation/audio/default.py
src/draive/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/draive/**/*.py: Import Haiway symbols directly (from haiway import State, ctx)
Use ctx.scope(...) to bind scoped Disposables and active State; avoid global state
Route all logs through ctx.log_debug/info/warn/error; do not use print
Use latest, most strict typing syntax (Python 3.12+), with strict typing only for public APIs
Avoid loose Any except at explicit third‑party boundaries
Prefer explicit attribute access with static types; avoid dynamic getattr except at narrow boundaries
Prefer Mapping/Sequence/Iterable in public types over dict/list/set
Use final where applicable; avoid inheritance and prefer composition
Use precise unions (|) and narrow with match/isinstance; avoid cast unless provably safe and localized
Model immutable data/config and facades with haiway.State; provide ergonomic classmethods like .of(...)
Avoid in-place mutation; use State.updated(...) or functional builders to produce new instances
Access active state via haiway.ctx inside async scopes (ctx.scope(...))
Use @statemethod for public state methods that dispatch on the active instance
Log around generation calls, tool dispatch, and provider requests/responses without leaking secrets; prefer structured/concise messages
Add metrics via ctx.record where applicable
All I/O is async; keep boundaries async and use ctx.spawn for detached tasks
Use structured concurrency and valid coroutine usage; rely on haiway/asyncio; avoid custom threading
Construct multimodal content with MultimodalContent.of(...) and compose blocks explicitly
Use ResourceContent/ResourceReference for media/data blobs
Wrap custom types/data within ArtifactContent; use hidden when needed
Add NumPy-style docstrings for public symbols with Parameters/Returns/Raises and rationale when non-obvious
Avoid docstrings on internal helpers; keep names self-explanatory
Keep docstrings high-quality; mkdocstrings pulls them into API reference
Never log secrets or full request bodies containing keys/tokens

Files:

  • src/draive/stages/stage.py
  • src/draive/generation/model/default.py
  • src/draive/generation/audio/state.py
  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/generation/image/default.py
  • src/draive/generation/image/state.py
  • src/draive/openai/embedding.py
  • src/draive/vllm/messages.py
  • src/draive/cohere/embedding.py
  • src/draive/vllm/embedding.py
  • src/draive/utils/vector_index.py
  • src/draive/openai/realtime.py
  • src/draive/mistral/completions.py
  • src/draive/gemini/embedding.py
  • src/draive/generation/text/default.py
  • src/draive/conversation/completion/default.py
  • src/draive/generation/model/state.py
  • src/draive/generation/text/state.py
  • src/draive/helpers/volatile_vector_index.py
  • src/draive/postgres/templates.py
  • src/draive/utils/memory.py
  • src/draive/postgres/vector_index.py
  • src/draive/models/tools/function.py
  • src/draive/ollama/embedding.py
  • src/draive/mistral/embedding.py
  • src/draive/evaluation/scenario.py
  • src/draive/generation/audio/default.py
src/draive/stages/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Implement pipeline stage abstractions and helpers under draive/stages/

Files:

  • src/draive/stages/stage.py
src/draive/conversation/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Implement higher-level chat/realtime conversations under draive/conversation/

Files:

  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/conversation/completion/default.py
src/draive/{openai,anthropic,mistral,gemini,vllm,ollama,bedrock,cohere}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/draive/{openai,anthropic,mistral,gemini,vllm,ollama,bedrock,cohere}/**/*.py: Provider-specific feature modules live under their respective provider directories
Translate provider/SDK errors into typed exceptions; do not raise bare Exception and preserve context
Use environment variables for credentials and resolve via helper functions like getenv_str

Files:

  • src/draive/openai/embedding.py
  • src/draive/vllm/messages.py
  • src/draive/cohere/embedding.py
  • src/draive/vllm/embedding.py
  • src/draive/openai/realtime.py
  • src/draive/mistral/completions.py
  • src/draive/gemini/embedding.py
  • src/draive/ollama/embedding.py
  • src/draive/mistral/embedding.py
src/draive/utils/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Keep utilities (e.g., Memory, VectorIndex) under draive/utils/

Files:

  • src/draive/utils/vector_index.py
  • src/draive/utils/memory.py
src/draive/{httpx,mcp,postgres,opentelemetry}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place integrations under draive/httpx, draive/mcp, draive/postgres, draive/opentelemetry

Files:

  • src/draive/postgres/templates.py
  • src/draive/postgres/vector_index.py
src/draive/models/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Keep core abstractions (GenerativeModel, tools, instructions) under draive/models/

Files:

  • src/draive/models/tools/function.py
{pyproject.toml,pyrightconfig.json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Ruff, Bandit, and Pyright (strict) via make lint

Files:

  • pyproject.toml
🧠 Learnings (20)
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/stages/**/*.py : Implement pipeline stage abstractions and helpers under draive/stages/

Applied to files:

  • src/draive/stages/stage.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/models/**/*.py : Keep core abstractions (GenerativeModel, tools, instructions) under draive/models/

Applied to files:

  • src/draive/generation/model/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Log around generation calls, tool dispatch, and provider requests/responses without leaking secrets; prefer structured/concise messages

Applied to files:

  • src/draive/generation/model/default.py
  • src/draive/generation/audio/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
  • src/draive/generation/text/state.py
  • src/draive/models/tools/function.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use ctx.scope(...) to bind scoped Disposables and active State; avoid global state

Applied to files:

  • src/draive/generation/audio/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
  • src/draive/generation/text/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Import Haiway symbols directly (from haiway import State, ctx)

Applied to files:

  • src/draive/generation/audio/state.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/image/default.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
  • src/draive/generation/text/state.py
  • src/draive/utils/memory.py
  • src/draive/generation/audio/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Access active state via haiway.ctx inside async scopes (ctx.scope(...))

Applied to files:

  • src/draive/generation/audio/state.py
  • src/draive/conversation/completion/state.py
  • src/draive/generation/image/default.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
  • src/draive/generation/text/state.py
  • src/draive/utils/memory.py
  • src/draive/models/tools/function.py
  • src/draive/generation/audio/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Model immutable data/config and facades with haiway.State; provide ergonomic classmethods like .of(...)

Applied to files:

  • src/draive/generation/audio/state.py
  • src/draive/generation/image/state.py
  • src/draive/generation/model/state.py
  • src/draive/generation/text/state.py
  • src/draive/utils/memory.py
  • src/draive/models/tools/function.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use statemethod for public state methods that dispatch on the active instance

Applied to files:

  • src/draive/generation/audio/state.py
  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/conversation/**/*.py : Implement higher-level chat/realtime conversations under draive/conversation/

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/conversation/realtime/state.py
  • src/draive/conversation/completion/default.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Use structured concurrency and valid coroutine usage; rely on haiway/asyncio; avoid custom threading

Applied to files:

  • src/draive/conversation/completion/state.py
  • src/draive/models/tools/function.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/__init__.py : Update src/draive/__init__.py exports when API surface changes

Applied to files:

  • src/draive/generation/image/default.py
  • src/draive/utils/vector_index.py
  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to tests/**/*.py : Prefer scoping with ctx.scope(...) in async tests and bind required State instances explicitly

Applied to files:

  • src/draive/generation/image/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/embedding/**/*.py : Keep vector ops, similarity, indexing, and typed embedding states under draive/embedding/

Applied to files:

  • src/draive/openai/embedding.py
  • src/draive/cohere/embedding.py
  • src/draive/vllm/embedding.py
  • src/draive/utils/vector_index.py
  • src/draive/gemini/embedding.py
  • src/draive/ollama/embedding.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Construct multimodal content with MultimodalContent.of(...) and compose blocks explicitly

Applied to files:

  • src/draive/vllm/messages.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/utils/**/*.py : Keep utilities (e.g., Memory, VectorIndex) under draive/utils/

Applied to files:

  • src/draive/utils/vector_index.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Add metrics via ctx.record where applicable

Applied to files:

  • src/draive/openai/realtime.py
  • src/draive/utils/memory.py
  • src/draive/ollama/embedding.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Avoid in-place mutation; use State.updated(...) or functional builders to produce new instances

Applied to files:

  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Prefer Mapping/Sequence/Iterable in public types over dict/list/set

Applied to files:

  • src/draive/generation/model/state.py
📚 Learning: 2025-10-03T08:51:45.502Z
Learnt from: CR
Repo: miquido/draive PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-03T08:51:45.502Z
Learning: Applies to src/draive/**/*.py : Never log secrets or full request bodies containing keys/tokens

Applied to files:

  • src/draive/models/tools/function.py
📚 Learning: 2025-06-16T10:28:07.434Z
Learnt from: KaQuMiQ
Repo: miquido/draive PR: 338
File: src/draive/lmm/__init__.py:1-2
Timestamp: 2025-06-16T10:28:07.434Z
Learning: The draive project requires Python 3.12+, as specified in pyproject.toml (requires-python = ">=3.12"), and uses Python 3.12+ specific features like PEP 695 type aliases and generic syntax extensively throughout the codebase.

Applied to files:

  • pyproject.toml
🧬 Code graph analysis (16)
src/draive/stages/stage.py (1)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/generation/model/default.py (4)
src/draive/models/types.py (4)
  • ModelOutput (573-672)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
src/draive/models/generative.py (5)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/multimodal/content.py (8)
  • of (41-65)
  • of (618-646)
  • MultimodalContent (24-591)
  • artifacts (207-212)
  • artifacts (215-221)
  • artifacts (223-272)
  • to_str (287-296)
  • to_str (652-665)
src/draive/parameters/model.py (3)
  • to_json (420-436)
  • from_json (373-388)
  • to_str (459-460)
src/draive/generation/audio/state.py (1)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/conversation/completion/state.py (5)
src/draive/conversation/types.py (5)
  • ConversationMessage (139-200)
  • user (157-172)
  • of (33-52)
  • of (78-89)
  • of (117-130)
src/draive/multimodal/templates/repository.py (6)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
src/draive/utils/memory.py (5)
  • constant (54-66)
  • Memory (52-145)
  • recall (70-73)
  • recall (76-79)
  • recall (82-90)
src/draive/multimodal/templates/types.py (2)
  • of (56-84)
  • Template (39-139)
src/draive/models/tools/toolbox.py (1)
  • Toolbox (20-467)
src/draive/conversation/realtime/state.py (4)
src/draive/utils/memory.py (5)
  • Memory (52-145)
  • recall (70-73)
  • recall (76-79)
  • recall (82-90)
  • constant (54-66)
src/draive/models/types.py (3)
  • ModelMemoryRecall (719-766)
  • ModelInput (429-501)
  • ModelOutput (573-672)
src/draive/conversation/completion/state.py (1)
  • model_context_elements (114-120)
src/draive/multimodal/templates/repository.py (4)
  • TemplatesRepository (61-435)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
src/draive/generation/image/default.py (3)
src/draive/models/types.py (4)
  • ModelOutput (573-672)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
src/draive/models/generative.py (5)
  • completion (62-71)
  • completion (74-83)
  • completion (87-96)
  • completion (99-108)
  • completion (111-182)
src/draive/multimodal/content.py (3)
  • of (41-65)
  • of (618-646)
  • images (143-158)
src/draive/generation/image/state.py (2)
src/draive/multimodal/templates/types.py (1)
  • Template (39-139)
src/draive/multimodal/templates/repository.py (7)
  • TemplatesRepository (61-435)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/vllm/messages.py (1)
src/draive/parameters/function.py (1)
  • arguments (101-102)
src/draive/mistral/completions.py (2)
src/draive/conversation/types.py (1)
  • model (175-190)
src/draive/parameters/function.py (1)
  • arguments (101-102)
src/draive/generation/text/default.py (3)
src/draive/models/types.py (2)
  • ModelOutput (573-672)
  • ModelInput (429-501)
src/draive/models/generative.py (5)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/multimodal/content.py (5)
  • of (41-65)
  • of (618-646)
  • MultimodalContent (24-591)
  • to_str (287-296)
  • to_str (652-665)
src/draive/conversation/completion/default.py (6)
src/draive/models/types.py (8)
  • ModelMemoryRecall (719-766)
  • ModelInput (429-501)
  • content (474-478)
  • content (618-622)
  • ModelOutput (573-672)
  • content_with_reasoning (625-646)
  • ModelReasoning (518-562)
  • ModelToolRequest (311-353)
src/draive/utils/memory.py (6)
  • recall (70-73)
  • recall (76-79)
  • recall (82-90)
  • remember (94-98)
  • remember (101-105)
  • remember (108-117)
src/draive/postgres/memory.py (2)
  • recall (68-89)
  • remember (91-145)
src/draive/conversation/types.py (6)
  • of (33-52)
  • of (78-89)
  • of (117-130)
  • ConversationMessage (139-200)
  • model (175-190)
  • ConversationOutputChunk (64-97)
src/draive/models/generative.py (5)
  • loop (186-195)
  • loop (198-207)
  • loop (211-220)
  • loop (223-232)
  • loop (235-307)
src/draive/multimodal/artifact.py (1)
  • ArtifactContent (11-96)
src/draive/generation/model/state.py (5)
src/draive/parameters/model.py (2)
  • json_schema (362-370)
  • simplified_schema (352-359)
src/draive/parameters/schema.py (1)
  • simplified_schema (9-26)
src/draive/multimodal/templates/types.py (3)
  • Template (39-139)
  • of (56-84)
  • of (162-194)
src/draive/multimodal/templates/repository.py (6)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/models/tools/toolbox.py (1)
  • Toolbox (20-467)
src/draive/generation/text/state.py (4)
src/draive/multimodal/templates/types.py (3)
  • Template (39-139)
  • of (56-84)
  • of (162-194)
src/draive/multimodal/templates/repository.py (6)
  • resolve_str (256-264)
  • resolve_str (267-275)
  • resolve_str (278-343)
  • resolve (169-177)
  • resolve (180-188)
  • resolve (191-252)
src/draive/models/tools/toolbox.py (1)
  • Toolbox (20-467)
src/draive/multimodal/content.py (3)
  • of (41-65)
  • of (618-646)
  • MultimodalContent (24-591)
src/draive/ollama/embedding.py (1)
src/draive/parameters/model.py (1)
  • to_mapping (462-478)
src/draive/evaluation/scenario.py (1)
src/draive/evaluation/evaluator.py (5)
  • performance (160-172)
  • passed (113-122)
  • evaluator (731-734)
  • evaluator (738-747)
  • evaluator (750-811)
src/draive/generation/audio/default.py (2)
src/draive/models/generative.py (5)
  • completion (62-71)
  • completion (74-83)
  • completion (87-96)
  • completion (99-108)
  • completion (111-182)
src/draive/multimodal/content.py (3)
  • of (41-65)
  • of (618-646)
  • audio (160-175)
🔇 Additional comments (33)
src/draive/mistral/completions.py (2)

184-203: LGTM! Metric standardization aligns with non-streaming path.

The metric names and attributes now match the format already used in _record_usage_metrics (lines 551-570), providing consistent observability across streaming and non-streaming completion paths.


442-442: LGTM! Redundant dict() conversion removed.

The dict() wrapper was unnecessary since block.arguments is already a dictionary. This change improves clarity and aligns with the direct usage pattern seen elsewhere in the file (lines 297, 591).

src/draive/generation/audio/default.py (1)

29-32: Result processing logic is correct.

The pattern of iterating over audio resources and returning the first match, followed by raising a descriptive error if none are found, is appropriate.
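
The first-match-or-raise pattern described above can be sketched as follows. This is a minimal illustration, not the code from default.py: the Resource class, its mime_type field, and the first_audio helper are hypothetical stand-ins for draive's content types.

```python
from collections.abc import Iterable
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    # Hypothetical stand-in for a multimodal content part; not draive's type.
    mime_type: str
    data: bytes


def first_audio(parts: Iterable[Resource]) -> Resource:
    """Return the first audio resource, or fail with a descriptive error."""
    for part in parts:
        if part.mime_type.startswith("audio/"):
            return part  # stop at the first match, later parts are ignored

    # Reached only when the loop found no audio part at all.
    raise ValueError("Model output did not contain any audio resource")
```

Raising after the loop keeps the happy path short and makes the failure explicit instead of returning None.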

src/draive/vllm/messages.py (2)

528-528: LGTM! Removed redundant dict() wrapper.

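The dict() wrapper was unnecessary as long as the value is a dict at runtime, which json.dumps() serializes directly. Other Mapping[str, Any] implementations would still need an explicit conversion.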
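
One caveat is worth checking when dropping such a wrapper: the standard library encoder accepts dict instances but not arbitrary Mapping implementations. The sketch below uses only the standard library; the arguments value is made up for illustration.

```python
import json
from types import MappingProxyType

arguments = {"city": "Warsaw", "units": "metric"}

# A plain dict serializes without any wrapper.
encoded = json.dumps(arguments)

# A read-only Mapping view is not a dict, so the default encoder rejects it.
view = MappingProxyType(arguments)
try:
    json.dumps(view)
    view_serializable = True
except TypeError:
    view_serializable = False

# An explicit conversion is still required for non-dict mappings.
encoded_view = json.dumps(dict(view))
```

If the annotated type is Mapping but every caller passes a dict, removing dict() is safe in practice; the annotation alone does not guarantee it.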


536-591: The match statement is exhaustive and complete as-is.

MultimodalContentPart is defined as a type union: TextContent | ResourceReference | ResourceContent | ArtifactContent. The content_parts function handles all four members of this union with explicit match cases. Since the type is a closed union rather than an open base class, a static type checker can verify that every member is handled.

There is no need for a default case or a separate exhaustiveness check as long as the code is type-checked. Note that Python does not enforce annotations at runtime, so this guarantee comes from the type checker, not the interpreter.

Likely an incorrect or invalid review comment.

src/draive/openai/realtime.py (1)

109-117: LGTM! Observability instrumentation correctly applied.

The addition of ObservabilityLevel.INFO to the ctx.record call properly instruments the realtime session initialization with the appropriate telemetry level. The attributes remain unchanged and provide valuable context for session tracking.

src/draive/evaluation/scenario.py (1)

402-412: LGTM! Performance metric instrumentation is well-structured.

The ctx.record call correctly includes ObservabilityLevel.INFO and records a histogram metric for scenario performance. The attributes (passed and evaluators) provide useful dimensions for analyzing evaluation results.

src/draive/utils/memory.py (1)

86-89: LGTM! Memory instrumentation properly implemented.

The ctx.record calls with ObservabilityLevel.INFO for memory lifecycle events (recall, remember, maintenance) provide consistent observability across all memory operations. This addresses the previous review feedback effectively.

src/draive/mistral/embedding.py (2)

49-63: LGTM! Items metric and early-exit guard are well-implemented.

The embedding.items counter provides useful telemetry on the volume of text being embedded. The early exit on empty attributes is defensive programming that prevents unnecessary API calls and potential edge cases in batch processing.


65-77: LGTM! Batch metric calculation is correct.

The embedding.batches counter uses proper ceiling division (len(attributes) + batch_size - 1) // batch_size to accurately reflect the number of API calls. This metric, combined with embedding.items, enables monitoring of batching efficiency.
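
The ceiling-division formula can be checked against an actual chunking loop. This is a standalone sketch: batch_count and batches are hypothetical helpers, not functions from the embedding module.

```python
from collections.abc import Sequence


def batch_count(items: Sequence[str], batch_size: int) -> int:
    """Number of requests needed to send items in chunks of batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Integer ceiling division, the same formula quoted in the review.
    return (len(items) + batch_size - 1) // batch_size


def batches(items: Sequence[str], batch_size: int) -> list[Sequence[str]]:
    """Split items into consecutive chunks; the last one may be shorter."""
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]
```

With 10 items and a batch size of 4 both helpers agree on 3 batches, and an empty input yields 0, which is consistent with the early-exit guard.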

src/draive/generation/audio/state.py (1)

42-61: LGTM! Audio generation observability scope properly implemented.

The ctx.scope("generate_audio") wrapper provides proper context for all audio generation operations. Recording template identifiers when instructions or input are Template instances enables tracking of template usage patterns without affecting the generation logic.

src/draive/generation/image/state.py (1)

42-61: LGTM! Image generation observability scope properly implemented.

The ctx.scope("generate_image") wrapper at the state layer provides proper context for all image generation operations. This architectural placement (versus the previous location in default.py) prevents double-scoping and ensures consistent observability boundaries across the generation flow.

src/draive/generation/image/default.py (1)

21-31: LGTM! Scope refactoring correctly implemented.

Removing the ctx.scope("generate_image") wrapper from this implementation layer is correct, as the scope is now managed at the higher-level state layer (src/draive/generation/image/state.py lines 42-61). This architectural change prevents double-scoping and maintains cleaner separation between observability concerns and generation logic.

pyproject.toml (2)

8-8: LGTM! Version bump appropriate for observability enhancements.

The patch version increment from 0.91.1 to 0.91.2 is appropriate for this PR's observability instrumentation changes, which add telemetry without breaking existing APIs.


27-27: LGTM! Haiway dependency update supports new observability features.

The bump from haiway~=0.37.3 to haiway~=0.37.6 aligns with the widespread use of ObservabilityLevel throughout this PR, indicating that the updated haiway version provides the required observability infrastructure.

src/draive/generation/text/state.py (1)

48-69: LGTM! Observability instrumentation properly added.

The observability scope and template logging follow the project's established patterns and align with the PR objectives.

Based on learnings

src/draive/vllm/embedding.py (1)

39-79: LGTM! Comprehensive embedding observability added.

The batch_size attribute, items/batches metrics, and early-empty-input guard provide complete observability for the embedding pipeline and align with patterns in other provider implementations.

src/draive/openai/embedding.py (1)

39-80: LGTM! Consistent observability pattern applied.

The embedding instrumentation mirrors the implementation in other providers (vllm, gemini, etc.), ensuring uniform observability across the codebase.

src/draive/generation/text/default.py (1)

24-43: LGTM! Scope management correctly refactored.

The observability scope was moved to the caller in state.py, which provides better separation of concerns while maintaining the same observability coverage.

src/draive/gemini/embedding.py (1)

39-79: LGTM! Consistent provider instrumentation.

The observability additions follow the same pattern as OpenAI and vLLM providers, maintaining consistency across the embedding implementations.

src/draive/conversation/completion/state.py (1)

91-141: LGTM! Observability scope and template logging properly integrated.

The conversation completion flow is now correctly wrapped in an observability scope, and template identifier logging provides useful instrumentation when templates are used.

Based on learnings

src/draive/conversation/completion/default.py (1)

89-123: LGTM! Refactored flow with proper error propagation.

The inlined memory recall and context construction, combined with explicit exception propagation after logging, provides clear execution flow and proper error visibility.

src/draive/stages/stage.py (2)

350-360: LGTM! Consistent template logging across completion stages.

Template identifier logging is properly added across all completion methods (completion, prompting_completion, loopback_completion, result_completion), providing uniform observability for template usage.

Also applies to: 459-463, 557-561, 629-633


2075-2108: LGTM! Simplified model routing call.

The removal of await TemplatesRepository.resolve_str(instructions) is correct since instructions is already constructed as a plain string (lines 2084-2090), making the resolution call unnecessary.

src/draive/utils/vector_index.py (3)

1-11: LGTM: Import changes support telemetry feature.

The addition of ObservabilityLevel and ctx from haiway, along with the change from Iterable to Collection, appropriately supports the telemetry instrumentation added throughout the file.


95-102: LGTM: Telemetry instrumentation follows best practices.

The telemetry calls use ObservabilityLevel.INFO consistently and log appropriate metadata (model name, value counts, query/requirements presence) without leaking sensitive data. The placement before delegation to the underlying methods ensures all operations are tracked.

As per coding guidelines

Also applies to: 172-180, 220-227


28-28: Breaking API change verified—Collection type requirement is intentional and justified.

The parameter type has been narrowed from Iterable[Model] to Collection[Model] consistently across the protocol (line 28), overloads (lines 70, 81), and implementation (line 92). This change is justified by the len(values) call in telemetry (line 100) and properly enforced in both PostgresVectorIndex and VolatileVectorIndex.

No callers passing iterators or generators were found in the visible codebase. However, any external code or tests that pass generators/iterators to VectorIndex.index() will now fail at type-check time.

Verify that all callers in your test suite and downstream code pass actual collections (lists, tuples, sets) rather than generators or other iterables.

Also applies to: 70-70, 81-81, 92-92
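
The practical difference between Iterable and Collection can be shown with the standard library alone. In this sketch, count_then_consume and is_collection are hypothetical helpers; they only mirror the len(values) usage mentioned in the comment.

```python
from collections.abc import Collection, Iterable


def count_then_consume(values: Collection[str]) -> tuple[int, list[str]]:
    # len() is part of the Collection contract (Sized + Iterable + Container).
    size = len(values)
    # A collection can still be iterated after being measured.
    return size, [value.upper() for value in values]


def is_collection(values: Iterable[str]) -> bool:
    # Runtime check; generators are Iterable but not Collection.
    return isinstance(values, Collection)
```

A caller holding a generator can adapt by materializing it first, for example with tuple(generator), before passing it on.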

src/draive/helpers/volatile_vector_index.py (2)

1-6: LGTM: Signature change aligns with VectorIndex protocol update.

The change from Iterable[Model] to Collection[Model] in the signature (line 32) and the removal of as_tuple import align with the broader API update in src/draive/utils/vector_index.py. The direct iteration over values (lines 49, 86) is appropriate for collections.

Also applies to: 32-32


49-49: LGTM: Direct iteration on values is correct.

The change from iterating over values_sequence to iterating directly over values is correct and simplifies the code by removing the intermediate as_tuple conversion. The zip calls at lines 85-89 correctly use values instead of the removed values_sequence.

Also applies to: 85-90

src/draive/conversation/realtime/state.py (2)

62-86: LGTM: Scoped observability and memory handling refactor.

The wrapping of the preparation logic inside ctx.scope("conversation_realtime") follows the coding guidelines for async boundaries. The memory handling refactor correctly distinguishes between Memory instances (recalling existing memory with variables) and iterables (constructing new memory context from messages). The generator logic properly maps user messages to ModelInput and other messages to ModelOutput, consistent with the pattern in src/draive/conversation/completion/state.py.

As per coding guidelines


93-107: LGTM: Template resolution and preparation call.

The inlined preparation call correctly resolves templates using TemplatesRepository.resolve_str with memory variables converted to strings. The conditional handling of memory_variables (lines 96-101) appropriately passes None when variables are absent, and the string conversion logic safely handles both string and non-string values.

src/draive/postgres/vector_index.py (2)

1-7: LGTM: Signature change aligns with VectorIndex protocol update.

The change from Iterable[Model] to Collection[Model] (line 77) and the removal of the as_tuple import align with the broader API update in src/draive/utils/vector_index.py. This allows direct iteration over the provided collection without intermediate conversion.

Also applies to: 77-77


92-92: LGTM: Direct iteration and zip usage on values is correct.

The changes from iterating/zipping value_models to directly using values (lines 92, 120, 137) correctly eliminate the intermediate as_tuple conversion. Both the TextEmbedding path (lines 115-123) and ImageEmbedding path (lines 132-140) properly zip the embedded results with the original values collection.

Also applies to: 115-123, 132-140

@KaQuMiQ KaQuMiQ merged commit 3bb5090 into main Nov 4, 2025
2 of 3 checks passed
@KaQuMiQ KaQuMiQ deleted the feature/metrics branch November 4, 2025 15:32