Merged
2 changes: 1 addition & 1 deletion Makefile
@@ -81,7 +81,7 @@ lint:
# Format Markdown docs and README
docs-format:
@echo '# Formatting Markdown...'
- @uv run mdformat --wrap 100 README.md docs
+ @uv run mdformat README.md docs
@echo '...finished!'

# Lint Markdown docs and README
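
For context on the `docs-format` change: hard-wrapping at a fixed column can split a line that has to stay on one physical line, such as the image tags with attribute lists in `docs/getting-started/multimodal-data.md` further down this diff. The sketch below reproduces that effect with Python's standard `textwrap` module as a stand-in. It is not mdformat's wrapping code, and `IMAGE_LINE` and `wrap_at` are names made up for this illustration.

```python
import textwrap

# One-line image tag with an attribute list, copied from
# docs/getting-started/multimodal-data.md in this diff.
IMAGE_LINE = (
    "![ModelInput class relationships](../diagrams/out/ModelInput.svg)"
    '{style="height:400px; margin: auto; display: block;"}'
)


def wrap_at(line: str, width: int) -> list[str]:
    """Hard-wrap one line at `width` columns (a stand-in for a fixed wrap width)."""
    return textwrap.wrap(line, width=width)


# The tag is longer than 100 columns, so a fixed width of 100 splits it in two.
wrapped = wrap_at(IMAGE_LINE, 100)
for piece in wrapped:
    print(piece)
```

Without a fixed width, mdformat (as far as I know) keeps existing line breaks, so such a tag stays on the single line the author wrote.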
14 changes: 7 additions & 7 deletions README.md
@@ -67,17 +67,17 @@ documents, handling audio or images — Draive has your back.
### Why you'll like it

- **Instruction Optimization**: Draive gives you clean ways to write and refine prompts, including
metaprompts, instruction helpers, and optimizers. You can go from raw prompt text to a reusable,
structured config in no time.
metaprompts, instruction helpers, and optimizers. You can go from raw prompt text to a reusable,
structured config in no time.
- **Composable Workflows**: Build modular flows using Stages and Tools. Every piece is reusable,
testable, and fits together seamlessly.
testable, and fits together seamlessly.
- **Tooling = Just Python**: Define a tool by writing a function. Annotate it. That’s it. Draive
handles the rest — serialization, context, and integration with LLMs.
handles the rest — serialization, context, and integration with LLMs.
- **Structured Outputs** - use Python classes for JSON outputs and flexible multimodal XML parser
for custom results transformations.
for custom results transformations.
- **Telemetry + Evaluators**: Draive logs everything you care about: timing, output shape, tool
usage, error cases. Evaluators let you benchmark or regression-test LLM behavior like a normal
part of your CI.
usage, error cases. Evaluators let you benchmark or regression-test LLM behavior like a normal
part of your CI.
- **Model-Agnostic by Design**: Built-in support for most major providers.

## 🖥️ Install
29 changes: 15 additions & 14 deletions docs/getting-started/index.md
@@ -6,12 +6,12 @@ guide gives you the context you need before diving into installation and hands-o
## Why Draive

- **Composable foundations**: Build on immutable `State` objects and scoped context managers, so
every component remains testable and predictable.
every component remains testable and predictable.
- **Provider flexibility**: Switch between OpenAI, Anthropic, Gemini, Mistral, local models, and
more without rewriting your pipelines.
more without rewriting your pipelines.
- **Multimodal-first**: Work with text, audio, images, and artifacts through a unified content API.
- **Production guardrails**: Apply moderation, privacy, validation, and telemetry with the same
abstractions you use for generation.
abstractions you use for generation.

## What You Will Build

@@ -26,33 +26,34 @@ Throughout the getting-started journey you will assemble:
## Prerequisites

- Python 3.13+ with `uv` managing the virtual environment at `./.venv` (this repository includes a
ready-to-use setup).
ready-to-use setup).
- Familiarity with `async`/`await` and running `pytest` from the command line.
- Access tokens for the model providers you plan to use (store them in environment variables—never
hard-code secrets).
hard-code secrets).

## Core Concepts

1. **State management** – mutable-looking, immutable-under-the-hood `State` classes represent
configuration and runtime data. You update them with methods like `State.updated(...)` to keep
histories, snapshots, and metrics consistent.
configuration and runtime data. You update them with methods like `State.updated(...)` to keep
histories, snapshots, and metrics consistent.
1. **Context scoping** – `ctx.scope(...)` activates a stack of `State` instances and disposables for
a logical unit of work, ensuring structured concurrency and clean teardown.
a logical unit of work, ensuring structured concurrency and clean teardown.
1. **Generation flows** – typed facades in `draive.generation` orchestrate text, image, and audio
calls, while provider adapters translate the request to each backend.
calls, while provider adapters translate the request to each backend.
1. **Tools and multimodal content** – `MultimodalContent`, `ResourceContent`, and tool abstractions
let you stream artifacts, call Python functions, or chain agents without sacrificing type safety.
let you stream artifacts, call Python functions, or chain agents without sacrificing type
safety.
1. **Guardrails and observability** – moderation, privacy, metrics, and logging integrations keep
your application auditable. Use `ctx.log_*` for structured logs and `ctx.record` for metrics.
your application auditable. Use `ctx.log_*` for structured logs and `ctx.record` for metrics.

## Next Steps

1. Follow the [Installation](installation.md) guide to set up dependencies and the runtime
environment.
environment.
1. Walk through the quickstart notebooks and examples under `docs/cookbooks/` to see Draive in
action.
action.
1. Explore provider-specific instructions in `docs/guides/` when you are ready to connect to
production endpoints.
production endpoints.

You now have the core mental model for Draive. Continue with installation to bring the toolkit to
life.
6 changes: 3 additions & 3 deletions docs/getting-started/installation.md
@@ -43,13 +43,13 @@ uv sync --all-groups --all-extras --frozen
- `draive[openai]`, `draive[openai_realtime]` for OpenAI Responses/Realtime.
- `draive[anthropic]`, `draive[anthropic_bedrock]` for Claude models (direct or via Bedrock).
- `draive[mistral]`, `draive[gemini]`, `draive[cohere]`, `draive[cohere_bedrock]` for other hosted
LLMs.
LLMs.
- `draive[bedrock]`, `draive[aws]` for AWS model/runtime integrations.
- `draive[ollama]`, `draive[vllm]` for local or self-hosted deployments.
- `draive[qdrant]`, `draive[postgres]` for vector/storage backends; add `pgvector` separately where
needed.
needed.
- `draive[httpx]`, `draive[mcp]`, `draive[opentelemetry]`, `draive[docs]` for HTTP utilities, MCP,
tracing, and docs site builds.
tracing, and docs site builds.

## Verify your environment

200 changes: 97 additions & 103 deletions docs/getting-started/multimodal-data.md
@@ -21,9 +21,8 @@ parameters are typed as `Multimodal` for automated transformation.

!!! Note

```
Constructors and helpers such as `MultimodalContent.of(*elements: "Multimodal")` use the `Multimodal` alias to normalize any mix of multimodal parts into one consistent `MultimodalContent`.
```
Constructors and helpers such as `MultimodalContent.of(*elements: "Multimodal")` use the
`Multimodal` alias to normalize any mix of multimodal parts into one consistent `MultimodalContent`.

## Multimodal Content

@@ -40,110 +39,108 @@ can be used for filtering, splitting, replacing, and other operations.
**A few examples of MultimodalContent usage in Draive**:

- `@tool` decorator decorates functions that return `MultimodalContent` (see
[Basic tools use](../guides/BasicToolsUse.md))
[Basic tools use](../guides/BasicToolsUse.md))
- `ModelInput` and `ModelOutput` classes use `MultimodalContent`
- `TextGeneration.generate(...)` accepts `MultimodalContent` as input (see
[Basic usage](../guides/BasicUsage.md))
[Basic usage](../guides/BasicUsage.md))

`MultimodalContent` is an **easy-to-use and intelligent** class that does a lot for you under the
hood:

1. It avoids extra nesting:

```python
inner_multimodal = MultimodalContent.of("Hello world!")
print(inner_multimodal) # {'type': 'content', 'parts': [{'text': 'Hello world!', 'meta': {}}]}
outer_multimodal = MultimodalContent.of(inner_multimodal)
print(outer_multimodal) # {'type': 'content', 'parts': [{'text': 'Hello world!', 'meta': {}}]}
# (Same as the first one despite nesting)
```
```python
inner_multimodal = MultimodalContent.of("Hello world!")
print(inner_multimodal) # {'type': 'content', 'parts': [{'text': 'Hello world!', 'meta': {}}]}
outer_multimodal = MultimodalContent.of(inner_multimodal)
print(outer_multimodal) # {'type': 'content', 'parts': [{'text': 'Hello world!', 'meta': {}}]}
# (Same as the first one despite nesting)
```

1. It merges multiple parts if the types match:

```python
class User(DataModel):
first_name: str
last_name: str

content = MultimodalContent.of(
MultimodalTag.of(
MultimodalTag.of(
"Hello",
name="inner",
),
ArtifactContent.of(
User(
first_name="James",
last_name="Smith",
)
),
name="outer",
)
)

print(content)
# {
# 'type': 'content',
# 'parts': [
# {
# 'text': '<outer><inner>Hello</inner>',
# 'meta': {}
# },
# {
# 'category': 'User',
# 'artifact': {
# 'first_name': 'James',
# 'last_name': 'Smith'
# },
# 'hidden': False,
# 'meta': {}
# },
# {
# 'text': '</outer>',
# 'meta': {}
# }
# ]
# }
```

!!! Note

```
`MultimodalTag` produces text parts, so they merge with `TextContent`.
```
```python
class User(DataModel):
first_name: str
last_name: str

content = MultimodalContent.of(
MultimodalTag.of(
MultimodalTag.of(
"Hello",
name="inner",
),
ArtifactContent.of(
User(
first_name="James",
last_name="Smith",
)
),
name="outer",
)
)

print(content)
# {
# 'type': 'content',
# 'parts': [
# {
# 'text': '<outer><inner>Hello</inner>',
# 'meta': {}
# },
# {
# 'category': 'User',
# 'artifact': {
# 'first_name': 'James',
# 'last_name': 'Smith'
# },
# 'hidden': False,
# 'meta': {}
# },
# {
# 'text': '</outer>',
# 'meta': {}
# }
# ]
# }
```

!!! Note

`MultimodalTag` produces text parts, so they merge with `TextContent`.

1. Comes with a set of helper functions to speed up your work. Examples:

```python
print(multimodal.texts())
# (
# {'text': '<outer><inner>Hello</inner>', 'meta': {}},
# {'text': '</outer>', 'meta': {}}
# )
print(multimodal.tags())
# (
# {'name': 'outer', 'content': {...}, 'meta': {}},
# {'name': 'inner', 'content': {...}, 'meta': {}}
# )
print(multimodal.artifacts())
# (
# {
# 'category': 'User',
# 'artifact': {
# 'first_name': 'James',
# 'last_name': 'Smith'
# },
# 'hidden': False,
# 'meta': {}
# },
# )
```

!!! Tip

```
`MultimodalContent` has more ready-to-use methods for filtering such as `matching_meta()`, `split_by_meta()`, `without_resources()` or `audio()`. This is another argument to use `MultimodalContent` rather than other data models
```
```python
print(multimodal.texts())
# (
# {'text': '<outer><inner>Hello</inner>', 'meta': {}},
# {'text': '</outer>', 'meta': {}}
# )
print(multimodal.tags())
# (
# {'name': 'outer', 'content': {...}, 'meta': {}},
# {'name': 'inner', 'content': {...}, 'meta': {}}
# )
print(multimodal.artifacts())
# (
# {
# 'category': 'User',
# 'artifact': {
# 'first_name': 'James',
# 'last_name': 'Smith'
# },
# 'hidden': False,
# 'meta': {}
# },
# )
```

!!! Tip

`MultimodalContent` has more ready-to-use methods for filtering such as `matching_meta()`,
`split_by_meta()`, `without_resources()` or `audio()`. This is another argument to use
`MultimodalContent` rather than other data models

## Model Input

@@ -152,14 +149,12 @@ hood:
embed tool responses (`ModelToolResponse`), which return their payload as `MultimodalContent`,
letting you treat tool output the same way as regular text-and-media blocks.

![ModelInput class relationships](../diagrams/out/ModelInput.svg){style="height:400px; margin: auto;
display: block;"}
![ModelInput class relationships](../diagrams/out/ModelInput.svg){style="height:400px; margin: auto; display: block;"}

!!! note

```
Note that `ModelToolResponse` has a `content` attribute of type `MultimodalContent`. This is the reason why `@tool` decorated functions must return `MultimodalContent`.
```
Note that `ModelToolResponse` has a `content` attribute of type `MultimodalContent`. This is the
reason why `@tool` decorated functions must return `MultimodalContent`.

## Model Output

@@ -169,11 +164,10 @@ requests (`ModelToolRequest`) rely on the same container. This means the full ge
visible content to internal thinking and tool invocations - can be analysed with one coherent set of
helpers.

![ModelOutput class relationships](../diagrams/out/ModelOutput.svg){style="height:400px; margin:
auto; display: block;"}
![ModelOutput class relationships](../diagrams/out/ModelOutput.svg){style="height:400px; margin:auto; display: block;"}

!!! tip

```
There are ready-to-use methods like `without_tools()` to get model output without blocks related to tool requests and responses, or `reasoning()` to get model reasoning blocks. That can help you implement your features easily.
```
There are ready-to-use methods like `without_tools()` to get model output without blocks related to
tool requests and responses, or `reasoning()` to get model reasoning blocks. That can help you
implement your features easily.
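
The two normalisation behaviours this file describes (no extra nesting, and merging of adjacent parts of the same type) can be illustrated with a toy model. This is a sketch in plain Python, not Draive's implementation: `Text`, `Artifact`, `Content`, `content_of`, and `tag` are hypothetical stand-ins, chosen only to mirror the printed outputs shown in the doc.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Text:
    text: str


@dataclass(frozen=True)
class Artifact:
    category: str
    data: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Content:
    parts: tuple = ()


def content_of(*elements) -> Content:
    """Flatten nested containers and merge adjacent text parts."""
    parts: list = []
    for element in elements:
        if isinstance(element, Content):
            candidates = element.parts  # unwrap instead of nesting
        elif isinstance(element, str):
            candidates = (Text(element),)
        else:
            candidates = (element,)
        for part in candidates:
            if isinstance(part, Text) and parts and isinstance(parts[-1], Text):
                parts[-1] = Text(parts[-1].text + part.text)  # same type: merge
            else:
                parts.append(part)
    return Content(tuple(parts))


def tag(name: str, *elements) -> Content:
    """Wrap elements in XML-style tag text, which then merges like any other text."""
    return content_of(f"<{name}>", *elements, f"</{name}>")


# Wrapping a container in another container yields an equal container.
inner = content_of("Hello world!")
outer = content_of(inner)

# Tag text merges with neighbouring text; the artifact splits the text run.
tagged = tag(
    "outer",
    tag("inner", "Hello"),
    Artifact("User", {"first_name": "James", "last_name": "Smith"}),
)
print([type(part).__name__ for part in tagged.parts])
```

In this toy model the nested tag collapses into a single text part before the artifact and a closing text part after it, which is the same three-part shape as the example output in the doc.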
8 changes: 2 additions & 6 deletions docs/getting-started/printing-data.md
@@ -42,9 +42,7 @@ base64 payload directly or uses a redacted placeholder.

!!! Note

```
`kind` variable will be one of: `'image' | 'audio' | 'video' | ''`
```
`kind` variable will be one of: `'image' | 'audio' | 'video' | ''`

When `include_data=True`, `ResourceContent.to_str()` returns the full base64 payload:

@@ -131,9 +129,7 @@ return f"<{self.name}{_tag_attributes(self.meta)}>{self.content.to_str()}</{self

!!! Important

```
`MultimodalTag` is the only multimodal element that exposes metadata inline. Values stored in `meta` appear as XML-style tag attributes.
```
`MultimodalTag` is the only multimodal element that exposes metadata inline. Values stored in `meta` appear as XML-style tag attributes.

## MultimodalContent

4 changes: 2 additions & 2 deletions docs/guides/BasicConversation.md
@@ -80,7 +80,7 @@ only the assistant text, call `response.content.to_str()`.
## Next steps

- Swap `OpenAIResponsesConfig` for another provider module (for example `draive.mistral`) to try
different models.
different models.
- Add more tools to give the model controlled access to proprietary data or services.
- Wrap the code in an async function and trigger it from your application entrypoint or a CLI
script.
script.