Build production-ready AI agents in minutes using your own hardware
What you can build:
- Copy editors that analyze manuscripts and track writing patterns
- Git commit generators that write meaningful commit messages
- Market analyzers that research competitors and summarize findings
- Code reviewers, data analysts, research assistants, and more
Why local?
- No API costs - use your hardware, not OpenAI's
- Privacy - your data never leaves your machine
- Control - pick your model (Qwen, Llama, Mistral, etc.)
How fast? From zero to working agent in under 5 minutes. Familiar patterns (inspired by Claude SDK), batteries-included features (streaming, tools, hooks, auto-execution), and production-ready quality.
Open Agent SDK provides a clean, streaming API for working with OpenAI-compatible local model servers. Drop-in similar patterns to popular SDKs, working with LM Studio, Ollama, llama.cpp, and any OpenAI-compatible endpoint—complete with streaming, tool call aggregation, hooks, and automatic tool execution.
Works with:
- LM Studio - http://localhost:1234/v1
- Ollama - http://localhost:11434/v1
- llama.cpp server - OpenAI-compatible mode
- vLLM - OpenAI-compatible API
- Text Generation WebUI - OpenAI extension
- Any OpenAI-compatible local endpoint
- Local gateways proxying cloud models - e.g., Ollama or custom gateways that route to cloud providers

Not a fit for:
- Claude/OpenAI direct - use their official SDKs, unless you proxy through a local OpenAI-compatible gateway
- Cloud provider SDKs - Bedrock, Vertex, Azure, etc. (proxying via a local gateway is fine)
pip install open-agent-sdk

For development:

git clone https://github.com/slb350/open-agent-sdk.git
cd open-agent-sdk
pip install -e .

Quick start (single-turn streaming query):

import asyncio
from open_agent import query, AgentOptions
async def main():
    options = AgentOptions(
        system_prompt="You are a professional copy editor",
        model="qwen2.5-32b-instruct",
        base_url="http://localhost:1234/v1",
        max_turns=1,
        temperature=0.1
    )

    result = query(prompt="Analyze this text...", options=options)

    response_text = ""
    async for msg in result:
        if hasattr(msg, 'content'):
            for block in msg.content:
                if hasattr(block, 'text'):
                    response_text += block.text

    print(response_text)
asyncio.run(main())

Multi-turn conversations use the Client class, which also lets you feed tool results back to the model:

import asyncio

from open_agent import Client, AgentOptions, TextBlock, ToolUseBlock
from open_agent.config import get_base_url
def run_my_tool(name: str, params: dict) -> dict:
    # Replace with your tool execution logic
    return {"result": f"stub output for {name}"}

async def main():
    options = AgentOptions(
        system_prompt="You are a helpful assistant",
        model="kimi-k2:1t-cloud",  # Use your available Ollama model
        base_url=get_base_url(provider="ollama"),
        max_turns=10
    )

    async with Client(options) as client:
        await client.query("What's the capital of France?")

        async for msg in client.receive_messages():
            if isinstance(msg, TextBlock):
                print(f"Assistant: {msg.text}")
            elif isinstance(msg, ToolUseBlock):
                print(f"Tool used: {msg.name}")
                tool_result = run_my_tool(msg.name, msg.input)
                await client.add_tool_result(msg.id, tool_result)

asyncio.run(main())

See examples/tool_use_agent.py for progressively richer patterns (manual loop, helper function, and reusable agent class) demonstrating add_tool_result() in context.
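A hedged sketch of the reusable-agent-class flavor of that pattern (the class name and handler mapping below are illustrative, not part of the SDK; see the example file for the maintained version):

import asyncio
from open_agent import Client, AgentOptions, TextBlock, ToolUseBlock

class ToolRunnerAgent:
    """Wraps a Client and dispatches tool calls to plain Python handlers."""

    def __init__(self, options: AgentOptions, tool_handlers: dict):
        self.options = options
        self.tool_handlers = tool_handlers  # maps tool name -> callable(params) -> dict

    async def run(self, prompt: str) -> str:
        text = ""
        async with Client(self.options) as client:
            await client.query(prompt)
            async for msg in client.receive_messages():
                if isinstance(msg, TextBlock):
                    text += msg.text
                elif isinstance(msg, ToolUseBlock):
                    result = self.tool_handlers[msg.name](msg.input)
                    await client.add_tool_result(msg.id, result)
        return text

# Usage, reusing run_my_tool from the snippet above:
# agent = ToolRunnerAgent(options, {"my_tool": lambda params: run_my_tool("my_tool", params)})
# print(asyncio.run(agent.run("What's the capital of France?")))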
Define tools using the @tool decorator for clean, type-safe function calling:
from open_agent import tool, Client, AgentOptions, TextBlock, ToolUseBlock
# Define tools
@tool("get_weather", "Get current weather", {"location": str, "units": str})
async def get_weather(args):
    return {
        "temperature": 72,
        "conditions": "sunny",
        "units": args["units"]
    }

@tool("calculate", "Perform calculation", {"a": float, "b": float, "op": str})
async def calculate(args):
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b}
    result = ops[args["op"]](args["a"], args["b"])
    return {"result": result}

# Enable automatic tool execution (recommended)
options = AgentOptions(
    system_prompt="You are a helpful assistant with access to tools.",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[get_weather, calculate],
    auto_execute_tools=True,   # 🔥 Tools execute automatically
    max_tool_iterations=10     # Safety limit for tool loops
)

async with Client(options) as client:
    await client.query("What's 25 + 17?")

    # Simply iterate - tools execute automatically!
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            print(f"Tool called: {block.name}")
        elif isinstance(block, TextBlock):
            print(f"Response: {block.text}")

Advanced: Manual Tool Execution
For custom execution logic or result interception:
# Disable auto-execution
options = AgentOptions(
    system_prompt="You are a helpful assistant with access to tools.",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[get_weather, calculate],
    auto_execute_tools=False   # Manual mode
)

async with Client(options) as client:
    await client.query("What's 25 + 17?")

    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            # You execute the tool manually
            tool = {"calculate": calculate, "get_weather": get_weather}[block.name]
            result = await tool.execute(block.input)

            # Return result to agent
            await client.add_tool_result(block.id, result)

    # Continue conversation
    await client.query("")

Key Features:
- Automatic execution (v0.3.0+) - Tools run automatically with safety limits
- Type-safe schemas - Simple Python types (str, int, float, bool) or full JSON Schema (see the sketch below)
- OpenAI-compatible - Works with any OpenAI function calling endpoint
- Clean decorator API - Similar to Claude SDK's @tool
- Hook integration - PreToolUse/PostToolUse hooks work in both modes
See examples/calculator_tools.py and examples/simple_tool.py for complete examples.
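Where a tool needs optional parameters, the input_schema can be a full JSON Schema instead of the simple type map. A hedged sketch under that assumption (check examples/calculator_tools.py for the maintained form):

from open_agent import tool

# Hedged sketch: a full JSON Schema lets you mark "units" as optional.
@tool(
    "get_forecast",
    "Get a weather forecast for a location",
    {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],  # "units" may be omitted by the model
    },
)
async def get_forecast(args):
    units = args.get("units", "celsius")
    return {"location": args["location"], "high": 22 if units == "celsius" else 72, "units": units}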
Local models have fixed context windows (typically 8k-32k tokens). The SDK provides opt-in utilities for manual history management—no silent mutations, you stay in control.
from open_agent import Client, AgentOptions
from open_agent.context import estimate_tokens, truncate_messages
async with Client(options) as client:
    # Long conversation...
    for i in range(50):
        await client.query(f"Question {i}")
        async for msg in client.receive_messages():
            pass

    # Check token usage
    tokens = estimate_tokens(client.history)
    print(f"Context size: ~{tokens} tokens")

    # Manually truncate when needed
    if tokens > 28000:
        client.message_history = truncate_messages(client.history, keep=10)

1. Stateless Agents (Best for single-task agents):
# Process each task independently - no history accumulation
for task in tasks:
    async with Client(options) as client:
        await client.query(task)
        # Client disposed, fresh context for next task

2. Manual Truncation (At natural breakpoints):
from open_agent.context import truncate_messages
async with Client(options) as client:
    for task in tasks:
        await client.query(task)
        # Truncate after each major task
        client.message_history = truncate_messages(client.history, keep=5)

3. External Memory (RAG-lite for research agents):
# Store important facts in database, keep conversation context small
database = {}
async with Client(options) as client:
    await client.query("Research topic X")
    # Save response to database
    database["topic_x"] = extract_facts(response)

    # Clear history, query database when needed
    client.message_history = truncate_messages(client.history, keep=0)

The SDK intentionally does not auto-compact history because:
- Domain-specific needs: Copy editors need different strategies than research agents
- Token accuracy varies: Each model family has different tokenizers
- Risk of breaking context: Silently removing messages could break tool chains
- Natural limits exist: Compaction doesn't bypass model context windows
For better token estimation accuracy (optional):
pip install open-agent-sdk[context]  # Adds tiktoken

Without tiktoken, the SDK falls back to character-based approximation (~75-85% accurate).
See examples/context_management.py for complete patterns and usage.
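If you prefer to keep that check out of your agent loop, it can be folded into a tiny helper (illustrative, not part of the SDK):

from open_agent.context import estimate_tokens, truncate_messages

def trim_if_needed(client, budget_tokens: int = 28000, keep: int = 10) -> None:
    """Keep only the last `keep` messages once the estimated size crosses the budget."""
    if estimate_tokens(client.history) > budget_tokens:
        client.message_history = truncate_messages(client.history, keep=keep)

# Call between turns or after each major task:
#   trim_if_needed(client)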
Monitor and control agent behavior at key execution points with Pythonic lifecycle hooks—no subprocess overhead or JSON protocols.
from open_agent import (
    AgentOptions, Client,
    PreToolUseEvent, PostToolUseEvent, UserPromptSubmitEvent,
    HookDecision,
    HOOK_PRE_TOOL_USE, HOOK_POST_TOOL_USE, HOOK_USER_PROMPT_SUBMIT
)

# Security gate - block dangerous operations
async def security_gate(event: PreToolUseEvent) -> HookDecision | None:
    if event.tool_name == "delete_file":
        return HookDecision(
            continue_=False,
            reason="Delete operations require approval"
        )
    return None  # Allow by default

# Audit logger - track all tool executions
async def audit_logger(event: PostToolUseEvent) -> None:
    print(f"Tool executed: {event.tool_name} -> {event.tool_result}")
    return None

# Input sanitizer - validate user prompts
async def sanitize_input(event: UserPromptSubmitEvent) -> HookDecision | None:
    if "DELETE" in event.prompt.upper():
        return HookDecision(
            continue_=False,
            reason="Dangerous keywords detected"
        )
    return None
# Register hooks in AgentOptions
options = AgentOptions(
system_prompt="You are a helpful assistant",
model="qwen2.5-32b-instruct",
base_url="http://localhost:1234/v1",
tools=[my_file_tool, my_search_tool],
hooks={
HOOK_PRE_TOOL_USE: [security_gate],
HOOK_POST_TOOL_USE: [audit_logger],
HOOK_USER_PROMPT_SUBMIT: [sanitize_input],
}
)
async with Client(options) as client:
await client.query("Write to /etc/config") # UserPromptSubmit fires
async for block in client.receive_messages():
if isinstance(block, ToolUseBlock): # PreToolUse fires
result = await tool.execute(block.input)
await client.add_tool_result(block.id, result) # PostToolUse firesPreToolUse - Fires before tool execution (or yielding to user)
- Block operations: Return HookDecision(continue_=False, reason="...")
- Modify inputs: Return HookDecision(modified_input={...}, reason="...")
- Allow: Return None

PostToolUse - Fires after the tool result is added to history
- Observational only (tool already executed)
- Use for audit logging, metrics, result validation
- Return None (decision is ignored for PostToolUse)

UserPromptSubmit - Fires before the prompt is sent to the API
- Block prompts: Return HookDecision(continue_=False, reason="...")
- Modify prompts: Return HookDecision(modified_prompt="...", reason="...")
- Allow: Return None
Pattern 1: Redirect to Sandbox
async def redirect_to_sandbox(event: PreToolUseEvent) -> HookDecision | None:
    """Redirect file operations to a safe directory."""
    if event.tool_name == "file_writer":
        path = event.tool_input.get("path", "")
        if not path.startswith("/tmp/"):
            safe_path = f"/tmp/sandbox/{path.lstrip('/')}"
            return HookDecision(
                modified_input={"path": safe_path, "content": event.tool_input.get("content", "")},
                reason="Redirected to sandbox"
            )
    return None

Pattern 2: Compliance Audit Log
audit_log = []
from datetime import datetime

async def compliance_logger(event: PostToolUseEvent) -> None:
    """Log all tool executions for compliance."""
    audit_log.append({
        "timestamp": datetime.now(),
        "tool": event.tool_name,
        "input": event.tool_input,
        "result": str(event.tool_result)[:100],
        "user": get_current_user()  # your own user-lookup helper
    })
    return None

Pattern 3: Safety Instructions
async def add_safety_warning(event: UserPromptSubmitEvent) -> HookDecision | None:
"""Add safety instructions to risky prompts."""
if "write" in event.prompt.lower() or "delete" in event.prompt.lower():
safe_prompt = event.prompt + " (Please confirm this is safe before proceeding)"
return HookDecision(
modified_prompt=safe_prompt,
reason="Added safety warning"
)
return None- Hooks run sequentially in the order registered
- First non-None decision wins (short-circuit behavior; see the sketch below)
- Hooks run inline on event loop (spawn tasks for heavy work)
- Works with both Client and query() function
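For example, with two PreToolUse hooks registered, the second handler only runs when the first returns None (handler names below are illustrative):

from open_agent import AgentOptions, HookDecision, PreToolUseEvent, HOOK_PRE_TOOL_USE

async def block_deletes(event: PreToolUseEvent) -> HookDecision | None:
    # A non-None decision here short-circuits the remaining PreToolUse hooks
    if event.tool_name == "delete_file":
        return HookDecision(continue_=False, reason="Deletes are blocked")
    return None

async def log_tool_calls(event: PreToolUseEvent) -> HookDecision | None:
    # Only reached when block_deletes returned None
    print(f"About to run: {event.tool_name}({event.tool_input})")
    return None

options = AgentOptions(
    system_prompt="You are a helpful assistant",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    hooks={HOOK_PRE_TOOL_USE: [block_deletes, log_tool_calls]},
)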
Client.add_tool_result() is now async to support PostToolUse hooks:
# Old (v0.2.3 and earlier)
client.add_tool_result(tool_id, result)
# New (v0.2.4+)
await client.add_tool_result(tool_id, result)

Common hook use cases:
- Security gates: Block dangerous operations before they execute
- Audit logging: Track all tool executions for compliance
- Input validation: Sanitize user prompts before processing
- Monitoring: Observe agent behavior in production
- Control flow: Modify tool inputs or redirect operations
See examples/hooks_example.py for 4 comprehensive patterns (security, audit, sanitization, combined).
Cancel long-running operations cleanly without corrupting client state. Perfect for timeouts, user cancellations, or conditional interruptions.
from open_agent import Client, AgentOptions
import asyncio
async def main():
    options = AgentOptions(
        system_prompt="You are a helpful assistant.",
        model="qwen2.5-32b-instruct",
        base_url="http://localhost:1234/v1"
    )

    async with Client(options) as client:
        await client.query("Write a detailed 1000-word essay...")

        # Timeout after 5 seconds
        try:
            async def collect_messages():
                async for block in client.receive_messages():
                    print(block.text, end="", flush=True)

            await asyncio.wait_for(collect_messages(), timeout=5.0)
        except asyncio.TimeoutError:
            await client.interrupt()  # Clean cancellation
            print("\n⚠️ Operation timed out!")

        # Client is still usable after interrupt
        await client.query("Short question?")
        async for block in client.receive_messages():
            print(block.text)

1. Timeout-Based Interruption
try:
    await asyncio.wait_for(process_messages(client), timeout=10.0)
except asyncio.TimeoutError:
    await client.interrupt()
    print("Operation timed out")

2. Conditional Interruption
# Stop if response contains specific content
full_text = ""
async for block in client.receive_messages():
full_text += block.text
if "error" in full_text.lower():
await client.interrupt()
break3. User Cancellation (from separate task)
async def stream_task():
    await client.query("Long task...")
    async for block in client.receive_messages():
        print(block.text, end="")

async def cancel_button_task():
    await asyncio.sleep(2.0)   # User waits 2 seconds
    await client.interrupt()   # User clicks cancel

# Run both concurrently
await asyncio.gather(stream_task(), cancel_button_task())

4. Interrupt During Auto-Execution
options = AgentOptions(
    tools=[slow_tool, fast_tool],
    auto_execute_tools=True,
    max_tool_iterations=10
)

async with Client(options) as client:
    await client.query("Use tools...")

    tool_count = 0
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            tool_count += 1
            if tool_count >= 2:
                await client.interrupt()  # Stop after 2 tools
                break

When you call client.interrupt():
- Active stream closure - HTTP stream closed immediately (not just a flag)
- Clean state - Client remains in valid state for reuse
- Partial output - Text blocks flushed to history, incomplete tools skipped
- Idempotent - Safe to call multiple times
- Concurrent-safe - Can be called from separate asyncio tasks
See examples/interrupt_demo.py for 5 comprehensive patterns:
- Timeout-based interruption
- Conditional interruption
- Auto-execution interruption
- Concurrent interruption (simulated cancel button)
- Interrupt and retry (sketched below)
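The interrupt-and-retry pattern boils down to: interrupt the stalled request, then reuse the same client with a narrower prompt. A hedged sketch (the helper name is illustrative):

import asyncio

async def ask_with_retry(client, prompt: str, fallback_prompt: str, timeout: float = 10.0) -> str:
    text = ""

    async def collect() -> None:
        nonlocal text
        async for block in client.receive_messages():
            if hasattr(block, "text"):
                text += block.text

    await client.query(prompt)
    try:
        await asyncio.wait_for(collect(), timeout=timeout)
    except asyncio.TimeoutError:
        await client.interrupt()             # client stays usable after this
        text = ""
        await client.query(fallback_prompt)  # retry with a shorter request
        await collect()
    return text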
We've included two production-ready agents that demonstrate real-world usage:
examples/git_commit_agent.py
Analyzes your staged git changes and writes professional commit messages following conventional commit format.
# Stage your changes
git add .
# Run the agent
python examples/git_commit_agent.py
# Output:
# ✓ Found staged changes in 3 file(s)
# 🤖 Analyzing changes and generating commit message...
#
# 📝 Suggested commit message:
# feat(auth): Add OAuth2 integration with refresh tokens
#
# - Implement token refresh mechanism
# - Add secure cookie storage for tokens
# - Update login flow to support OAuth2 providers
# - Add tests for token expiration handling

Features:
- Analyzes diff to determine commit type (feat/fix/docs/etc)
- Writes clear, descriptive commit messages
- Interactive mode: accept, edit, or regenerate
- Follows conventional commit standards
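Under the hood, the core loop of an agent like this is small: read the staged diff, hand it to a single-turn query, print the suggestion. A hedged sketch (the shipped example adds the interactive accept/edit/regenerate flow):

import asyncio
import subprocess
from open_agent import query, AgentOptions

async def suggest_commit_message() -> None:
    # Collect the staged diff via plain git
    diff = subprocess.run(
        ["git", "diff", "--staged"], capture_output=True, text=True, check=True
    ).stdout
    if not diff.strip():
        print("No staged changes.")
        return

    options = AgentOptions(
        system_prompt="You write conventional commit messages (feat/fix/docs/...).",
        model="qwen2.5-32b-instruct",
        base_url="http://localhost:1234/v1",
        temperature=0.2,
    )
    async for msg in query(prompt=f"Write a commit message for this diff:\n{diff}", options=options):
        if hasattr(msg, "content"):
            for block in msg.content:
                if hasattr(block, "text"):
                    print(block.text, end="")

asyncio.run(suggest_commit_message())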
examples/log_analyzer_agent.py
Intelligently analyzes application logs to identify patterns, errors, and provide actionable insights.
# Analyze a log file
python examples/log_analyzer_agent.py /var/log/app.log
# Analyze with a specific time window
python examples/log_analyzer_agent.py app.log --since "2025-10-15T00:00:00" --until "2025-10-15T12:00:00"
# Interactive mode for drilling down
python examples/log_analyzer_agent.py app.log --interactive

Features:
- Automatic error pattern detection
- Time-based analysis (peak error times)
- Root cause suggestions
- Interactive mode for investigating specific issues
- Supports multiple log formats (JSON, Apache, syslog, etc)
- Time range filtering with --since/--until
Sample Output:
📊 Log Summary:
Total entries: 45,231
Errors: 127 (0.3%)
Warnings: 892
🔴 Top Error Patterns:
- Connection Error: 67 occurrences
- NullPointerException: 23 occurrences
- Timeout Error: 19 occurrences
⏰ Peak error time: 2025-10-15T14:00:00
Errors in that hour: 43
🤖 ANALYSIS REPORT:
Main Issues (Priority Order):
1. Database connection pool exhaustion during peak hours
2. Unhandled null values in user authentication flow
3. External API timeouts affecting payment processing
Recommendations:
1. Increase connection pool size from 10 to 25
2. Add null checks in AuthService.validateUser() method
3. Implement circuit breaker for payment API with 30s timeout
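The interactive drill-down mode is essentially a multi-turn Client session seeded with the log text. A hedged sketch of that core (the shipped example does its own parsing, filtering, and formatting):

import asyncio
from open_agent import Client, AgentOptions, TextBlock

async def interactive_log_session(log_text: str) -> None:
    options = AgentOptions(
        system_prompt="You analyze application logs: find error patterns and suggest fixes.",
        model="qwen2.5-32b-instruct",
        base_url="http://localhost:1234/v1",
        max_turns=20,
    )
    async with Client(options) as client:
        # Rough guard against overflowing the context window
        await client.query(f"Summarize the main issues in these logs:\n{log_text[:20000]}")
        while True:
            async for block in client.receive_messages():
                if isinstance(block, TextBlock):
                    print(block.text, end="")
            question = input("\n\nFollow-up (blank to quit): ").strip()
            if not question:
                break
            await client.query(question)

# asyncio.run(interactive_log_session(open("/var/log/app.log").read()))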
These agents demonstrate:
- Practical Value: Solve real problems developers face daily
- Tool Integration: Show how to integrate with system commands (git, file I/O)
- Multi-turn Conversations: Interactive modes for complex analysis
- Structured Output: Parse and format LLM responses for actionable results
- Privacy-First: Keep your code and logs local while getting AI assistance
Open Agent SDK uses config helpers to provide flexible configuration via environment variables, provider shortcuts, or explicit parameters:
export OPEN_AGENT_BASE_URL="http://localhost:1234/v1"
export OPEN_AGENT_MODEL="qwen/qwen3-30b-a3b-2507"

from open_agent import AgentOptions
from open_agent.config import get_model, get_base_url
# Config helpers read from environment
options = AgentOptions(
    system_prompt="...",
    model=get_model(),        # Reads OPEN_AGENT_MODEL
    base_url=get_base_url()   # Reads OPEN_AGENT_BASE_URL
)

from open_agent.config import get_base_url
# Use built-in defaults for common providers
options = AgentOptions(
    system_prompt="...",
    model="llama3.1:70b",
    base_url=get_base_url(provider="ollama")  # → http://localhost:11434/v1
)

Available providers: lmstudio, ollama, llamacpp, vllm
# Provide fallbacks when env vars not set
options = AgentOptions(
    system_prompt="...",
    model=get_model("qwen2.5-32b-instruct"),     # Fallback model
    base_url=get_base_url(provider="lmstudio")   # Fallback URL
)

Configuration Priority:
- Environment variable (default behaviour)
- Fallback value passed to the config helper
- Provider default (for base_url only)
Need to force a specific model even when OPEN_AGENT_MODEL is set? Call get_model("model-name", prefer_env=False) to ignore the environment variable for that lookup.
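For example (a minimal sketch):

from open_agent import AgentOptions
from open_agent.config import get_model, get_base_url

options = AgentOptions(
    system_prompt="...",
    model=get_model("qwen2.5-32b-instruct", prefer_env=False),  # ignore OPEN_AGENT_MODEL here
    base_url=get_base_url(provider="lmstudio"),
)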
Benefits:
- Switch between dev/prod by changing environment variables
- No hardcoded URLs or model names
- Per-agent overrides when needed
See docs/configuration.md for complete guide.
Without open-agent-sdk (raw OpenAI client):
from openai import AsyncOpenAI
client = AsyncOpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")
response = await client.chat.completions.create(
    model="qwen2.5-32b-instruct",
    messages=[{"role": "system", "content": system_prompt},
              {"role": "user", "content": user_prompt}],
    stream=True
)

async for chunk in response:
    # Complex parsing of chunks
    # Extract delta content
    # Handle tool calls manually
    # Track conversation state yourself
    ...

With open-agent-sdk:
from open_agent import query, AgentOptions
options = AgentOptions(
    system_prompt=system_prompt,
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1"
)

result = query(prompt=user_prompt, options=options)
async for msg in result:
    # Clean message types (TextBlock, ToolUseBlock)
    # Automatic streaming and tool call handling
    ...

Value: Familiar patterns + Less boilerplate + Easy migration
Open Agent SDK and LangChain serve different needs:
Open Agent SDK is a focused library (~900 LOC) specifically for streaming conversations with local OpenAI-compatible models. Clean API, minimal dependencies (openai + pydantic), and a codebase you can read in 10 minutes.
LangChain is a comprehensive framework (100k+ LOC) for building AI applications with 300+ integrations, RAG pipelines, document loaders, vector databases, and complex orchestration.
Use Open Agent SDK when:
- Running local models (LM Studio, Ollama, llama.cpp)
- You want claude-agent-sdk style ergonomics
- You need minimal dependencies and fast install
- Building focused agents (copy editor, log analyzer, commit writer)
- You prefer readable code over framework abstractions
Use LangChain when:
- You need RAG with vector databases (Pinecone, Chroma, etc.)
- You want pre-built integrations (Google Search, document loaders, etc.)
- Building complex multi-agent orchestration systems
- Your team already knows LangChain
Philosophy: Open Agent SDK is "do one thing well" (like Flask), LangChain is "batteries included" (like Django). Both are excellent tools for their respective use cases.
class AgentOptions:
    system_prompt: str                           # System prompt
    model: str                                   # Model name (required)
    base_url: str                                # OpenAI-compatible endpoint URL (required)
    tools: list[Tool] = []                       # Tool instances for function calling
    hooks: dict[str, list[HookHandler]] = None   # Lifecycle hooks for monitoring/control
    auto_execute_tools: bool = False             # Enable automatic tool execution (v0.3.0+)
    max_tool_iterations: int = 5                 # Max tool calls per query in auto mode
    max_turns: int = 1                           # Max conversation turns
    max_tokens: int | None = 4096                # Tokens to generate (None uses provider default)
    temperature: float = 0.7                     # Sampling temperature
    timeout: float = 60.0                        # Request timeout in seconds
    api_key: str = "not-needed"                  # Most local servers don't need this

Note: Use config helpers (get_model(), get_base_url()) for environment variable and provider support.
Simple single-turn query function.
async def query(prompt: str, options: AgentOptions) -> AsyncGenerator

Returns an async generator yielding messages.
Multi-turn conversation client with tool monitoring.
async with Client(options: AgentOptions) as client:
    await client.query(prompt: str)
    async for msg in client.receive_messages():
        # Process messages
        ...

Message types:
- TextBlock - Text content from the model
- ToolUseBlock - Tool calls from the model (has id, name, input fields)
- ToolResultBlock - Tool execution results to send back to the model
- ToolUseError - Tool call parsing error (malformed JSON, missing fields)
- AssistantMessage - Full message wrapper
@tool(name: str, description: str, input_schema: dict)
async def my_tool(args: dict) -> Any:
    """Tool handler function"""
    return result

# Tool class
class Tool:
    name: str
    description: str
    input_schema: dict[str, type] | dict[str, Any]
    handler: Callable[[dict], Awaitable[Any]]

    async def execute(arguments: dict) -> Any
    def to_openai_format() -> dict

Schema formats:
- Simple: {"param": str, "count": int} - all parameters are required
- JSON Schema: Full schema with type, properties, required, etc.
# Event types
@dataclass
class PreToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class PostToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_result: Any
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class UserPromptSubmitEvent:
    prompt: str
    history: list[dict[str, Any]]

# Hook decision
@dataclass
class HookDecision:
    continue_: bool = True
    modified_input: dict[str, Any] | None = None
    modified_prompt: str | None = None
    reason: str | None = None

# Hook handler signature
HookHandler = Callable[[HookEvent], Awaitable[HookDecision | None]]

# Hook constants
HOOK_PRE_TOOL_USE = "pre_tool_use"
HOOK_POST_TOOL_USE = "post_tool_use"
HOOK_USER_PROMPT_SUBMIT = "user_prompt_submit"

Hook behavior:
- Return None to allow by default
- Return HookDecision(continue_=False) to block
- Return HookDecision(modified_input={...}) to modify (PreToolUse)
- Return HookDecision(modified_prompt="...") to modify (UserPromptSubmit)
- Raise an exception to abort entirely
Local models (LM Studio, Ollama, llama.cpp):
- GPT-OSS-120B - Best in class for speed and quality
- Qwen 3 30B - Excellent instruction following, good for most tasks
- GPT-OSS-20B - Solid all-around performance
- Mistral 7B - Fast and efficient for simple agents
Cloud-proxied via local gateway (Ollama cloud provider, custom gateway):
- kimi-k2:1t-cloud - Tested and working via Ollama gateway
- deepseek-v3.1:671b-cloud - High-quality reasoning model
- qwen3-coder:480b-cloud - Code-focused models
- Your base_url still points to the localhost gateway (e.g., http://localhost:11434/v1)
- Gateway handles authentication and routing to the cloud provider
- Useful when you need larger models than your hardware can run locally
Architecture guidance:
- Prefer MoE (Mixture of Experts) models over dense when available - significantly faster
- Start with 7B-30B models for most agent tasks - they're fast and capable
- Test models with your specific use case - the LLM landscape changes rapidly
open-agent-sdk/
├── open_agent/
│ ├── __init__.py # query, Client, AgentOptions exports
│ ├── client.py # Streaming query(), Client, tool helper
│ ├── config.py # Env/provider helpers
│ ├── context.py # Token estimation and truncation utilities
│ ├── hooks.py # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│ ├── tools.py # Tool decorator and schema conversion
│ ├── types.py # Dataclasses for options and blocks
│ └── utils.py # OpenAI client + ToolCallAggregator
├── docs/
│ ├── configuration.md
│ ├── provider-compatibility.md
│ └── technical-design.md
├── examples/
│ ├── git_commit_agent.py # 🌟 Practical: Git commit message generator
│ ├── log_analyzer_agent.py # 🌟 Practical: Log file analyzer
│ ├── calculator_tools.py # Function calling with @tool decorator
│ ├── simple_tool.py # Minimal tool usage example
│ ├── tool_use_agent.py # Complete tool use patterns
│ ├── context_management.py # Manual history management patterns
│ ├── hooks_example.py # Lifecycle hooks patterns (security, audit, sanitization)
│ ├── interrupt_demo.py # Interrupt capability patterns (timeout, conditional, concurrent)
│ ├── simple_lmstudio.py # Basic usage with LM Studio
│ ├── ollama_chat.py # Multi-turn chat example
│ ├── config_examples.py # Configuration patterns
│ └── simple_with_env.py # Environment variable config
├── tests/
│ ├── integration/ # Integration-style tests using fakes
│ │ └── test_client_behaviour.py # Streaming, multi-turn, tool flow coverage
│ ├── test_agent_options.py
│ ├── test_auto_execution.py # Automatic tool execution
│ ├── test_client.py
│ ├── test_config.py
│ ├── test_context.py # Context utilities (token estimation, truncation)
│ ├── test_hooks.py # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│ ├── test_interrupt.py # Interrupt capability (timeout, concurrent, reuse)
│ ├── test_query.py
│ ├── test_tools.py # Tool decorator and schema conversion
│ └── test_utils.py
├── CHANGELOG.md
├── pyproject.toml
└── README.md
- git_commit_agent.py – Analyzes git diffs and writes professional commit messages
- log_analyzer_agent.py – Parses logs, finds patterns, suggests fixes with interactive mode
- tool_use_agent.py – Complete tool use patterns: manual, helper, and agent class
- simple_lmstudio.py – Minimal streaming query with hard-coded config (simplest quickstart)
- simple_with_env.py – Using environment variables with config helpers and fallbacks
- config_examples.py – Comprehensive reference: provider shortcuts, priority, and all config patterns
- ollama_chat.py – Multi-turn chat loop with Ollama, including tool-call logging
- context_management.py – Manual history management patterns (stateless, truncation, token monitoring, RAG-lite)
- hooks_example.py – Lifecycle hooks patterns (security gates, audit logging, input sanitization, combined)
Located in tests/integration/:
- test_client_behaviour.py – Fake AsyncOpenAI client covering streaming, multi-turn history, and tool-call flows without hitting real servers
Released v0.1.0 – Core functionality is complete and available on PyPI. Multi-turn conversations, tool monitoring, and streaming are fully implemented.
- Project planning and architecture
- Core query() and Client class
- Tool monitoring + Client.add_tool_result() helper
- Tool use example (examples/tool_use_agent.py)
- PyPI release - Published as open-agent-sdk
- Provider compatibility matrix expansion
- Additional agent examples
- ✅ Ollama - Fully validated with kimi-k2:1t-cloud (cloud-proxied model)
- ✅ LM Studio - Fully validated with qwen/qwen3-30b model
- ✅ llama.cpp - Fully validated with TinyLlama 1.1B model
See docs/provider-compatibility.md for detailed test results.
- docs/technical-design.md - Architecture details
- docs/configuration.md - Configuration guide
- docs/provider-compatibility.md - Provider test results
- examples/ - Usage examples
Integration-style tests run entirely against lightweight fakes, so they are safe to execute locally and in pre-commit:
python -m pytest tests/integration

Add -k or a specific path when you want to target a subset of the unit tests (tests/test_client.py, etc.). If you use a virtual environment, prefix commands with ./venv/bin/python -m.
Install hooks once per clone:
pip install pre-commit
pre-commit install

Running pre-commit run --all-files will execute formatting checks and the integration tests (python -m pytest tests/integration) before you push changes.
- Python 3.10+
- openai 1.0+ (for AsyncOpenAI client)
- pydantic 2.0+ (for types, optional)
- Some servers require a dummy api_key; set any non-empty string if needed (see the snippet below)
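If your server does insist on a key, a minimal sketch:

from open_agent import AgentOptions

options = AgentOptions(
    system_prompt="...",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:8000/v1",  # e.g., a vLLM server
    api_key="sk-local-placeholder",       # value is not checked by most local servers
)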
MIT License - see LICENSE for details.
- API design inspired by claude-agent-sdk
- Built for local/open-source LLM enthusiasts
Status: Alpha - API stabilizing, feedback welcome
Star ⭐ this repo if you're building AI agents with local models!