API Reference
Complete class and method documentation for RubyLLM::Agents.
The base class for all agents. Supports two DSL styles: Simplified (recommended) and Traditional.
Set the LLM model.

```ruby
model "gpt-4o"
```

Set response randomness (0.0-2.0).

```ruby
temperature 0.7
```

Set request timeout.

```ruby
timeout 60
```

Document agent purpose (displayed in dashboard).

```ruby
description "Extracts search intent from user queries"
```

Declare previous class names for execution tracking continuity.

```ruby
aliases "OldAgentName", "AncientAgentName"
```

Returns all known names (current + aliases).

```ruby
SupportBot.all_agent_names
# => ["SupportBot", "CustomerSupportAgent", "HelpDeskAgent"]
```

Enable/disable streaming.

```ruby
streaming true
```

Enable extended thinking for supported models.

```ruby
thinking effort: :high, budget: 10000
```

Define the user prompt with `{placeholder}` syntax. Parameters are auto-registered as required.

```ruby
user "Search for {query} in {category}"
user { "Dynamic: #{some_method}" }
```

Hash of options applied to the user prompt (e.g. `cache_control`):

```ruby
user_config cache_control: { type: "ephemeral" }
```

Deprecated alias for `.user`. Still works but emits a deprecation warning. Prefer `.user` in new code.

```ruby
# Deprecated -- use `user` instead
prompt "Search for {query} in {category}"
```

Pre-fill the assistant turn. Useful for forcing JSON output or steering the response format.

```ruby
assistant "{"
```

When an assistant prefill is set, the LLM continues from that text rather than generating from scratch. This is particularly effective for ensuring JSON output:
```ruby
class JsonExtractor < ApplicationAgent
  model "claude-sonnet-4-20250514"
  system "Extract entities as JSON."
  user "{text}"
  assistant "{"

  returns do
    array :entities, of: :string
  end
end
```

Hash of options applied to the assistant prefill (e.g. `cache_control`):

```ruby
assistant_config cache_control: { type: "ephemeral" }
```

Define system instructions.

```ruby
system "You are a helpful assistant."
```

One-shot convenience method. Sends a user message, calls the agent, and returns the result -- all in one step. Ideal for quick, ad-hoc queries without defining a full agent class.
```ruby
# On any agent class
result = MyAgent.ask("Summarize this article: #{text}")

# With parameters
result = MyAgent.ask("Translate {text} to {language}", text: article, language: "French")

# Block form for dynamic prompts
result = MyAgent.ask { "Current time is #{Time.current}. What day is it?" }
```

`.ask` is equivalent to temporarily setting the user prompt and calling `.call`:

```ruby
# These are equivalent:
MyAgent.ask("Hello world")
MyAgent.call # when `user "Hello world"` is set on the class
```

Define structured output schema (alias for `schema`).
```ruby
returns do
  string :title, description: "Article title"
  array :tags, of: :string
  number :confidence
  boolean :needs_review
end
```

Group all error handling (alias for `reliability`).

```ruby
on_failure do
  retries times: 3, backoff: :exponential
  fallback to: ["gpt-4o-mini", "claude-3-haiku"]
  timeout 30
  circuit_breaker after: 5, cooldown: 5.minutes
end
```

Enable caching with keyword syntax.

```ruby
cache for: 1.hour
cache for: 30.minutes, key: [:query]
```

Simplified callbacks (block-only).

```ruby
before { |ctx| ctx.params[:timestamp] = Time.current }
after { |ctx, result| Analytics.track(result) }
```

Enable response caching.
```ruby
cache_for 1.hour
```

Define a parameter.

```ruby
param :query, required: true
param :limit, default: 10
param :count, type: :integer
```

Options:

- `required: true` -- Parameter must be provided
- `default: value` -- Default value if not provided
- `type:` -- Type validation (`:string`, `:integer`, `:float`, `:boolean`, `:array`, `:hash`)
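The checks these options imply can be pictured with a small plain-Ruby sketch. This is illustrative only -- `resolve_param` and `valid_type?` are hypothetical helpers, not the gem's internals:

```ruby
# Hypothetical sketch of param resolution: apply the default, enforce
# required:, then validate against the declared type symbol.
def valid_type?(value, type)
  case type
  when :string  then value.is_a?(String)
  when :integer then value.is_a?(Integer)
  when :float   then value.is_a?(Float)
  when :boolean then [true, false].include?(value)
  when :array   then value.is_a?(Array)
  when :hash    then value.is_a?(Hash)
  else true
  end
end

def resolve_param(name, value, required: false, default: nil, type: nil)
  value = default if value.nil?
  raise ArgumentError, "missing required param #{name}" if required && value.nil?
  if type && !value.nil? && !valid_type?(value, type)
    raise TypeError, "#{name} must be a #{type}"
  end
  value
end
```

For example, `resolve_param(:limit, nil, default: 10)` returns `10`, while a nil value for a `required: true` param raises.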
Register tools for the agent.

```ruby
tools [SearchTool, CalculatorTool]
```

Configure retry behavior.

```ruby
retries max: 3, backoff: :exponential, base: 0.5, max_delay: 30.0
```

Options:

- `max:` -- Maximum retry attempts
- `backoff:` -- `:exponential` or `:constant`
- `base:` -- Initial delay in seconds
- `max_delay:` -- Maximum delay cap
- `on:` -- Array of error classes to retry
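With `backoff: :exponential`, each retry typically waits about `base * 2^attempt` seconds, capped at `max_delay`. A sketch of that schedule (`retry_delay` is a hypothetical helper shown for illustration, not the gem's code):

```ruby
# Compute the wait before a given retry attempt (0-indexed), assuming the
# standard exponential-vs-constant backoff interpretation of the options.
def retry_delay(attempt, backoff: :exponential, base: 0.5, max_delay: 30.0)
  delay = backoff == :exponential ? base * (2**attempt) : base
  [delay, max_delay].min
end

retry_delay(0)  # => 0.5
retry_delay(3)  # => 4.0
retry_delay(10) # => 30.0 (capped at max_delay)
```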
Set fallback model chain.

```ruby
fallback_models "gpt-4o-mini", "claude-3-haiku"
```

Configure circuit breaker.

```ruby
circuit_breaker errors: 10, within: 60, cooldown: 300
```

Options:

- `errors:` -- Error count to trip breaker
- `within:` -- Time window in seconds
- `cooldown:` -- Cooldown period in seconds
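The semantics of `errors:`, `within:`, and `cooldown:` can be pictured with a minimal sliding-window breaker. This is a plain-Ruby sketch -- `SimpleBreaker` is hypothetical, not the gem's `CircuitBreaker` class:

```ruby
# Trips open once `errors` failures occur within `within` seconds; stays
# open until `cooldown` seconds have elapsed since it tripped.
class SimpleBreaker
  def initialize(errors:, within:, cooldown:)
    @errors, @within, @cooldown = errors, within, cooldown
    @failures = []
    @opened_at = nil
  end

  def record_failure(now = Time.now)
    @failures << now
    @failures.reject! { |t| now - t > @within }   # drop failures outside the window
    @opened_at = now if @failures.size >= @errors # trip the breaker
  end

  def open?(now = Time.now)
    return false unless @opened_at
    if now - @opened_at >= @cooldown
      @opened_at = nil                            # cooldown elapsed: close again
      @failures.clear
      false
    else
      true
    end
  end
end
```

With `errors: 3, within: 60, cooldown: 300`, three failures inside a minute open the breaker, and it closes again five minutes later.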
Set maximum time for all attempts.

```ruby
total_timeout 30
```

Block DSL for reliability configuration.
```ruby
reliability do
  retries max: 3, backoff: :exponential
  fallback_models "gpt-4o-mini"
  total_timeout 30
  circuit_breaker errors: 10, within: 60, cooldown: 300
end
```

Full callback API with method names or blocks.

```ruby
before_call :validate_input
after_call { |context, response| log(response) }
```

Execute the agent.

```ruby
result = MyAgent.call(query: "test")
result = MyAgent.call(query: "test", dry_run: true)
result = MyAgent.call(query: "test", skip_cache: true)
result = MyAgent.call(query: "test", with: "image.jpg")
```

Execute with streaming.

```ruby
MyAgent.stream(query: "test") { |chunk| print chunk.content }
```

Execute the agent (called by `.call`).
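When an agent runs, the params passed to `.call` are substituted into the `{placeholder}` slots of the user prompt. A plain-Ruby sketch of that substitution (assumed behavior; `render_prompt` is a hypothetical helper, not gem API):

```ruby
# Replace each {name} in the template with the matching param, raising if a
# placeholder has no corresponding param.
def render_prompt(template, params)
  template.gsub(/\{(\w+)\}/) do
    key = Regexp.last_match(1).to_sym
    params.fetch(key) { raise ArgumentError, "missing param #{key}" }.to_s
  end
end

render_prompt("Search for {query} in {category}", query: "ruby", category: "docs")
# => "Search for ruby in docs"
```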
Override to define system prompt. Prefer the class-level `system` DSL instead (see Simplified DSL above).

```ruby
# Preferred — class-level DSL
system "You are a helpful assistant."

# Traditional — instance method override
def system_prompt
  "You are a helpful assistant."
end
```

Override to define user prompt. Prefer the class-level `user` DSL instead (see Simplified DSL above).

```ruby
# Preferred — class-level DSL (v2.2+)
user "Process: {query}"

# Traditional — instance method override
def user_prompt
  "Process: #{query}"
end
```

Override to define assistant prefill. Prefer the class-level `assistant` DSL instead (see Simplified DSL above).

```ruby
# Preferred — class-level DSL (v2.2+)
assistant "{"

# Traditional — instance method override
def assistant_prompt
  "{"
end
```

Override to define structured output.
```ruby
def schema
  @schema ||= RubyLLM::Schema.create do
    string :result
  end
end
```

Override to post-process response.

```ruby
def process_response(response)
  result = super(response)
  result[:processed_at] = Time.current
  result
end
```

Override to add custom metadata.

```ruby
def metadata
  { user_id: user_id }
end
```

Override to customize cache key.

```ruby
def cache_key_data
  { query: query }
end
```

Returned by agent calls.
```ruby
# Content
result.content              # Parsed response
result.content[:key]        # Hash-style access (recommended)
result.content.dig(:a, :b)
result[:key]                # Deprecated, use content[:key]

# Tokens
result.input_tokens         # Input token count
result.output_tokens        # Output token count
result.total_tokens         # Total tokens
result.cached_tokens        # Cached tokens

# Costs
result.input_cost           # Input cost (USD)
result.output_cost          # Output cost (USD)
result.total_cost           # Total cost (USD)

# Model
result.model_id             # Requested model
result.chosen_model_id      # Actual model used
result.temperature          # Temperature setting

# Timing
result.duration_ms             # Execution duration
result.started_at              # Start timestamp
result.completed_at            # End timestamp
result.time_to_first_token_ms  # TTFT (streaming)

# Status
result.success?             # Did it succeed?
result.finish_reason        # "stop", "length", etc.
result.streaming?           # Was streaming used?
result.truncated?           # Was output truncated?

# Reliability
result.attempts_count       # Number of attempts
result.used_fallback?       # Was fallback used?

# Tools
result.tool_calls           # Array of tool calls
result.tool_calls_count
result.has_tool_calls?

# Thinking
result.thinking_text        # Reasoning content
result.thinking_tokens      # Tokens used for thinking
result.thinking_signature   # Multi-turn signature (Claude)
result.has_thinking?        # Whether thinking was used

# Errors
result.success?             # true if no error
result.error?               # true if errored
result.error_class          # Exception class name
result.error_message        # Exception message

result.to_h                 # All data as hash
```

ActiveRecord model for execution records.
```ruby
# Time-based
.today
.yesterday
.this_week
.this_month
.last_7_days
.last_30_days
.between(start, finish)

# Status
.successful
.failed
.status_error
.status_timeout
.status_running

# Agent/Model
.by_agent("AgentName")
.by_model("gpt-4o")

# Performance
.expensive(threshold)
.slow(milliseconds)
.high_token_usage(count)

# Streaming
.streaming
.non_streaming
```

```ruby
execution.agent_type             # String
execution.model_id               # String
execution.status                 # String: success, error, timeout
execution.input_tokens           # Integer
execution.output_tokens          # Integer
execution.cached_tokens          # Integer
execution.input_cost             # Decimal
execution.output_cost            # Decimal
execution.total_cost             # Decimal
execution.duration_ms            # Integer
execution.parameters             # Hash (JSONB)
execution.system_prompt          # Text
execution.user_prompt            # Text
execution.response               # Text
execution.error_message          # Text
execution.error_class            # String
execution.metadata               # Hash (JSONB)
execution.streaming              # Boolean
execution.time_to_first_token_ms # Integer (from metadata JSON)
execution.attempts               # Array (JSONB, from execution_details)
execution.chosen_model_id        # String
execution.finish_reason          # String
execution.created_at             # DateTime
```

```ruby
# Reports
Execution.daily_report
Execution.cost_by_agent(period: :today)
Execution.cost_by_model(period: :this_week)
Execution.stats_for("AgentName", period: :today)
Execution.trend_analysis(agent_type: "Agent", days: 7)

# Analytics
Execution.streaming_rate
Execution.avg_time_to_first_token
```

Budget management.
```ruby
# Check status
BudgetTracker.status
BudgetTracker.status(agent_type: "MyAgent")

# Check remaining
BudgetTracker.remaining_budget(:global, :daily)
BudgetTracker.remaining_budget(:per_agent, :daily, "MyAgent")

# Check exceeded
BudgetTracker.exceeded?(:global, :daily)
```

Circuit breaker management.

```ruby
# Check status
CircuitBreaker.status("gpt-4o")
# => { state: :open, errors: 10, closes_at: Time }

# Manual control
CircuitBreaker.open!("gpt-4o")
CircuitBreaker.close!("gpt-4o")
CircuitBreaker.reset_all!
```

Global configuration. As of v2.1.0, this is the single entry point for all settings, including LLM provider API keys.
```ruby
RubyLLM::Agents.configure do |config|
  # API Keys (v2.1+ — forwarded to RubyLLM automatically)
  config.openai_api_key = ENV["OPENAI_API_KEY"]
  config.anthropic_api_key = ENV["ANTHROPIC_API_KEY"]
  config.gemini_api_key = ENV["GOOGLE_API_KEY"]

  # Defaults
  config.default_model = "gpt-4o"
  config.default_temperature = 0.0
  config.default_timeout = 60
  config.default_streaming = false

  # Caching
  config.cache_store = Rails.cache

  # Logging
  config.async_logging = true
  config.retention_period = 30.days
  config.persist_prompts = true
  config.persist_responses = true

  # Anomaly Detection
  config.anomaly_cost_threshold = 5.00
  config.anomaly_duration_threshold = 10_000

  # Dashboard
  config.dashboard_auth = ->(c) { c.current_user&.admin? }
  config.dashboard_parent_controller = "ApplicationController"
  config.per_page = 25

  # Budgets
  config.budgets = {
    global_daily: 100.0,
    enforcement: :hard
  }

  # Alerts
  config.on_alert = ->(event, payload) {
    # Handle alerts (Slack, PagerDuty, etc.)
  }
end
```

See Configuration for the full list of options, including all 22 forwarded provider attributes.
Rename an agent in the database, updating execution records and tenant budget keys.

```ruby
# Rename permanently
RubyLLM::Agents.rename_agent("OldAgent", to: "NewAgent")
# => { executions_updated: 1432, tenants_updated: 3 }

# Dry run (no changes)
RubyLLM::Agents.rename_agent("OldAgent", to: "NewAgent", dry_run: true)
# => { executions_affected: 1432, tenants_affected: 3 }
```

Parameters:

- `old_name` (String) -- The previous agent class name
- `to:` (String) -- The new agent class name
- `dry_run:` (Boolean, default: false) -- If true, returns counts without modifying data
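The dry-run pattern here -- count the affected rows first, write only when `dry_run` is false -- can be sketched against an in-memory stand-in for the executions table (`rename_records` is a hypothetical helper, not the gem's implementation):

```ruby
# Select the matching records; mutate them only when dry_run is false,
# mirroring the two return shapes shown above.
def rename_records(records, old_name, to:, dry_run: false)
  affected = records.select { |r| r[:agent_type] == old_name }
  return { executions_affected: affected.size } if dry_run

  affected.each { |r| r[:agent_type] = to }
  { executions_updated: affected.size }
end

records = [{ agent_type: "OldAgent" }, { agent_type: "Other" }]
rename_records(records, "OldAgent", to: "NewAgent", dry_run: true)
# => { executions_affected: 1 } (records untouched)
```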
Module extended into all agent classes via `BaseAgent`. Provides class methods for querying execution history.

Returns an `ActiveRecord::Relation` scoped to this agent's executions.

```ruby
SearchAgent.executions
SearchAgent.executions.successful.today
```

Returns the most recent execution.

```ruby
SearchAgent.last_run
# => #<RubyLLM::Agents::Execution ...>
```

Returns recent failed executions within the time window.

```ruby
SearchAgent.failures
SearchAgent.failures(since: 7.days)
```

Returns total cost for this agent, optionally within a time window.

```ruby
SearchAgent.total_spent                  # => 12.50
SearchAgent.total_spent(since: 1.month)  # => 3.25
```

Returns a summary hash.

```ruby
SearchAgent.stats
# => { total:, successful:, failed:, success_rate:, avg_duration_ms:,
#      avg_cost:, total_cost:, total_tokens:, avg_tokens: }
```

Returns cost breakdown grouped by model.

```ruby
SearchAgent.cost_by_model
# => { "gpt-4o" => { count: 100, total_cost: 5.00, avg_cost: 0.05 } }
```

Filters executions by parameter values in execution details.

```ruby
SearchAgent.with_params(user_id: "u123", category: "billing")
```

Concern included in `Execution`. Adds replay capabilities.
Re-executes the agent with original parameters. Accepts model/temperature overrides and parameter overrides.

```ruby
execution.replay
execution.replay(model: "gpt-4o-mini")
execution.replay(query: "new search term")
```

Raises `RubyLLM::Agents::ReplayError` if the agent class is missing, the detail record is absent, or `agent_type` is blank.

Returns true if the execution has a valid agent class and detail record.

```ruby
execution.replayable? # => true
```

Returns true if this execution is a replay of another.

```ruby
execution.replay? # => false
```

Returns the original execution this was replayed from, or nil.

```ruby
execution.replay_source # => #<Execution ...> or nil
```

Returns all executions that are replays of this one.

```ruby
execution.replays # => ActiveRecord::Relation
```

```ruby
RubyLLM::Agents::BudgetExceededError # Budget limit exceeded
RubyLLM::Agents::CircuitOpenError    # Circuit breaker is open
RubyLLM::Agents::ReplayError         # Replay validation failed
```

- Agent DSL - DSL reference
- Configuration - Configuration guide
- Result Object - Result details
- Querying Executions - Agent-centric queries and replay