Best Practices
adham90 edited this page Feb 14, 2026
Guidelines for building production-ready LLM agents with RubyLLM::Agents.
Centralize shared configuration:
# app/agents/application_agent.rb
class ApplicationAgent < RubyLLM::Agents::Base
  # Shared defaults
  temperature 0.0

  reliability do
    retries max: 3, backoff: :exponential
    total_timeout 60
  end

  # Common metadata
  def metadata
    {
      request_id: Current.request_id,
      user_id: Current.user&.id
    }
  end
end

Catch type errors early:
class MyAgent < ApplicationAgent
  param :query, type: String, required: true
  param :limit, type: Integer, default: 10
  param :filters, type: Hash, default: {}
end

Ensure predictable responses:
def schema
  @schema ||= RubyLLM::Schema.create do
    string :result, description: "The processed result"
    array :items, of: :string
    boolean :success
  end
end

Don't rely on single requests:
class ProductionAgent < ApplicationAgent
  model "gpt-4o"

  reliability do
    retries max: 3, backoff: :exponential
    fallback_models "gpt-4o-mini", "claude-3-5-sonnet"
    circuit_breaker errors: 10, within: 60, cooldown: 300
    total_timeout 30
  end
end

Group related config together:
# Good - organized
reliability do
  retries max: 3, backoff: :exponential
  fallback_models "gpt-4o-mini"
  total_timeout 30
end

# Less clear - scattered
retries max: 3, backoff: :exponential
fallback_models "gpt-4o-mini"
total_timeout 30

Prevent runaway costs:
RubyLLM::Agents.configure do |config|
  config.budgets = {
    global_daily: 100.0,
    global_monthly: 2000.0,
    per_agent_daily: { "ExpensiveAgent" => 50.0 },
    enforcement: :hard
  }
end

Reduce API calls:
class ExpensiveAgent < ApplicationAgent
  cache_for 2.hours

  # Custom cache key to maximize hits
  def cache_key_data
    { query: query.downcase.strip }
  end
end

Clearer intent, no deprecation warning:
# Good
cache_for 1.hour

# Deprecated
cache 1.hour

Track costs, errors, and latency:
# Mount dashboard
mount RubyLLM::Agents::Engine => "/agents"

# Set up authentication
config.dashboard_auth = ->(c) { c.current_user&.admin? }

Enable filtering and debugging:
def metadata
  {
    user_id: user_id,
    feature: "search",
    source: source,
    experiment_variant: experiment_variant
  }
end

Get notified of issues:
config.on_alert = ->(event, payload) {
  case event
  when :budget_hard_cap
    PagerDuty.trigger(summary: "Budget exceeded")
  when :breaker_open
    Slack::Notifier.new(ENV['SLACK_WEBHOOK']).ping("Circuit breaker opened")
  end
}

Debug prompts without API calls:
result = MyAgent.call(query: "test", dry_run: true)
puts result.content[:user_prompt]
puts result.content[:system_prompt]

Scaffold quickly:
rails generate ruby_llm_agents:agent search query:required limit:10
rails generate ruby_llm_agents:embedder document --dimensions 512

Mock LLM responses:
RSpec.describe SearchAgent do
  let(:mock_response) do
    double(content: { results: [] }, input_tokens: 10, output_tokens: 5)
  end

  before do
    allow_any_instance_of(RubyLLM::Chat).to receive(:ask).and_return(mock_response)
  end

  it "returns results" do
    result = described_class.call(query: "test")
    expect(result.content[:results]).to eq([])
  end
end

Disable for sensitive applications:
config.persist_prompts = false
config.persist_responses = false

Implement custom content moderation:
class SafeAgent < ApplicationAgent
  before_call :check_content_safety

  private

  def check_content_safety(context)
    # Use your preferred moderation service
    result = ModerationService.check(context.params[:query])
    raise "Content blocked" if result.flagged?
  end
end

Better UX for chat interfaces:
class ChatAgent < ApplicationAgent
  streaming true
end

ChatAgent.call(message: msg) do |chunk|
  stream << chunk.content
end

Match model to task:
# Classification - fast, cheap, deterministic
class ClassifierAgent < ApplicationAgent
  model "gpt-4o-mini"
  temperature 0.0
end

# Creative writing - more capable
class WriterAgent < ApplicationAgent
  model "gpt-4o"
  temperature 0.8
end

# Simple extraction - fastest
class ExtractorAgent < ApplicationAgent
  model "gemini-2.0-flash"
  temperature 0.0
end

Set up proper tenant resolution:
config.multi_tenancy_enabled = true
config.tenant_resolver = -> { Current.tenant&.id }

Prevent tenant cost overruns:
RubyLLM::Agents::Tenant.create!(
  tenant_id: "acme_corp",
  name: "Acme Corp",
  daily_limit: 50.0,
  monthly_limit: 500.0,
  enforcement: "hard"
)

Update deprecated methods:
# Deprecated
cache 1.hour
result[:key]
result.dig(:a, :b)

# Preferred
cache_for 1.hour
result.content[:key]
result.content.dig(:a, :b)

Silence warnings during migration:
RubyLLM::Agents::Deprecations.silenced = true

- Testing Agents - Testing patterns
- Production Deployment - Deployment guide
- Error Handling - Error recovery
- Configuration - All settings
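
To make the retry and fallback settings from the `reliability` blocks earlier on this page more concrete, here is a minimal plain-Ruby sketch of what such a policy might do. It is not the gem's implementation: `call_with_fallbacks`, its keyword arguments, and the `sleeper` hook are hypothetical names introduced only for illustration, and the assumption that `retries max: 3` means three retries after the first attempt is ours.

```ruby
# Hypothetical helper, NOT part of RubyLLM::Agents.
# Tries each model in order. Each model gets one initial attempt plus up to
# `max_retries` retries, sleeping base_delay * 2**n between attempts
# (exponential backoff). When a model is exhausted, the next one is tried.
def call_with_fallbacks(models, max_retries: 3, base_delay: 0.5,
                        sleeper: ->(seconds) { sleep(seconds) })
  raise ArgumentError, "models must not be empty" if models.empty?

  last_error = nil

  models.each do |model|
    attempts = 0
    begin
      attempts += 1
      # The block performs the actual request for the given model.
      return yield(model)
    rescue StandardError => e
      last_error = e
      if attempts <= max_retries
        # Delays grow as 0.5s, 1s, 2s, ... for the default base_delay.
        sleeper.call(base_delay * (2**(attempts - 1)))
        retry
      end
      # Retries exhausted: fall through to the next model.
    end
  end

  # Every model failed; surface the most recent error.
  raise last_error
end
```

Passing a no-op `sleeper` (for example `->(_s) {}`) keeps tests fast, and the block receives the model name, so a caller could write `call_with_fallbacks(["gpt-4o", "gpt-4o-mini"]) { |m| some_client.ask(m) }` where `some_client` is whatever client the application uses. A circuit breaker and a total timeout, which the `reliability` block also configures, are deliberately left out of this sketch.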