The DevOps Agent is a sophisticated AI assistant engineered to empower developers and DevOps engineers across the full software development lifecycle, from infrastructure management to operational excellence. This production-ready agent represents the culmination of comprehensive Phase 2 development, featuring advanced context management, intelligent planning workflows, and RAG-enhanced codebase understanding.
Built on the Google Agent Development Kit (ADK) foundation with Google Gemini LLMs providing advanced reasoning capabilities, the agent utilizes ChromaDB for semantic code search and incorporates cutting-edge context management with multi-factor relevance scoring, automatic content discovery, and intelligent summarization. The result is an agent that provides contextually-aware assistance while maintaining optimal performance and user experience.
- CI/CD Automation: Streamlines your software delivery process.
  - For Developers: Accelerate your development cycles. The agent can help generate pipeline configurations, troubleshoot failing builds, and automate deployment steps, getting your code to production faster.
  - For Platform Engineers: Standardize and manage CI/CD pipelines with ease. The agent can assist in creating robust, reusable pipeline templates, monitoring pipeline health, and ensuring consistent deployment practices across services.
- Infrastructure Management: Simplify your cloud and on-premise infrastructure operations.
  - For Developers: Quickly provision development and testing environments that mirror production. Ask the agent to generate Infrastructure-as-Code (IaC) scripts (e.g., Terraform, Ansible) for your application's needs.
  - For Platform Engineers: Automate complex infrastructure tasks. The agent can assist in generating IaC for various resources, managing configurations, and providing insights into resource utilization and cost optimization.
- Codebase Understanding (via RAG with ChromaDB): Unlock deep insights into your code repositories (see Codebase Indexing and Retrieval for details on RAG).
  - For Developers: Onboard to new projects faster by asking the agent about specific functionalities or module dependencies. Debug complex issues by quickly locating relevant code sections and understanding their purpose. Confidently refactor code with the agent's help in identifying usages and potential impacts.
  - For Platform Engineers: Gain clarity on legacy systems for modernization projects. Identify areas for performance optimization or security hardening by analyzing code patterns and configurations. Ensure compliance by asking the agent to find specific configurations or code related to regulatory requirements.
- Workflow Automation: Reclaim time by automating routine and complex DevOps tasks.
  - For Developers: Automate common tasks like generating boilerplate code, running linters/formatters, or creating pull request summaries.
  - For Platform Engineers: Automate incident response procedures (e.g., log collection, service restarts), compliance checks, or resource cleanup tasks.
- Interactive Planning: Tackle complex tasks with confidence through collaborative planning.
  - For Developers: Before the agent refactors a large module or implements a new feature, review and approve its proposed plan, ensuring alignment and catching potential issues early.
  - For Platform Engineers: For intricate infrastructure changes or multi-step deployment processes, vet the agent's plan to ensure safety, compliance, and operational best practices are followed. See the Interactive Planning Workflow section for details.
- Advanced Context Management: Features intelligent multi-factor relevance scoring, automatic content discovery, cross-turn correlation, and intelligent summarization. The system achieves a 244x improvement in token utilization while maintaining context quality through smart prioritization algorithms.
- Interactive Planning: Collaborative workflow for complex tasks with plan generation, user review, and iterative refinement before implementation. Improves task accuracy and reduces rework through upfront alignment.
- RAG-Enhanced Codebase Understanding: Deep semantic search and retrieval using ChromaDB vector storage with Google embeddings. Enables automatic project context gathering from README files, package configurations, Git history, and documentation.
- Comprehensive Tool Integration: Versatile suite including file operations, code search, vetted shell execution, codebase indexing/retrieval, and intelligent tool discovery with safety-first approach and user approval workflows.
- Proactive Context Addition: Automatically discovers and includes project files, Git history, documentation, and configuration files with zero manual intervention. Enhanced support for modern Python packaging with uv detection.
- Token Optimization & Transparency: Dynamic token limit determination, usage transparency with detailed breakdowns, accurate counting methods, and context optimization strategies to maximize relevance within limits.
- Production-Ready Architecture: Built on Google ADK with robust error handling, comprehensive logging, full type annotations, and enterprise-grade deployment capabilities via Google Cloud Run.
- Enhanced Interactive CLI: Advanced command-line interface with multi-line input support, mouse interaction, auto-completion for DevOps workflows, command history with auto-suggestions, and intelligent keyboard shortcuts. Rich visual feedback with styled prompts, continuation indicators, and contextual help designed specifically for agentic workflow interactions.
- Enhanced User Experience: Detailed execution feedback, granular error reporting, and intelligent status indicators providing clear insight into agent operations and decision-making processes.
To get started with the DevOps Agent, ensure you have Python 3.13 (or a compatible version) and uvx (the tool runner that ships with uv, Astral's Python package manager) installed on your system. uvx resolves dependencies and runs the agent in an isolated environment, so you do not need to install the Google ADK globally.
- Run the Agent Locally:
Important: Set the GOOGLE_API_KEY environment variable to your Google API key:
export GOOGLE_API_KEY=your_api_key_here
This is required for the agent to create a GenAI client when running with the ADK. The key is loaded via the configuration system in config.py.
Run the simple CLI:
uvx --refresh --from git+https://github.com/BlueCentre/adk-agents.git@main agent run agents.devops
Run the full-featured TUI:
uvx --refresh --from git+https://github.com/BlueCentre/adk-agents.git@main agent run agents.devops --tui
Run the agent with a web interface:
uvx --refresh --from git+https://github.com/BlueCentre/adk-agents.git@main agent web
Run the agent with an API interface:
uvx --refresh --from git+https://github.com/BlueCentre/adk-agents.git@main agent api_server
Each command sets up a virtual environment with the required packages and then starts the agent in the selected mode: an interactive CLI session, the TUI, the web interface, or the API server.
- Deploy to Google Cloud Run: (WORK-IN-PROGRESS)
The agent can be deployed as a service to Google Cloud Run.
adk deploy cloud_run --project=[YOUR_GCP_PROJECT] --region=[YOUR_GCP_REGION] agents/devops/
Replace [YOUR_GCP_PROJECT] and [YOUR_GCP_REGION] with your Google Cloud project ID and desired region. This command packages the agent and deploys it, making it accessible via an HTTP endpoint.
The DevOps Agent supports Gemini's advanced thinking capabilities for enhanced reasoning and complex problem-solving. This feature leverages Gemini 2.5 series models' internal reasoning process to provide better results for complex DevOps tasks.
Supported Models:
- gemini-2.5-flash-preview-05-20 (Gemini 2.5 Flash with thinking)
- gemini-2.5-pro-preview-06-05 (Gemini 2.5 Pro with thinking)
Configuration:
Create or update your .env file in the project root with these settings:
# Enable Gemini thinking (default: false)
GEMINI_THINKING_ENABLE=true
# Include thought summaries in responses (default: true)
GEMINI_THINKING_INCLUDE_THOUGHTS=true
# Set thinking budget (tokens allocated for reasoning, default: 8192)
GEMINI_THINKING_BUDGET=8192
# Use a 2.5 series model that supports thinking (set exactly one)
AGENT_MODEL=gemini-2.5-pro-preview-06-05
# or
# AGENT_MODEL=gemini-2.5-flash-preview-05-20
# Your Google API key (required)
GOOGLE_API_KEY=your_api_key_here
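As a rough illustration of how these settings could be consumed, the sketch below reads the same variables and applies the defaults listed above. ThinkingSettings and load_thinking_settings are hypothetical names used only for this example; the agent's real loading logic lives in config.py and may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ThinkingSettings:
    """Hypothetical container for the thinking-related settings."""
    enabled: bool
    include_thoughts: bool
    budget: int


def _as_bool(value: Optional[str], default: bool) -> bool:
    """Interpret common truthy strings; use the default when the value is unset."""
    if value is None or value.strip() == "":
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_thinking_settings(env: Mapping[str, str] = os.environ) -> ThinkingSettings:
    """Read the documented variables, falling back to the documented defaults."""
    return ThinkingSettings(
        enabled=_as_bool(env.get("GEMINI_THINKING_ENABLE"), False),
        include_thoughts=_as_bool(env.get("GEMINI_THINKING_INCLUDE_THOUGHTS"), True),
        budget=int(env.get("GEMINI_THINKING_BUDGET", "8192")),
    )
```

Passing a plain dictionary instead of os.environ makes the parsing easy to check in isolation, for example load_thinking_settings({"GEMINI_THINKING_ENABLE": "true"}).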
What Thinking Enables:
- Enhanced Problem Solving: The model can "think through" complex DevOps scenarios step-by-step before responding
- Better Planning: Improved analysis and planning for multi-step operations
- Debugging Assistance: More thorough reasoning when troubleshooting issues
- Code Analysis: Deeper understanding when analyzing complex codebases
Usage Transparency:
When thinking is enabled, you'll see:
- 🧠 Enhanced usage display showing thinking tokens separately from output tokens
- Thought summaries (when GEMINI_THINKING_INCLUDE_THOUGHTS=true) providing insight into the model's reasoning process
- Detailed token breakdown including thinking costs in logs
Display Enhancement (June 8 2025): The agent automatically filters thought summaries from the main response to prevent duplication. The thought process is displayed once in the dedicated "🧠 Agent Thought" panel, and the main response contains only the final output without redundant content.
Example with thinking enabled:
- Create/update your .env file:
GEMINI_THINKING_ENABLE=true
AGENT_MODEL=gemini-2.5-pro-preview-06-05
GOOGLE_API_KEY=your_api_key_here
- Run the agent:
PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python uvx --with extensions --with google-generativeai --with google-api-core --with chromadb --with protobuf --with openai --with tiktoken --no-cache --python 3.13 --from git+https://github.com/BlueCentre/adk-python.git@main adk run agents/devops
Or use the convenience script:
./scripts/execution/run.sh
Performance Considerations:
- Thinking Budget: Higher values (16384+) enable more complex reasoning but increase costs
- Token Usage: Thinking tokens are charged in addition to input/output tokens
- Response Time: Complex reasoning may take longer but produces higher quality results
- Model Selection: Gemini 2.5 Pro generally provides deeper reasoning than Flash for complex tasks
Best Use Cases for Thinking:
- Complex infrastructure planning and design
- Multi-step deployment troubleshooting
- Advanced code refactoring and optimization
- System architecture analysis and recommendations
- Security analysis and compliance checks
The DevOps Agent features an advanced command-line interface optimized for agentic workflows and complex multi-turn conversations:
TUI (Terminal User Interface) Mode:
🚀 Key Features:
- Multi-line Input Support: Use Alt+Enter to submit complex, multi-line requests, which suits detailed task descriptions
- Mouse Support: Click to position cursor, drag to select text, scroll through completion menus
- Smart Auto-completion: Tab completion for 50+ common DevOps commands organized by category
- Command History: Intelligent history with auto-suggestions based on previous interactions
- Visual Enhancements: Styled prompts, continuation indicators (" >"), and contextual help
⌨️ Keyboard Shortcuts:
- Alt+Enter - Submit multi-line input
- Ctrl+D - Exit gracefully
- Ctrl+L - Clear screen
- Ctrl+C - Cancel current input
- Tab - Show command completions
- ↑/↓ - Navigate command history
🛠️ DevOps-Optimized Completions:
The CLI includes intelligent completions for common workflows:
# Code analysis and improvement
analyze this code → "analyze this code"
review the → "review the codebase"
add error → "add error handling to"
# Infrastructure and DevOps
create a → "create a dockerfile"
setup mon → "setup monitoring for"
write terra → "write terraform code for"
# Deployment and operations
deploy to → "deploy to production", "deploy to staging"
check ser → "check service status"
troublesh → "troubleshoot deployment"
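The behavior shown above can be approximated with plain prefix matching. This is a minimal sketch, not the CLI's actual completer: COMPLETIONS contains only the phrases listed above, and complete is a hypothetical helper written for this example.

```python
from typing import List, Sequence

# Only the phrases shown in the examples above; the real CLI ships many more.
COMPLETIONS = [
    "analyze this code",
    "review the codebase",
    "add error handling to",
    "create a dockerfile",
    "setup monitoring for",
    "write terraform code for",
    "deploy to production",
    "deploy to staging",
    "check service status",
    "troubleshoot deployment",
]


def complete(typed: str, phrases: Sequence[str] = COMPLETIONS) -> List[str]:
    """Return every phrase that starts with the typed text (case-insensitive)."""
    prefix = typed.lstrip().lower()
    if not prefix:
        return []  # offer nothing until the user has typed something
    return [p for p in phrases if p.lower().startswith(prefix)]
```

Typing "deploy to" would offer both deployment phrases, while "write terra" narrows to a single candidate.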
💡 Usage Examples:
- Multi-line Complex Request:
    Create a Kubernetes deployment that:
    - Uses a multi-container pod setup
    - Includes health checks and resource limits
    - Has proper security contexts
    - Implements horizontal pod autoscaling
    [Alt+Enter to submit]
- Quick Commands with Completion:
    setup monitoring for[Tab] → Shows completion options
- Interactive Help:
    help → Shows available commands and shortcuts
    clear → Clears the screen
🎯 Optimized for Agentic Workflows:
The enhanced CLI is specifically designed for complex, multi-turn conversations with AI agents, supporting:
- Long-form task descriptions spanning multiple lines
- Easy editing and refinement of complex requests
- Quick access to common DevOps workflows
- Seamless history navigation for iterative development
- Visual feedback that doesn't interfere with agent output
The DevOps Agent leverages a powerful stack of technologies to deliver its capabilities:
- Google Agent Development Kit (ADK): The foundational framework that provides core agent capabilities, including LLM integration, tool management, and execution lifecycle.
- Google Gemini Large Language Models: The advanced AI models (specifically Gemini Pro and Gemini Flash) that power the agent's understanding, reasoning, planning, and code generation abilities.
- ChromaDB: A vector database used to store embeddings of codebases, enabling powerful semantic search and retrieval (RAG) for codebase understanding features.
- Python: The primary programming language used to develop the agent and its tools.
adk-agents/ # Repository root
├── agents/devops/ # DevOps Agent implementation
│ ├── devops_agent.py # Main agent implementation (ADK LlmAgent)
│ ├── agent.py # Agent entry point and configuration
│ ├── prompts.py # Core agent instructions and persona
│ ├── config.py # Configuration management and environment setup
│ ├── components/ # Core agent components
│ │ ├── planning_manager.py # Interactive planning workflow management
│ │ └── context_management/ # Advanced context management system
│ │ ├── context_manager.py # Main context orchestration
│ │ ├── smart_prioritization.py # Multi-factor relevance scoring
│ │ ├── cross_turn_correlation.py # Turn relationship detection
│ │ ├── intelligent_summarization.py # Content-aware compression
│ │ └── dynamic_context_expansion.py # Automatic content discovery
│ ├── tools/ # Comprehensive tool suite
│ │ ├── __init__.py # Tool registration and exports
│ │ ├── rag_tools.py # RAG indexing and retrieval tools
│ │ ├── rag_components/ # ChromaDB and embedding components
│ │ │ ├── chunking.py # AST-based code chunking
│ │ │ ├── indexing.py # Vector embedding and storage
│ │ │ └── retriever.py # Semantic similarity search
│ │ ├── filesystem.py # File system operations
│ │ ├── shell_command.py # Vetted command execution
│ │ ├── code_analysis.py # Static code analysis capabilities
│ │ ├── code_search.py # Code pattern search utilities
│ │ ├── project_context.py # Project-level context gathering
│ │ └── [additional tools] # Memory, analysis, and utility tools
│ ├── shared_libraries/ # Shared utilities and common functions
│ ├── docs/ # 📚 Consolidated documentation
│ │ ├── README.md # Navigation hub and quick reference
│ │ ├── CONSOLIDATED_STATUS.md # Complete Phase 2 status and validation
│ │ ├── IMPLEMENTATION_STATUS.md # Technical implementation details
│ │ ├── CONTEXT_MANAGEMENT_STRATEGY.md # Context management architecture
│ │ ├── features/ # Feature-specific documentation
│ │ │ ├── FEATURE_AGENT_INTERACTIVE_PLANNING.md
│ │ │ ├── FEATURE_RAG.md
│ │ │ └── FEATURE_AGENT_LOOP_OPTIMIZATION.md
│ │ └── archive/ # Archived documentation
│ └── .indexignore # RAG indexing exclusion rules
├── scripts/ # 🔧 Organized utility scripts
│ ├── README.md # Scripts documentation and usage guide
│ ├── execution/ # Agent execution and deployment scripts
│ │ ├── run.sh # Local agent execution
│ │ ├── run_adk.sh # ADK-specific execution
│ │ ├── eval.sh # Evaluation and testing
│ │ ├── eval_adk.sh # ADK-specific evaluation
│ │ ├── prompt.sh # Interactive prompt testing
│ │ ├── prompt_adk.sh # ADK-specific prompt testing
│ │ ├── web_adk.sh # Web interface for ADK agent
│ │ ├── push.sh # Deployment and push automation
│ │ ├── mcp.sh # Model Context Protocol integration
│ │ ├── fix_rate_limits.sh # Rate limiting configuration
│ │ └── groom.sh # Repository grooming automation
│ ├── monitoring/ # Telemetry and performance monitoring
│ │ ├── telemetry_check.py # Health checks and validation
│ │ ├── telemetry_dashboard.py # Interactive telemetry dashboard
│ │ ├── metrics_overview.py # Comprehensive metrics analysis
│ │ ├── metrics_status.py # Real-time metrics monitoring
│ │ └── tracing_overview.py # Distributed tracing analysis
│ └── validation/ # Testing and validation scripts
│ └── validate_smart_prioritization_simple.py # Smart prioritization validation
├── example_prompts/ # 🧪 Organized test prompts
│ ├── README.md # Test prompt documentation and guidelines
│ ├── current/ # Active test prompts for ongoing features
│ │ ├── test_gemini_thinking_feature.md # Gemini thinking validation
│ │ ├── test_dynamic_discovery.md # Dynamic tool discovery testing
│ │ ├── test_context_diagnostics.md # Context management diagnostics
│ │ ├── test_planning_heuristics.md # Interactive planning validation
│ │ └── test_prompt_engineering.md # Prompt optimization testing
│ └── archive/ # Completed test prompts (Phase 2, etc.)
│ ├── test_phase2_remaining_features.md # Phase 2 feature validation (COMPLETED)
│ └── test_phase2_validation.md # Comprehensive Phase 2 testing (COMPLETED)
├── tests/ # Test suite (unit, integration, e2e)
├── eval/ # Evaluation datasets and results
├── src/ # Source package structure
└── [config files] # pyproject.toml, README.md, etc.
The DevOps Agent is architected as an LlmAgent within the Google ADK framework. Its core components are:
- devops_agent.py (MyDevopsAgent): The heart of the DevOps Agent, defining the MyDevopsAgent class, which inherits from the Google ADK's LlmAgent. This class orchestrates all agent capabilities through custom ADK callback handlers (handle_before_model, handle_after_model, etc.) that manage state, integrate planning workflows, and optimize context delivery. It coordinates the PlanningManager and the advanced ContextManager to enable sophisticated, context-aware operations with Gemini LLM integration.
- prompts.py: Contains the static core instructions and persona definition for the Gemini LLM, establishing the agent's foundational behavior, expertise areas, and interaction patterns with users and tools.
- Advanced Context Management (components/context_management/): A comprehensive system featuring:
  - Smart Prioritization: Multi-factor relevance scoring with content, recency, frequency, error priority, and coherence weighting
  - Cross-Turn Correlation: Relationship detection and pattern recognition across conversation turns
  - Intelligent Summarization: Content-aware compression with detection of 8 content types and keyword preservation
  - Dynamic Context Expansion: 4-phase discovery process for automatic content discovery and file classification
- Interactive Planning (components/planning_manager.py): Drives collaborative workflows through complexity assessment, multi-step plan generation, user review cycles, and plan refinement. Integrates with context management for plan-guided execution.
- RAG Components (tools/rag_components/): Production-ready retrieval system with AST-based code chunking, ChromaDB vector storage, Google embedding integration, and semantic similarity search for deep codebase understanding.
- Comprehensive Tool Suite (tools/): Feature-rich collection including file operations, vetted shell execution, code analysis, project context gathering, memory management, and intelligent tool discovery with safety-first design patterns.
- Google ADK Framework: Provides the robust foundation for agent execution, tool management, LLM interaction, state management, and enterprise deployment capabilities.
The DevOps Agent is fundamentally an application built on top of the Google ADK. The ADK provides the core capabilities that make the agent functional:
- Agent Abstraction (LlmAgent): This is a cornerstone of the ADK. It's a high-level class for creating LLM-powered agents, handling the complexities of LLM interaction, prompt construction, tool dispatch, and managing the state of the conversation. This abstraction is key to enabling a rich and robust interactive agent loop, allowing for sophisticated multi-turn dialogues and intelligent tool chaining.
- Tool Management: A system for defining, registering, and securely invoking tools that the agent can use.
- LLM Integration: Connectors and configurations for various LLMs, allowing developers to choose the model that best suits their needs.
- CLI and Deployment: Utilities for running agents locally (adk run) and deploying them to cloud environments like Google Cloud Run (adk deploy cloud_run).
- Session Management: (Optional) Capabilities to persist and resume agent conversations.
- Observability: (Optional) Integration with tracing and logging for monitoring agent behavior. This agent leverages this by logging detailed information about tool execution (including duration) and LLM token usage.
The MyDevopsAgent class, which inherits from the ADK's LlmAgent, makes extensive use of the ADK's callback mechanism to customize its behavior at specific points in the agent's execution lifecycle. This is a core aspect of its integration with the ADK framework.
How it Works:
- Callback Registration: In its __init__ method, MyDevopsAgent assigns its own custom methods (e.g., self.handle_before_model, self.handle_after_model, self.handle_before_tool, self.handle_after_tool) to the corresponding callback attributes provided by the LlmAgent base class (e.g., self.before_model_callback, self.after_model_callback). This is the standard and recommended way to register callbacks in ADK.
- Custom Logic in Callback Handlers: These custom handler methods contain the specialized logic for MyDevopsAgent, including:
  - State Management: Interacting with callback_context.state and tool_context.state to manage conversation history, tool invocation details, and other contextual information.
  - Planning Integration: The PlanningManager is invoked within these callbacks (primarily handle_before_model and handle_after_model) to interject planning steps. This manager can return specific ADK objects (like LlmResponse) to control the execution flow, such as skipping an LLM call if a plan is being presented or replacing an LLM response if the output is a plan.
  - Context Manipulation: Modifying the LlmRequest object in handle_before_model to inject assembled context before it's sent to the LLM.
  - UI Feedback: Interacting with UI components (console, status spinners) to provide real-time feedback to the user.
Alignment with ADK Recommendations:
This approach is well-aligned with the ADK framework's design for callbacks. The ADK allows any callable (standalone functions or instance methods) to be registered as a callback. For a complex and stateful agent like MyDevopsAgent, defining callbacks as methods within the agent's own class offers several advantages:
- Encapsulation: Keeps agent-specific logic contained within the agent class.
- State Access: Allows callbacks to easily access and modify the agent's internal state and components (like _planning_manager).
- Organization: Groups related pre-processing and post-processing logic with the agent definition.
Rather than diverging from the ADK's callback system, MyDevopsAgent builds on it by providing its own implementations of the callback hooks. This demonstrates a robust use of the ADK's extensibility points to build a specialized agent.
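The registration pattern described above can be sketched without the ADK itself. In this framework-free example, BaseAgent is a stand-in for LlmAgent (not the real class), the callback signatures are simplified, and SketchDevopsAgent is a hypothetical name; it only illustrates bound methods being assigned to callback attributes and a before-model callback short-circuiting the model call.

```python
from typing import Any, Callable, Dict, Optional


class BaseAgent:
    """Stand-in for ADK's LlmAgent: callback attributes plus a tiny run loop."""

    def __init__(self) -> None:
        self.before_model_callback: Optional[Callable[..., Optional[str]]] = None
        self.after_model_callback: Optional[Callable[..., Optional[str]]] = None

    def run_turn(self, state: Dict[str, Any], request: Dict[str, Any]) -> str:
        # A before-model callback may skip the model call by returning a response.
        if self.before_model_callback:
            early = self.before_model_callback(state, request)
            if early is not None:
                return early
        response = f"model output for: {request['prompt']}"  # placeholder for the LLM call
        # An after-model callback may replace the response or leave it untouched.
        if self.after_model_callback:
            replaced = self.after_model_callback(state, response)
            if replaced is not None:
                return replaced
        return response


class SketchDevopsAgent(BaseAgent):
    def __init__(self) -> None:
        super().__init__()
        # Register the agent's own bound methods as the callbacks.
        self.before_model_callback = self.handle_before_model
        self.after_model_callback = self.handle_after_model

    def handle_before_model(self, state: Dict[str, Any], request: Dict[str, Any]) -> Optional[str]:
        state.setdefault("history", []).append(request["prompt"])
        if request["prompt"].startswith("plan:"):
            return "PLAN: awaiting user approval"  # present a plan instead of calling the model
        request["context"] = state["history"][-3:]  # inject recent history as context
        return None

    def handle_after_model(self, state: Dict[str, Any], response: str) -> Optional[str]:
        state["last_response"] = response  # record the result, keep the response as is
        return None
```

Because the handlers are instance methods, they can reach the agent's own components directly, which is the encapsulation benefit noted above.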
In essence, the ADK provides the "operating system" for the agent, while devops_agent.py, prompts.py, config.py, and the custom tools define the specific "application" logic and capabilities of the DevOps Agent. This separation allows developers to focus on the unique aspects of their agent without needing to rebuild common agent infrastructure.
The DevOps Agent follows a sophisticated multi-layered architecture that integrates seamlessly with the Google ADK framework while providing advanced capabilities through custom components.
graph LR
subgraph GoogleADKFramework
ADK_Core[Core Engine]
ADK_Tools[Tool Management]
ADK_LLM[LLM Integration]
ADK_CLI[CLI Deployment]
end
subgraph DevOpsAgentApplication
DevOpsAgent[devops_agent.py]
PromptPy[prompts.py]
ConfigPy[config.py]
CustomTools[Custom Tools]
ContextMgmt[Context Management]
PlanningMgr[Planning Manager]
end
DevOpsAgent --> ADK_Core
DevOpsAgent --> ADK_Tools
DevOpsAgent --> ADK_LLM
PromptPy --> DevOpsAgent
ConfigPy --> DevOpsAgent
ContextMgmt --> DevOpsAgent
PlanningMgr --> DevOpsAgent
CustomTools --> ADK_Tools
ADK_CLI --> DevOpsAgent
The agent processes requests through a sophisticated callback-driven lifecycle that enables advanced planning, context management, and error handling:
graph TD
UserReq[User Request] --> ADK[ADK Framework]
ADK --> BeforeModel[handle_before_model]
subgraph "Before Model Processing"
BeforeModel --> StateInit[Initialize State]
StateInit --> PlanCheck{Planning Needed?}
PlanCheck -- Yes --> PlanGen[Generate Plan]
PlanCheck -- No --> CtxAssembly[Assemble Context]
PlanGen --> PlanReview[Present to User]
PlanReview --> PlanApproval{User Approval?}
PlanApproval -- No --> PlanRefine[Refine Plan]
PlanRefine --> PlanReview
PlanApproval -- Yes --> CtxAssembly
CtxAssembly --> CtxInject[Inject Context into LLM Request]
end
CtxInject --> LLMCall[LLM Processing]
LLMCall --> AfterModel[handle_after_model]
subgraph "After Model Processing"
AfterModel --> ExtractResp[Extract Response]
ExtractResp --> FuncCalls{Function Calls?}
FuncCalls -- Yes --> BeforeTool[handle_before_tool]
FuncCalls -- No --> UpdateState[Update Conversation State]
end
BeforeTool --> ToolExec[Tool Execution]
ToolExec --> AfterTool[handle_after_tool]
subgraph "Tool Processing"
AfterTool --> ErrorCheck{Tool Error?}
ErrorCheck -- Yes --> ErrorHandler[Enhanced Error Handling]
ErrorCheck -- No --> ToolSuccess[Process Success]
ErrorHandler --> RetryLogic{Retry Available?}
RetryLogic -- Yes --> RetryTool[Execute Retry Tool]
RetryLogic -- No --> UserGuidance[Provide User Guidance]
RetryTool --> ToolSuccess
ToolSuccess --> StateUpdate[Update Tool Results]
end
StateUpdate --> MoreTools{More Tools?}
MoreTools -- Yes --> BeforeTool
MoreTools -- No --> FinalResp[Final Response]
UpdateState --> FinalResp
UserGuidance --> FinalResp
FinalResp --> UserOutput[User Output]
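The plan review loop in the "Before Model Processing" stage can be sketched as a small function. This is an illustration under assumptions: plan_until_approved, generate_plan, and review_plan are hypothetical names, and the max_rounds bound is added here for safety (the diagram itself shows an unbounded loop).

```python
from typing import Callable, Optional, Tuple


def plan_until_approved(
    task: str,
    generate_plan: Callable[[str, Optional[str]], str],
    review_plan: Callable[[str], Tuple[bool, Optional[str]]],
    max_rounds: int = 3,
) -> Optional[str]:
    """Generate a plan, present it for review, and refine it until approved.

    generate_plan(task, feedback) returns a plan; feedback is None on the first round.
    review_plan(plan) returns (approved, feedback) from the user.
    Returns the approved plan, or None if no plan is approved within max_rounds.
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        plan = generate_plan(task, feedback)
        approved, feedback = review_plan(plan)
        if approved:
            return plan
    return None
```

Only an approved plan would move on to context assembly; a None result is a natural point to ask the user how to proceed.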
Our Phase 2 context management system features a sophisticated multi-component architecture that achieves a 244x improvement in token utilization while maintaining context quality:
graph TD
subgraph "Context Manager Core"
CM[Context Manager] --> SP[Smart Prioritization]
CM --> CTC[Cross-Turn Correlation]
CM --> IS[Intelligent Summarization]
CM --> DCE[Dynamic Context Expansion]
end
subgraph "Smart Prioritization Engine"
SP --> CF[Content Factor]
SP --> RF[Recency Factor]
SP --> FF[Frequency Factor]
SP --> EF[Error Priority Factor]
SP --> CHF[Coherence Factor]
CF --> RS[Relevance Score]
RF --> RS
FF --> RS
EF --> RS
CHF --> RS
end
subgraph "Content Discovery"
DCE --> PF[Project Files]
DCE --> GH[Git History]
DCE --> DOC[Documentation]
DCE --> CFG[Configuration]
PF --> FC[File Classification]
GH --> FC
DOC --> FC
CFG --> FC
end
subgraph "Intelligent Processing"
IS --> CT1[Code Detection]
IS --> CT2[Error Detection]
IS --> CT3[Config Detection]
IS --> CT4[Documentation Detection]
IS --> KP[Keyword Preservation]
CT1 --> CS[Compressed Summary]
CT2 --> CS
CT3 --> CS
CT4 --> CS
KP --> CS
end
subgraph "State Integration"
ConvHistory[Conversation History] --> CM
CodeSnippets[Code Snippets] --> CM
ToolResults[Tool Results] --> CM
ProjectContext[Project Context] --> CM
end
subgraph "LLM Integration"
CM --> OptCtx[Optimized Context]
OptCtx --> TokenLimit{Within Token Limit?}
TokenLimit -- Yes --> LLMReq[LLM Request]
TokenLimit -- No --> Compress[Further Compression]
Compress --> OptCtx
LLMReq --> LLM[Gemini LLM]
end
CTC --> TurnRel[Turn Relationships]
TurnRel --> CM
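One simple way to combine the five factors of the Smart Prioritization Engine into a single relevance score is a weighted sum. The sketch below assumes each factor is already normalized to the range 0 to 1; the weights are illustrative placeholders, not the values used in smart_prioritization.py, and FactorScores, relevance_score, and rank are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List

# Illustrative weights only; they sum to 1.0 so the score stays within [0, 1].
WEIGHTS = {
    "content": 0.35,
    "recency": 0.25,
    "frequency": 0.15,
    "error": 0.15,
    "coherence": 0.10,
}


@dataclass
class FactorScores:
    """Per-item factor values, each assumed to be normalized to [0, 1]."""
    content: float
    recency: float
    frequency: float
    error: float
    coherence: float


def relevance_score(f: FactorScores, weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the five factors."""
    return (
        weights["content"] * f.content
        + weights["recency"] * f.recency
        + weights["frequency"] * f.frequency
        + weights["error"] * f.error
        + weights["coherence"] * f.coherence
    )


def rank(items: Dict[str, FactorScores]) -> List[str]:
    """Return item names ordered from most to least relevant."""
    return sorted(items, key=lambda name: relevance_score(items[name]), reverse=True)
```

With these weights, a recent snippet tied to an error outranks an older, rarely referenced note, which matches the intent of giving error context priority.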
Our robust tool execution system includes comprehensive error handling, automatic retry capabilities, and safety-first design:
graph TD
ToolCall[Tool Call Request] --> SafetyCheck[Safety Check]
SafetyCheck --> Whitelisted{Whitelisted?}
Whitelisted -- Yes --> DirectExec[Direct Execution]
Whitelisted -- No --> ApprovalCheck{Approval Required?}
ApprovalCheck -- Yes --> UserApproval[Request User Approval]
ApprovalCheck -- No --> DirectExec
UserApproval --> Approved{User Approves?}
Approved -- No --> Denied[Execution Denied]
Approved -- Yes --> DirectExec
DirectExec --> ParseStrategy[Select Parsing Strategy]
subgraph "Multi-Strategy Execution"
ParseStrategy --> Shlex[1. shlex.split]
Shlex --> ShlexResult{Success?}
ShlexResult -- No --> Shell[2. shell=True]
ShlexResult -- Yes --> Success[Execution Success]
Shell --> ShellResult{Success?}
ShellResult -- No --> SimpleSplit[3. Simple Split]
ShellResult -- Yes --> Success
SimpleSplit --> SimpleResult{Success?}
SimpleResult -- Yes --> Success
SimpleResult -- No --> AllFailed[All Strategies Failed]
end
Success --> ResultProcess[Process Result]
AllFailed --> ErrorAnalysis[Error Pattern Analysis]
subgraph "Error Recovery"
ErrorAnalysis --> ErrorType{Error Type}
ErrorType -- Parsing --> QuoteError[Quote/Parsing Error]
ErrorType -- Command Not Found --> MissingCmd[Missing Command]
ErrorType -- Timeout --> TimeoutError[Timeout Error]
ErrorType -- Permission --> PermError[Permission Error]
QuoteError --> RetryTool[execute_vetted_shell_command_with_retry]
MissingCmd --> InstallGuide[Installation Guidance]
TimeoutError --> TimeoutSuggestion[Timeout/Splitting Suggestions]
PermError --> PermissionGuide[Permission Fix Guidance]
RetryTool --> AltStrategies[Try Alternative Formats]
AltStrategies --> AltResult{Alternative Success?}
AltResult -- Yes --> Success
AltResult -- No --> ManualSuggestions[Manual Intervention Suggestions]
end
ResultProcess --> UpdateContext[Update Context State]
InstallGuide --> UserGuidance[Enhanced User Guidance]
TimeoutSuggestion --> UserGuidance
PermissionGuide --> UserGuidance
ManualSuggestions --> UserGuidance
Denied --> UserGuidance
UpdateContext --> Complete[Tool Execution Complete]
UserGuidance --> Complete
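The "Multi-Strategy Execution" stage of the diagram can be sketched with the standard library. This is a simplified illustration, not the code in shell_command.py: run_with_fallbacks is a hypothetical helper, "success" is taken to mean exit code 0, and the sketch assumes the whitelist and approval checks shown above have already passed (shell=True in particular should never see unvetted input).

```python
import shlex
import subprocess
from typing import Optional, Tuple

Result = Tuple[Optional[str], Optional[subprocess.CompletedProcess]]


def run_with_fallbacks(command: str, timeout: float = 30.0) -> Result:
    """Try each parsing strategy in order; return (strategy, result) for the first success."""
    if not command.strip():
        return None, None

    def attempt(args, use_shell: bool):
        try:
            return subprocess.run(
                args, shell=use_shell, capture_output=True, text=True, timeout=timeout
            )
        except (OSError, subprocess.TimeoutExpired):
            return None  # command not found, not executable, or timed out

    strategies = [
        ("shlex_split", lambda: attempt(shlex.split(command), False)),
        ("shell", lambda: attempt(command, True)),
        ("simple_split", lambda: attempt(command.split(), False)),
    ]
    for name, run in strategies:
        try:
            result = run()
        except ValueError:
            continue  # e.g. shlex.split raises ValueError on unbalanced quotes
        if result is not None and result.returncode == 0:
            return name, result
    return None, None  # all strategies failed; the caller moves on to error analysis
```

A (None, None) result corresponds to the "All Strategies Failed" node, where error pattern analysis and user guidance take over.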
A key feature of this DevOps agent is its ability to understand and interact with codebases through Retrieval-Augmented Generation:
graph TD
U[User Input Query] --> DA{DevOps Agent}
DA -- Understand auth module --> RCT{retrieve_code_context_tool};
RCT -- Query --> VDB[(Vector Database - Indexed Code)];
VDB -- Relevant Code Chunks --> RCT;
RCT -- Code Snippets --> DA;
DA -- Combines snippets with LLM reasoning --> LR[LLM Response];
LR --> O[Agent provides explanation based on code];
subgraph "Initial Indexing (One-time or on update)"
CI[Codebase Files] --> IDT{index_directory_tool};
IDT --> VDB;
end
index_directory_tool
: This tool is used to scan a specified directory (e.g., a Git repository). It processes supported file types, breaks them into manageable chunks, generates vector embeddings for these chunks, and stores them in a vector database (ChromaDB). This creates a semantic index of the codebase.retrieve_code_context_tool
: When the agent needs to understand a part of the codebase to answer a question or perform a task, it uses this tool. It takes a natural language query, converts it to an embedding, and searches the vector database for the most similar (relevant) code chunks.
This RAG (Retrieval Augmented Generation) approach allows the agent to ground its responses and actions in the actual content of the codebase, leading to more accurate and context-aware assistance.
Note: To keep the codebase understanding accurate, re-index the indexed directory using `index_directory_tool` with `force_reindex=True` after any significant code modifications.
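The index-then-retrieve flow described above can be illustrated with a small, self-contained sketch. It is not the agent's implementation: the `ToyVectorIndex` class, the line-based chunking, and the word-count "embedding" are all hypothetical stand-ins for ChromaDB and a real embedding model, chosen only so the example runs without external dependencies.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_lines: int = 4) -> list[str]:
    """Split source text into chunks of at most max_lines lines."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + max_lines]) for i in range(0, len(lines), max_lines)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorIndex:
    """Hypothetical in-memory index; the real agent stores vectors in ChromaDB."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, str, Counter]] = []  # (path, chunk, vector)

    def index_file(self, path: str, text: str) -> None:
        # Indexing step: chunk the file, embed each chunk, store it.
        for chunk in chunk_text(text):
            self.entries.append((path, chunk, embed(chunk)))

    def query(self, question: str, top_k: int = 1) -> list[tuple[str, str]]:
        # Retrieval step: embed the query and rank stored chunks by similarity.
        q = embed(question)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[2]), reverse=True)
        return [(path, chunk) for path, chunk, _ in ranked[:top_k]]


index = ToyVectorIndex()
index.index_file("auth.py", "def login(user, password):\n    return check_password(user, password)\n")
index.index_file("billing.py", "def charge(card, amount):\n    return gateway.charge(card, amount)\n")
top_hits = index.query("how does login check the password")
print(top_hits[0][0])
```

In this sketch the query shares the words `login` and `password` with the chunk from `auth.py`, so that chunk ranks first. A real embedding model would also match on meaning rather than only on shared words.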
Managing token usage is essential for efficient and cost-effective interactions with Large Language Models. The DevOps Agent implements several strategies to handle this:
- Dynamic Token Limit Determination: The agent attempts to dynamically fetch the actual token limit for the configured LLM model using the LLM client's capabilities. If this fails, it falls back to predefined limits based on common model types (e.g., Gemini Flash, Gemini Pro).
- Token Usage Transparency: For each model response, the agent displays detailed token usage statistics (prompt, candidate, and total tokens) using the `ui_utils.display_model_usage` function, giving users insight into the cost of interactions.
- Context Token Counting: The `context_management/context_manager.py` component is designed to accurately count tokens for the conversation history and injected context. It includes logic to use native LLM client counting methods or the `tiktoken` library if available.
- Context Optimization: The context management logic aims to optimize the information sent to the LLM to stay within token limits while retaining relevant conversation history and code snippets. This now primarily leverages the `context.state` mechanism provided by the ADK for storing and retrieving this information.
The goal is to ensure token usage is transparent, context is managed effectively to avoid exceeding limits, and the most accurate available counting methods are utilized.
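As a rough illustration of the fallback behavior described above, the sketch below resolves a token limit and counts tokens with progressively less accurate methods. The function names, the `FALLBACK_LIMITS` table, and its numbers are hypothetical placeholders, not the agent's actual values or the real limits of any model. The only real library call is `tiktoken.get_encoding("cl100k_base")`, which is attempted only if `tiktoken` is installed.

```python
from typing import Callable, Optional

# Placeholder numbers for illustration only; NOT the real limits of any model.
FALLBACK_LIMITS = {"flash": 100_000, "pro": 200_000}
DEFAULT_LIMIT = 50_000


def determine_token_limit(model_name: str,
                          fetch_limit: Optional[Callable[[str], int]] = None) -> int:
    """Prefer a dynamically fetched limit; otherwise fall back by model family."""
    if fetch_limit is not None:
        try:
            return fetch_limit(model_name)
        except Exception:
            pass  # dynamic lookup failed, use the predefined table below
    for family, limit in FALLBACK_LIMITS.items():
        if family in model_name.lower():
            return limit
    return DEFAULT_LIMIT


def estimate_tokens(text: str) -> int:
    """Last-resort estimate: roughly one token per four characters."""
    return len(text) // 4


def count_tokens(text: str, native_counter: Optional[Callable[[str], int]] = None) -> int:
    """Try the native counter, then tiktoken, then the character estimate."""
    if native_counter is not None:
        try:
            return native_counter(text)
        except Exception:
            pass
    try:
        import tiktoken  # optional dependency
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    except Exception:
        return estimate_tokens(text)


def failing_fetch(model_name: str) -> int:
    """Simulates an LLM client that cannot report its limit."""
    raise RuntimeError("limit lookup unavailable")


# Hypothetical model names, used only to exercise the fallback path.
limit_dynamic = determine_token_limit("example-model", lambda name: 123_456)
limit_fallback = determine_token_limit("example-flash-model", failing_fetch)
print(limit_dynamic, limit_fallback, estimate_tokens("x" * 40))
```

Keeping the three counting methods behind one function means callers never need to know which method was available at runtime.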
graph TD
subgraph "Token Limit Determination"
Agent[DevOps Agent] --> TLD[Determine Token Limit]
TLD --> ClientAPI{LLM Client API Available?}
ClientAPI -- Yes --> DynamicLimit[Get Dynamic Limit]
ClientAPI -- No --> FallbackLimit[Use Model-Specific Fallback]
DynamicLimit --> TokenLimit[Actual Token Limit]
FallbackLimit --> TokenLimit
end
subgraph "Context Assembly & Optimization"
TokenLimit --> CM[Context Manager]
CM --> StateSync[Sync with ADK State]
StateSync --> Prioritize[Smart Prioritization]
Prioritize --> Correlate[Cross-Turn Correlation]
Correlate --> Summarize[Intelligent Summarization]
Summarize --> Expand[Dynamic Context Expansion]
Expand --> OptContext[Optimized Context]
end
subgraph "Token Counting & Validation"
OptContext --> CountTokens[Count Context Tokens]
CountTokens --> CountMethod{Counting Method}
CountMethod -- LLM Client --> AccurateCount[Native API Count]
CountMethod -- tiktoken --> TiktokenCount[tiktoken Count]
CountMethod -- Fallback --> EstimateCount[Character/4 Estimate]
AccurateCount --> TotalTokens[Total Token Count]
TiktokenCount --> TotalTokens
EstimateCount --> TotalTokens
end
subgraph "Context Optimization Loop"
TotalTokens --> WithinLimit{Within Token Limit?}
WithinLimit -- No --> Compress[Further Compression]
Compress --> OptContext
WithinLimit -- Yes --> LLMRequest[Inject into LLM Request]
end
subgraph "Response Processing"
LLMRequest --> LLM[Gemini LLM]
LLM --> Response[LLM Response]
Response --> Usage{Has Usage Metadata?}
Usage -- Yes --> ExtractUsage[Extract Token Usage]
Usage -- No --> NoUsage[No Usage Data]
ExtractUsage --> DisplayUsage[Display to User]
NoUsage --> DisplayUsage
DisplayUsage --> Console[Rich Console Output]
end
subgraph "Performance Analytics"
TotalTokens --> Utilization[Calculate Utilization %]
TokenLimit --> Utilization
Utilization --> LowUtil{< 20% Utilization?}
LowUtil -- Yes --> Warning[Low Utilization Warning]
LowUtil -- No --> OptimalUtil[Optimal Utilization]
Warning --> Analytics[Log Analytics]
OptimalUtil --> Analytics
end
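The optimization loop and the utilization check from the diagram above could look roughly like the sketch below. It is a simplification under stated assumptions: "compression" is reduced to dropping the lowest-priority item, the `(priority, text)` tuples and function names are hypothetical, and tokens are counted with the character/4 estimate. Only the 20% low-utilization threshold is taken from the diagram.

```python
from typing import Callable


def fit_to_limit(items: list[tuple[int, str]],
                 token_limit: int,
                 count: Callable[[str], int] = lambda s: len(s) // 4) -> tuple[list[str], int]:
    """Drop the lowest-priority items until the context fits within token_limit.

    items is a list of (priority, text); a higher priority is kept longer.
    Returns the kept texts and their total token count.
    """
    kept = sorted(items, key=lambda it: it[0], reverse=True)
    while kept and sum(count(text) for _, text in kept) > token_limit:
        kept.pop()  # the last element has the lowest priority
    total = sum(count(text) for _, text in kept)
    return [text for _, text in kept], total


def utilization_report(total_tokens: int, token_limit: int,
                       low_threshold: float = 0.20) -> tuple[float, str]:
    """Classify utilization as 'low' (below the threshold) or 'optimal'."""
    utilization = total_tokens / token_limit
    status = "low" if utilization < low_threshold else "optimal"
    return utilization, status


context_items = [(3, "a" * 40), (1, "b" * 40), (2, "c" * 40)]  # 10 tokens each
kept_texts, total = fit_to_limit(context_items, token_limit=20)
print(len(kept_texts), total, utilization_report(total, 20))
```

A real implementation would likely summarize or truncate items before discarding them outright. The loop structure (count, compare against the limit, compress, repeat) stays the same.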
The DevOps Agent includes an interactive planning phase to improve collaboration and the quality of output for complex tasks. This workflow is triggered for requests deemed sufficiently complex or when the user explicitly asks for a plan.
Workflow Steps:
- Task Assessment: Upon receiving a user request, the agent assesses its complexity to determine if a planning phase is beneficial.
- Plan Proposal: If planning is needed, the agent uses the LLM to generate a detailed, multi-step plan outlining the proposed approach to fulfill the request.
- User Review: The agent presents the generated plan to the user.
- Approval or Refinement: The user can review the plan and either approve it to proceed or provide feedback for refinement. The agent can iterate on the plan based on user feedback.
- Implementation: Once the plan is approved by the user, the agent proceeds with executing the steps outlined in the plan, leveraging its tools and context management.
This interactive approach ensures that the agent and the user are aligned on the strategy before significant work is performed, reducing rework and improving the final outcome.
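The propose, review, and refine cycle can be sketched as a small loop. Everything here is hypothetical: the complexity heuristic in `needs_planning`, the approval keywords, and the `propose_plan` and `get_feedback` callables stand in for the LLM call and the user prompt. The source does not specify how the agent actually makes these decisions.

```python
from typing import Callable, Optional


def needs_planning(request: str, min_words: int = 12) -> bool:
    """Hypothetical heuristic: plan when asked explicitly or when the request is long."""
    words = request.lower().split()
    return "plan" in words or len(words) >= min_words


def run_planning_phase(request: str,
                       propose_plan: Callable[[str, str], str],
                       get_feedback: Callable[[str], str],
                       max_rounds: int = 3) -> Optional[str]:
    """Iterate on a plan until the user approves it; return None if never approved."""
    feedback = ""
    for _ in range(max_rounds):
        plan = propose_plan(request, feedback)   # stand-in for an LLM call
        reply = get_feedback(plan)               # stand-in for user review
        if reply.strip().lower() in {"approve", "yes", "y"}:
            return plan
        feedback = reply                         # refine using the user's comments
    return None


def stub_propose(request: str, feedback: str) -> str:
    """Fake planner that appends the latest feedback to the plan text."""
    suffix = f" (revised: {feedback})" if feedback else ""
    return f"plan for {request}{suffix}"


scripted_replies = iter(["add a rollback step", "approve"])
approved_plan = run_planning_phase("migrate db", stub_propose,
                                   lambda plan: next(scripted_replies))
print(approved_plan)
```

Capping the number of refinement rounds is a design choice in this sketch, so that an unapproved plan ends the phase cleanly instead of looping indefinitely.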
graph TD
User --> Agent;
Agent --> Planning{Planning Needed?};
Planning -- Yes --> ProposePlan[Propose Plan];
ProposePlan --> User[Review Plan];
User --> Agent[Approve Plan];
Agent -- Plan Approved --> ContextMgt[Context Management];
Planning -- No --> ContextMgt;
ContextMgt --> LLM[LLM];
LLM -- Tool Calls --> Agent[Execute Tools];
Agent[Execute Tools] --> Tools[Tools];
Tools --> Agent[Process Tool Output];
Agent[Process Tool Output] --> ContextMgt;
LLM -- Response --> User;
subgraph Agent Components
Planning
ContextMgt
Tools
end
Explanation:
- User Input: The user interacts with the agent, typically via the ADK CLI (`adk run`) or an API endpoint if deployed.
- Agent Decision: The agent determines if a planning step is needed based on the complexity of the task.
- Propose Plan: If planning is needed, the agent generates a detailed plan.
- Review Plan: The user reviews the proposed plan.
- Approve Plan: The user approves the plan.
- Context Management: The agent prepares the context for the LLM, including relevant code snippets and tool outputs.
- LLM: The LLM processes the input, "thinks" about the request, and decides if a tool needs to be used. It might select one or more tools from the agent's toolset.
- Tool Invocation: If a tool is selected, the `LlmAgent` invokes the corresponding Python function (e.g., `read_file_content`, `execute_vetted_shell_command`).
- Tool Output: The tool executes and returns its output to the `LlmAgent`.
- Process Tool Output: The agent processes the tool output and integrates it with the context.
- LLM Response Generation: The agent sends the processed output back to the LLM, which then formulates the final response to the user.
- User Output: The ADK framework delivers the agent's response to the user.
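The tool-calling portion of the steps above can be sketched as a loop that alternates between the LLM and the tools until the LLM returns a final response. This is not the ADK's `LlmAgent` implementation. The action dictionaries, `run_turn`, and `scripted_llm` are hypothetical, and the `read_file_content` entry is a stub that only borrows the name of the tool mentioned above.

```python
from typing import Any, Callable

# Stub tool registry; the lambda only imitates a file-reading tool.
TOOLS: dict[str, Callable[..., str]] = {
    "read_file_content": lambda path: f"<contents of {path}>",
}


def run_turn(user_input: str,
             llm_step: Callable[[list[tuple[str, str]]], dict[str, Any]],
             tools: dict[str, Callable[..., str]],
             max_steps: int = 5) -> str:
    """Loop: ask the LLM, run any requested tool, feed the output back."""
    context: list[tuple[str, str]] = [("user", user_input)]
    for _ in range(max_steps):
        action = llm_step(context)
        if action["type"] == "response":
            return action["text"]                      # final answer for the user
        output = tools[action["tool"]](**action["args"])
        context.append(("tool", output))               # integrate tool output
    return "Stopped: too many tool calls."


def scripted_llm(context: list[tuple[str, str]]) -> dict[str, Any]:
    """Fake LLM: requests one file read, then answers from the tool output."""
    role, content = context[-1]
    if role == "user":
        return {"type": "tool_call", "tool": "read_file_content",
                "args": {"path": "README.md"}}
    return {"type": "response", "text": f"Summary based on {content}"}


answer = run_turn("Summarize the README", scripted_llm, TOOLS)
print(answer)
```

The `max_steps` guard is an assumption of this sketch. It keeps a misbehaving model from requesting tools indefinitely.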
sequenceDiagram
participant User
participant ADK
participant Agent
participant Planning
participant Context
participant LLM
participant Tools
User->>ADK: Query
ADK->>Agent: Query
Agent->>Planning: Assess Need
alt Planning Needed
Planning->>LLM: Request Plan
LLM-->>Planning: Proposed Plan
Planning->>Agent: Forward Plan
Agent->>User: Present Plan
User->>Agent: Approve
Agent->>Planning: Approval Status
end
Agent->>Context: Prepare Context
Context-->>Agent: Optimized Context
Agent->>LLM: Process Request
LLM-->>Agent: Thought Selection
Agent->>Tools: Invoke Tool
Tools-->>Agent: Tool Output
Agent->>Context: Update Context
Agent->>LLM: Generate Response
LLM-->>Agent: Final Response
Agent->>ADK: Send Output
ADK-->>User: Display Output
Telemetry: Observability is disabled by default to keep output clean, so no configuration is needed.
# Clean output by default - just run the agent
echo "hi" | uv run agent run agents.devops
# Test with the provided script
./scripts/test_clean_output.sh
If you need observability, enable it explicitly:
# Enable full observability when needed
export DEVOPS_AGENT_OBSERVABILITY_ENABLE=true
echo "hi" | uv run agent run agents.devops
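For readers wondering how such an opt-in flag might be read on the Python side, here is a minimal sketch. The `observability_enabled` helper and the set of accepted truthy values are assumptions. The source only shows the variable `DEVOPS_AGENT_OBSERVABILITY_ENABLE` being set to `true`, and the agent's actual parsing may differ.

```python
import os
from typing import Mapping, Optional


def observability_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the flag is explicitly set to a truthy value.

    The accepted values below are an assumption for illustration.
    """
    env = os.environ if env is None else env
    value = env.get("DEVOPS_AGENT_OBSERVABILITY_ENABLE", "")
    return value.strip().lower() in {"1", "true", "yes", "on"}


enabled_by_default = observability_enabled({})
enabled_when_set = observability_enabled({"DEVOPS_AGENT_OBSERVABILITY_ENABLE": "true"})
print(enabled_by_default, enabled_when_set)
```

Treating an unset or unrecognized value as disabled matches the "off by default" behavior described above.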
Advanced Configuration: For more observability options, see Observability Configuration