An AI-powered plugin for Caldera that orchestrates long-running LLM workflows to automatically create adversary emulation abilities and plan operations. Optionally enriches workflows with Retrieval-Augmented Generation (RAG) using Cyber Threat Intelligence (CTI) from STIX JSON files. All executions are tracked via MLflow for full observability into LLM reasoning and tool usage.
- LLM Ability Factory: Generate custom Caldera abilities from natural language descriptions
- LLM Operation Planner: Create and execute complete adversary emulation operations
- CTI Integration: Enhance abilities with real-world threat intelligence from STIX bundles
- MLflow Tracking: Full observability of LLM reasoning, tool calls, and execution trajectory
- Flexible Model Support: Works with most LLM providers (OpenAI, Anthropic, etc.)
- Run History: Browse and search all historical executions with full details
From the Caldera root directory:
```bash
python3 server.py --insecure
```

The MCP plugin automatically starts MLflow on port 5000 during Caldera initialization.
Navigate to the Caldera web interface and select the MCP plugin from the sidebar.
In the Global Model Configuration panel:
- Enter your API key (required)
- Select your model (default: gpt-4o)
- Adjust temperature and max_tokens as needed
- Set max tool calls for ReAct iterations (default: 5)
LLM Ability Factory: Create specific abilities
- Example: "Create a Windows ability that dumps credentials using PowerShell"
LLM Operation Planner: Plan and execute operations
- Example: "Execute a ransomware simulation on Windows agents"
- Enter your prompt in natural language
- (Optional) Select STIX CTI files to enhance with threat intelligence
- Click Execute
- Watch real-time progress via MLflow stages and reasoning
- View results and created abilities/operations
Frontend (Vue.js)
- `mcp.vue`: Main landing page with navigation
- `local_mcp_ability_factory.vue`: Ability creation interface
- `public_mcp_ability_factory.vue`: Public ability interface
- `mcp_history.vue`: Historical run browser
- `mcp_extension_guide.vue`: Developer extension guide
Backend (Python)
- `mcp_api.py`: aiohttp API routes
- `mcp_svc.py`: Service orchestration layer
- `mcp_factory_client.py`: Ability factory DSPy client
- `mcp_planner_client.py`: Operation planner DSPy client
- `mcp_server.py`: MCP tool server exposing the Caldera API
- `factory.py`: Command generation DSPy module
- `rag.py`: STIX CTI retrieval service
Integration
- `hook.py`: Plugin initialization and MLflow startup
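For orientation, the sketch below shows the standard Caldera plugin hook contract that `hook.py` fulfills. Caldera reads the module-level fields and awaits `enable()` during server startup; the field values, route, and MLflow launch command here are illustrative assumptions, not the plugin's exact code.

```python
# Illustrative sketch of the Caldera plugin hook contract -- assumed shape,
# not the plugin's actual hook.py.
import subprocess
from aiohttp import web

name = 'MCP'
description = 'LLM-driven ability factory and operation planner'
address = '/plugin/mcp/gui'

async def _ping(request):
    # Placeholder route handler (illustrative)
    return web.json_response({'ok': True})

async def enable(services):
    app = services.get('app_svc').application
    # Register the plugin's aiohttp routes on the shared application
    app.router.add_route('GET', '/plugin/mcp/ping', _ping)
    # Launch the MLflow tracking server on port 5000 (assumed command)
    subprocess.Popen(['mlflow', 'server', '--host', '127.0.0.1', '--port', '5000'])
```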
Edit `conf/default.yml` to set the default LLM configuration:

```yaml
llm:
  model: gpt-4o
  api_key: YOUR_API_KEY
  offline: true
  use_mock: false
factory:
  model: gpt-4o
  api_key: YOUR_API_KEY
  temperature: 0.4
```

Note: Frontend model configuration overrides these defaults per run.
When using CTI enhancement:
- Embedding Model: Default `openai/text-embedding-3-small`
- Top-K Retrieval: Default 5 objects (configurable via UI)
- STIX File Location: Upload files via the UI; stored in the `data/` directory
- Navigate to MCP → Ability Factory or Planner
- In the RAG Configuration section, click Upload STIX File
- Select your STIX JSON bundle(s)
- Files are stored in `plugins/mcp/data/`
- In the RAG Configuration panel, select which STIX files to use
- (Optional) Adjust embedding model and top-K retrieval
- Execute your task normally
The LLM will receive relevant CTI context based on your prompt, including:
- Attack patterns and techniques
- Malware and tool descriptions
- Threat actor TTPs
- Campaign information
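For reference, a minimal STIX 2.1 bundle containing a single attack-pattern object looks like this (IDs, timestamps, and the description are placeholders):

```json
{
  "type": "bundle",
  "id": "bundle--3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "objects": [
    {
      "type": "attack-pattern",
      "spec_version": "2.1",
      "id": "attack-pattern--7e33a43e-e34b-40ec-89da-36c9bb2cacd5",
      "created": "2024-01-01T00:00:00.000Z",
      "modified": "2024-01-01T00:00:00.000Z",
      "name": "OS Credential Dumping",
      "description": "Adversaries may attempt to dump credentials from the operating system.",
      "external_references": [
        { "source_name": "mitre-attack", "external_id": "T1003" }
      ]
    }
  ]
}
```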
```
User Prompt → RAG Search (Semantic) → Top-K CTI Objects Retrieved
        ↓
Detailed Context for Top 3 Objects
        ↓
Formatted CTI Context String
        ↓
LLM Receives Task + CTI Context
        ↓
Creates CTI-Informed Abilities/Operations
```
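The sketch below illustrates this embed-and-rank flow in Python, under the assumption that `rag.py` works along these lines; the function names and file handling are illustrative, not the plugin's actual code.

```python
# Minimal sketch of top-K semantic retrieval over STIX objects (illustrative).
import json
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(texts: list[str]) -> np.ndarray:
    # "text-embedding-3-small" is the OpenAI SDK name for the plugin's
    # default embedding model (openai/text-embedding-3-small)
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

def top_k_cti(prompt: str, stix_path: str, k: int = 5) -> list[dict]:
    with open(stix_path) as f:
        objects = json.load(f)["objects"]
    # Embed each object's name + description alongside the user prompt
    corpus = [f"{o.get('name', '')}: {o.get('description', '')}" for o in objects]
    vectors = embed(corpus + [prompt])
    doc_vecs, query = vectors[:-1], vectors[-1]
    # Rank by cosine similarity and keep the top-K objects
    scores = doc_vecs @ query / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query)
    )
    return [objects[i] for i in np.argsort(scores)[::-1][:k]]
```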
Open your browser to: http://localhost:5000
Navigate to Experiments → Traces to view:
- Run status and stages
- LLM chain of thought (`thought_0`, `thought_1`, etc.)
- Tool calls and arguments (`tool_name_N`, `tool_args_N`)
- RAG retrieval steps (when CTI is used)
- Final results and reasoning
Status Tags:
- `status`: running, complete, failed
- `stage`: Current execution phase
- `reasoning`: LLM's final reasoning summary
- `process_result`: Summary of what was created
RAG Tags (when CTI is enabled):
- `rag_retrieval_step_N`: RAG retrieval process
- `rag_retrieved_object_N`: Names of CTI objects retrieved
- `cti_context_preview`: First 1000 characters of CTI context sent to the LLM
- `cti_context_length`: Total CTI context size
LLM Trajectory:
- `thought_N`: LLM reasoning at each step
- `observation_N`: Tool execution results
- `tool_name_N`: Tool that was called
- `tool_args_N`: Arguments passed to the tool
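These tags can also be pulled programmatically with the MLflow client. The sketch below assumes the plugin logs runs to an experiment named `mcp` (the experiment name is an assumption):

```python
# Sketch: fetch the trajectory tags of the most recent run.
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
runs = mlflow.search_runs(experiment_names=["mcp"], max_results=1,
                          order_by=["start_time DESC"])
tags = mlflow.get_run(runs.iloc[0]["run_id"]).data.tags
for key in sorted(k for k in tags
                  if k.startswith(("thought_", "tool_name_", "tool_args_"))):
    print(f"{key}: {tags[key][:80]}")
```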
See the Extend & Customize guide in the UI for detailed instructions on:
- Creating custom DSPy clients
- Adding new MCP tools
- Building custom workflows
- Integrating with the service layer
Example use cases:
- Threat Hunter: Analyze adversary profiles and generate detection rules
- Operation Optimizer: Review completed operations and suggest improvements
- Campaign Builder: Create multi-stage campaigns from threat actor profiles
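For example, the Threat Hunter use case above could be built as a DSPy ReAct module with a Caldera-backed tool. This is a minimal sketch assuming the current DSPy API; the signature, tool, and default Caldera key are illustrative, not the plugin's actual code.

```python
# Illustrative custom DSPy client following the plugin's ReAct pattern.
import dspy
import requests

dspy.configure(lm=dspy.LM("openai/gpt-4o", api_key="YOUR_API_KEY"))

class ThreatHunt(dspy.Signature):
    """Analyze an adversary profile and propose detection rules."""
    profile: str = dspy.InputField(desc="Adversary profile or CTI summary")
    detections: str = dspy.OutputField(desc="Proposed detection rules")

def list_abilities() -> str:
    """Fetch existing abilities from the Caldera API (KEY is the default red key)."""
    return requests.get("http://localhost:8888/api/v2/abilities",
                        headers={"KEY": "ADMIN123"}, timeout=10).text

# max_iters mirrors the "max tool calls" setting in the UI (default: 5)
hunter = dspy.ReAct(ThreatHunt, tools=[list_abilities], max_iters=5)
result = hunter(profile="FIN7-style intrusion focused on PowerShell tradecraft")
print(result.detections)
```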
```bash
# Test factory client (requires running Caldera)
cd app
python mcp_factory_client.py

# Test planner client
python mcp_planner_client.py

# Test MCP server tools
python mcp_server.py
```

Plugin directory layout:

```
plugins/mcp/
├── app/ # Python backend
│ ├── mcp_api.py # API routes
│ ├── mcp_svc.py # Service layer
│ ├── mcp_factory_client.py
│ ├── mcp_planner_client.py
│ ├── mcp_server.py # MCP tool server
│ ├── factory.py # Command generation
│ └── rag.py # CTI retrieval
├── gui/views/ # Vue frontend
│ ├── mcp.vue
│ ├── local_mcp_ability_factory.vue
│ ├── public_mcp_ability_factory.vue
│ ├── mcp_history.vue
│ └── mcp_extension_guide.vue
├── conf/default.yml # Default configuration
├── data/ # STIX JSON files
├── hook.py # Plugin initialization
└── README.md
```
- dspy: LLM orchestration framework with ReAct pattern
- mcp: Model Context Protocol SDK for tool servers
- mlflow: Experiment tracking and tracing
- aiohttp: Async web framework
- psutil: Process management for MLflow server
- requests: HTTP client for Caldera API
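Assuming these correspond to the PyPI packages of the same names, they can be installed into the Caldera virtual environment with:

```bash
pip install dspy mcp mlflow aiohttp psutil requests
```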
Error: "API key is required but not provided"
Solution: Enter your API key in the Global Model Configuration panel before executing.
Error: Cannot access http://localhost:5000
Solution: Check Caldera logs for MLflow startup messages. The plugin automatically kills processes on port 5000 and starts MLflow during initialization.
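If the port stays occupied, the snippet below shows one way to find the offending process with psutil, similar in spirit to the plugin's startup cleanup (the plugin's exact logic is an assumption):

```python
# Find any process listening on port 5000 (requires sufficient privileges).
import psutil

for conn in psutil.net_connections(kind="inet"):
    if (conn.laddr and conn.laddr.port == 5000
            and conn.status == psutil.CONN_LISTEN and conn.pid):
        proc = psutil.Process(conn.pid)
        print(f"Port 5000 is held by PID {conn.pid} ({proc.name()})")
        # proc.terminate()  # uncomment to free the port manually
```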
Error: "RAG service not initialized" or embedding errors
Solution:
- Verify STIX files are valid JSON
- Ensure API key has access to embedding models
- Check MLflow logs for detailed error messages
Error: Tools fail to initialize or execute
Solution:
- Verify the Caldera API is accessible at `http://localhost:8888/api/v2/`
- Check MCP server subprocess logs for environment issues
- Ensure PYTHONPATH includes venv packages
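A quick connectivity check can rule out network issues; the snippet below assumes the standard `/api/v2/health` endpoint and uses Caldera's default red API key, which you should replace with your own:

```python
# Sanity-check the Caldera v2 API from the plugin's environment.
import requests

resp = requests.get("http://localhost:8888/api/v2/health",
                    headers={"KEY": "ADMIN123"}, timeout=5)
print(resp.status_code, resp.text[:200])
```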
Check MLflow UI for:
- Full trajectory of tool calls
- Error messages in the `error` param
- Traceback in the `traceback` param
- Stage where the failure occurred
For bugs and feature requests:
- Check MLflow traces for detailed execution information
- Review Caldera logs with the `[MCP]` prefix
- Consult the in-app Extension Guide for development questions
Part of the Caldera project. See main Caldera repository for license information.