ClawCode includes an independent research subcommand group for multi-phase investigation workflows that reuse the main LLM runtime, session store, and plugin hook system.
- CLI / TUI (
clawcode research …) loads settings and builds an app context viacreate_app. ResearchOrchestratorselects a workflow (deep,lit,audit,compare,replicate), registers tools, runs phase prompts throughLeadResearchAgent, and writes a combined Markdown summary under the output directory.- Middleware wraps each turn: memory recall, titles, loop detection, summarization, todo tracking (checkboxes synced to
.plans/<slug>-todos.md), and memory pipeline writes. - Tools (
research_web_search,research_paper_search,research_fetch_url,research_sandbox_exec) call shared web helpers (Firecrawl / Tavily / Parallel when configured) with DuckDuckGo Instant Answer fallback for search, arXiv (and optional Semantic Scholar) for papers, and HTTP + HTML strip fallback for URL fetch. - Sub-agents use
SubAgentExecutorwith an injected async runner that creates a session and reusesresearch_turn. - External backends: set
research.backend=externalandresearch.external_adapterto a name registered onAdapterRegistry. Third-party packages can expose adapters via theclawcode.research_adaptersentry-point group (see below).
CLI → ResearchApp → ResearchOrchestrator → LeadResearchAgent → Agent runtime
↓ ↓
ToolRegistry MiddlewareChain
SubAgentExecutor (memory, todos, …)
On the native path, ResearchOrchestrator builds a ToolRegistry (research-only handlers) and converts each entry to a BaseTool via clawcode/research/tools/bridge.py. LeadResearchAgent merges those adapters with the normal coder toolset from build_coder_runtime(..., style="research_mode"), so the model can call:
| Tool | Role |
|---|---|
research_web_search |
Web search (project web.* backends + DuckDuckGo fallback) |
research_paper_search |
arXiv / Semantic Scholar |
research_fetch_url |
Fetch page text |
research_sandbox_exec |
Shell in the research sandbox workspace |
Plugin hook ResearchToolRegistered still fires for registry metadata; the tools above are what the ReAct loop actually exposes.
Requires a configured provider (API key in .clawcode.json / env). Run a short workflow and confirm the transcript or logs show tool calls, e.g.:
clawcode research start "Rust async traits" --workflow audit -o ./outputs/research-smokeInspect the generated *-run-summary.md under -o for citations or tool-derived content.
clawcode research start "<topic>" --workflow deep|deepresearch|lit|audit|compare|replicate|peerreviewclawcode research audit <url-or-path>— same asstartwith workflowauditclawcode research interactive— minimal Textual UIclawcode research list-workflows— print built-in workflow ids (presets)clawcode research list-prompts— list Markdown template workflows (deepresearch,peerreview, …) with phases and suggested-wvaluesclawcode research start "…" --dry-run/research audit "…" --dry-run— load.clawcode.jsonand print research settings without calling the LLM
| Workflow | Typical use |
|---|---|
deep |
Broad multi-phase investigation (default, code-defined phases). |
deepresearch |
Template-driven phases from research/prompts/deepresearch.md (plan → research → verify → deliver). |
peerreview |
Template-driven peer-review style phases (peerreview.md: review → verify → deliver). |
lit |
Literature survey style phases. |
audit |
Inspect a URL, repo, or artifact. |
compare |
Side-by-side comparison of options or sources. |
replicate |
Steps toward reproducing a method or result. |
Nested objects in .clawcode.json (project root or clawcode/ package dir, same discovery order as the main app). A minimal template ships at the repo root .clawcode.json: fill in empty string fields with your keys.
{
"web": {
"backend": "firecrawl",
"firecrawl_api_key": "YOUR_FIRECRAWL_KEY",
"firecrawl_api_url": "",
"tavily_api_key": "YOUR_TAVILY_KEY",
"parallel_api_key": "YOUR_PARALLEL_KEY"
},
"research": {
"enabled": true,
"backend": "native",
"external_adapter": "",
"default_model": "",
"s2_api_key": "YOUR_SEMANTIC_SCHOLAR_KEY",
"workflows": {},
"subagents": {
"max_concurrent": 3,
"timeout_seconds": 300,
"builtin_agents": ["researcher", "reviewer", "writer", "verifier"]
},
"sandbox": {
"type": "local",
"docker_image": "clawcode-research-sandbox:latest",
"work_dir": ""
},
"memory": {
"enabled": true,
"storage_subdir": "research_memory"
},
"tools": {
"web_search_enabled": true,
"paper_search_enabled": true,
"sandbox_execute_enabled": true
}
}
}Field reference:
research.enabled— master toggleresearch.backend—native|externalresearch.external_adapter— registry name when backend isexternalresearch.default_model— optional model overrideresearch.s2_api_key— Semantic Scholar API key (optional; envS2_API_KEYalso supported)research.sandbox—local/docker/k8sworkspace forresearch_sandbox_execresearch.memory— episodic storage under the data directoryresearch.subagents— concurrency and timeouts forSubAgentExecutorresearch.tools— toggles for built-in research tools
Where to get API keys (optional)
| Service | Notes |
|---|---|
| Firecrawl | Set web.backend to firecrawl, fill firecrawl_api_key (and optional firecrawl_api_url for self-hosted). |
| Tavily | Set web.backend to tavily, fill tavily_api_key. |
| Parallel | Set web.backend to parallel, fill parallel_api_key. |
| Semantic Scholar API | Optional research.s2_api_key or env S2_API_KEY for higher rate limits on source=semantic_scholar. |
Web search / extract reuse the main stack:
- Set
web.backendtofirecrawl,tavily, orparalleland provide the corresponding keys in.clawcode.json(preferred) or environment variables used byweb_utils(see main ClawCode docs). - If those fail or return no results, research web search falls back to the DuckDuckGo Instant Answer JSON API (no key).
Paper search
- Default arXiv (
source=arxiv): no API key. - Semantic Scholar (
source=semantic_scholar): optional key as above.
See also config/research_config.yaml if present in your tree for additional field notes.
Pytest can drive ResearchOrchestrator end-to-end with your real provider (no mocks). Tests are skipped by default so CI stays offline.
Requirements
- Run from the
clawcodeproject directory (next topyproject.toml) soSettingsloads that directory’s.clawcode.json(discovery uses the process current working directory). - At least one
providers.*entry must be enabled with a non-emptyapi_key.
Commands (PowerShell)
cd path\to\clawcode
$env:CLAWCODE_RESEARCH_LIVE_TEST = "1"
py -3 -m pytest tests/test_research_live_llm.py -v --no-cov -m live_llm- First test:
auditworkflow (single phase, one realresearch_turn/Agent.run). - Optional second test: set also
$env:CLAWCODE_RESEARCH_LIVE_DEEP = "1"to rundeepresearchwith only the first template phase (plan) to limit time and cost.
These calls use the same stack as clawcode research start … (middleware, tools, session store). Expect minutes per test depending on model and latency.
Packages can register named adapters without editing ClawCode:
[project.entry-points."clawcode.research_adapters"]
my_backend = "my_package.adapters:MyResearchAdapter"Classes must subclass ExternalResearchAdapter. At orchestrator startup, ensure_entry_point_adapters_registered() loads entry points into AdapterRegistry.
Optional hook events (declare in plugin hooks config):
ResearchSessionStart/ResearchSessionEndResearchPhaseStart/ResearchPlanCreatedResearchVerificationStart/ResearchReportGeneratedResearchToolRegisteredResearchSubAgentStart/ResearchSubAgentComplete
Bundled Markdown skills live under skills/research/ for documentation and
onboarding; enable them via the normal ClawCode skills path if desired.
| Symptom | What to check |
|---|---|
unknown external backend |
Adapter name must match AdapterRegistry / entry point name; run with --dry-run to confirm research.external_adapter. |
| Web search always empty | Configure web.backend + API keys; check logs for Firecrawl/Tavily errors — DuckDuckGo fallback only returns Instant Answer style results. |
research_fetch_url returns stripped HTML |
Expected when Firecrawl/Tavily extract is unavailable; content comes from simple HTTP + tag stripping. |
| arXiv timeouts / blocks | Corporate networks may block export.arxiv.org; retry or use VPN. |
| Sub-agent failures | Runner is injected in ResearchOrchestrator; ensure session service and model are available (same as main agent). |
| Model never calls research tools | Confirm you are on native backend; provider must support tool calling. Phases like investigate explicitly ask for tools—use a capable model. |
From the repo / project root (with ClawCode installed and provider configured):
# List workflow presets
clawcode research list-workflows
# List Markdown prompt templates (deepresearch / peerreview / …)
clawcode research list-prompts
# Validate config only (no LLM)
clawcode research start "my topic" --dry-run
# Deep research (writes `<slug>-run-summary.md` under default or `-o`)
clawcode research start "quantum error correction" --workflow deep -o ./outputs/my-run
# Template-driven deep research (Markdown phases)
clawcode research start "quantum error correction" --workflow deepresearch -o ./outputs/dr-run
# Template-driven peer review
clawcode research start "Paper title or abstract" --workflow peerreview -o ./outputs/pr-run
# Literature-style run
clawcode research start "LLM agents survey" --workflow lit
# Audit a path or URL
clawcode research audit https://example.com/repo -o ./outputs/audit
# Interactive TUI
clawcode research interactiveOptional: -c /path/to/project sets working directory for settings resolution; -m <model> overrides the research default when wired through agent config.
Workflow modules live under clawcode/research/workflows/. Each exports phase lists (phases_deep, phases_audit, …). To add a preset, extend normalize_workflow / WORKFLOW_CHOICES and implement a phases_* builder consistent with existing patterns.