feat(harness): OpenCode support with schema retry, error preservation, and project_dir routing #220

Open
santoshkumarradha wants to merge 8 commits into main from feat/harness-opencode-combined

santoshkumarradha (Member) commented Mar 4, 2026

Summary

Combines all post-merge harness improvements into a single PR. Supersedes #218 and #219.

What's in this PR

  1. OpenCode provider rewrite — auto-managed opencode serve + --attach pattern to bypass the "Session not found" bug in opencode v1.2.10–v1.2.16 (upstream issue)
  2. Schema validation retry loop — retries up to schema_max_retries times with diagnostic follow-up prompts and Claude session continuity
  3. Provider error preservation — surfaces real provider failures instead of masking them behind generic "Schema validation failed" errors
  4. project_dir routing — new HarnessConfig.project_dir field so coding agents explore the target repo instead of a temp working directory
  5. Output failure diagnosis: diagnose_output_failure() classifies failures into specific categories (file missing, empty, invalid JSON, schema mismatch with field-level diff)
  6. Enhanced follow-up prompts — includes schema file reference and explicit JSON rewrite instructions for the retry loop
  7. Claude session continuity — on schema retry, passes resume=session_id so Claude continues the same conversation
  8. Complex JSON schema test suite — 5 escalating schema complexity levels tested live with both Claude Code and Codex
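
Item 3 (provider error preservation) can be illustrated with a minimal sketch. The `ProviderResult` type and the `final_error_message` helper below are hypothetical names made up for illustration, not the harness's actual API; the only idea taken from this PR is that a real provider failure should take precedence over the downstream "Schema validation failed" symptom.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProviderResult:
    """Hypothetical stand-in for what a provider call reports back."""
    is_error: bool = False
    error_message: Optional[str] = None


def final_error_message(
    provider: ProviderResult, schema_error: Optional[str]
) -> Optional[str]:
    """Prefer the provider's own failure over the schema failure it caused."""
    if provider.is_error and provider.error_message:
        # The provider failed first; a missing or invalid output file is
        # only a symptom, so keep it as secondary context.
        if schema_error:
            return f"{provider.error_message} (also: {schema_error})"
        return provider.error_message
    # No provider failure: the schema error (if any) is the real problem.
    return schema_error
```

With this ordering, a provider-side failure such as an authentication error would be reported first, and the schema failure would only appear as trailing context.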

⚠️ Technical Debt: OpenCode Serve Workaround

~83% of the code in this PR (941 of 1,131 lines) exists because of OpenCode bugs and limitations.

The serve+attach pattern in opencode.py is a temporary workaround for opencode#13851: opencode run returns "Session not found" in headless mode. This workaround:

  • Spawns a long-lived opencode serve process on a random port (singleton, shared across provider instances)
  • Routes all calls through opencode run --attach <url>
  • Manages process lifecycle with async locks, health-check polling, and atexit cleanup
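
The lifecycle described in the bullets above can be sketched roughly as follows. This is not the code in opencode.py: `ServeManager`, `find_free_port`, and the `build_command` callback are hypothetical names, health-check polling is left out, and the sketch assumes Python 3.10+ so that `asyncio.Lock` can be created outside a running event loop.

```python
import asyncio
import atexit
import socket
import subprocess
from typing import Callable, List, Optional


def find_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


class ServeManager:
    """Hypothetical manager that keeps one shared background server alive."""

    def __init__(self, build_command: Callable[[int], List[str]]) -> None:
        self._build_command = build_command  # maps a port to an argv list
        self._lock = asyncio.Lock()
        self._process: Optional[subprocess.Popen] = None
        self._url: Optional[str] = None

    async def ensure_started(self) -> str:
        """Start the server once; concurrent callers get the same URL."""
        async with self._lock:
            if self._process is not None and self._process.poll() is None:
                return self._url  # already running, reuse it
            port = find_free_port()
            self._process = subprocess.Popen(self._build_command(port))
            self._url = f"http://127.0.0.1:{port}"
            atexit.register(self.stop)  # best-effort cleanup at exit
            return self._url

    def stop(self) -> None:
        """Terminate the server if it is still running."""
        if self._process is not None and self._process.poll() is None:
            self._process.terminate()
            try:
                self._process.wait(timeout=5)
            except subprocess.TimeoutExpired:
                self._process.kill()
        self._process = None
```

A caller would pass something like `lambda port: ["opencode", "serve", "--port", str(port)]` as `build_command` (the exact flags are an assumption here) and hand the returned URL to `opencode run --attach <url>`. Note that the port is released before the child binds it, which is one source of the port-conflict edge cases mentioned below.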

This is architecturally concerning because each harness call runs inside a FastAPI reasoner endpoint. Having the provider auto-spawn and manage background server processes within a web server process is fragile — it introduces process lifecycle coupling, port conflicts, and cleanup edge cases that don't belong in a request handler.

Once upstream fixes headless mode, this should be replaced with a simple opencode run call (matching the Codex/Gemini provider pattern). See tracking issue: #221

OpenCode-specific code inventory

| Location | Lines | Purpose |
| --- | --- | --- |
| opencode.py (serve lifecycle) | ~160 | Bug workaround: serve+attach, port allocation, locks, cleanup |
| _runner.py (project_dir routing) | ~17 | OpenCode's Write tool can't reach files outside --dir |
| types.py (project_dir, opencode_server) | ~17 | Config fields for above |
| _factory.py (server_url passthrough) | ~4 | Wiring |
| Total OpenCode-only | ~198 | 17% of changes |

Generic improvements (benefit all providers)

| Location | Lines | Purpose |
| --- | --- | --- |
| _runner.py (schema retry loop) | ~115 | _handle_schema_with_retry() replaces single-shot handler |
| _schema.py (diagnosis + followup) | ~66 | diagnose_output_failure(), enhanced build_followup_prompt() |
| _runner.py (error preservation) | ~17 | Provider errors no longer masked by "Schema validation failed" |
| claude.py (resume + logging) | ~10 | Session continuity for retry, error logging |
| Total generic | ~208 | 18% of changes |

Test code

| Location | Lines | Purpose |
| --- | --- | --- |
| debug_complex_json.py | 743 | 5 escalating schema levels, manual retry test mode |

Files Changed (7 files, +1131/-37)

| File | What Changed |
| --- | --- |
| sdk/python/agentfield/harness/_runner.py | Schema retry loop, project_dir output routing, metrics accumulation, provider error context |
| sdk/python/agentfield/harness/_schema.py | diagnose_output_failure(), enhanced build_followup_prompt() with schema file refs |
| sdk/python/agentfield/harness/providers/opencode.py | Complete rewrite: serve+attach pattern, auto-managed singleton serve process, --dir and system prompt support |
| sdk/python/agentfield/harness/providers/claude.py | resume_session_id support, message counting, error logging |
| sdk/python/agentfield/harness/providers/_factory.py | Pass server_url to OpenCodeProvider |
| sdk/python/agentfield/types.py | New fields: project_dir, opencode_server on HarnessConfig |
| sdk/python/tests/debug_complex_json.py | New: standalone test script with 5 schema levels + manual retry test mode |

Verification

  • LSP diagnostics clean (error severity) for all 7 changed files
  • Tested live with Claude Code and Codex (all schema levels pass)
  • Tested live with OpenCode using the openrouter/moonshotai/kimi-k2.5 model

Closes #217. Supersedes #218 and #219.

…mpts

Add diagnose_output_failure() that classifies validation failures into
specific categories: file missing, empty, invalid JSON, or schema mismatch
with field-level diff. Enhance build_followup_prompt() to include schema
file references and explicit rewrite instructions for the retry loop.
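
The four failure categories named in this commit message can be sketched as a simple classifier. This is an illustrative stand-in, not the real diagnose_output_failure(): the function name, the returned strings, and the use of a plain field-to-type mapping in place of an actual schema are all assumptions.

```python
import json
from pathlib import Path
from typing import Any, Dict


def diagnose_output_failure_sketch(path: Path, required: Dict[str, type]) -> str:
    """Classify why an output file failed validation (illustrative only)."""
    if not path.exists():
        return "file_missing"
    text = path.read_text(encoding="utf-8")
    if not text.strip():
        return "file_empty"
    try:
        data: Any = json.loads(text)
    except json.JSONDecodeError as exc:
        return f"invalid_json: {exc.msg} at line {exc.lineno}"
    if not isinstance(data, dict):
        return "schema_mismatch: top-level value is not an object"
    # Field-level diff: what is missing, mistyped, or unexpected.
    missing = sorted(k for k in required if k not in data)
    wrong = sorted(
        k for k, t in required.items() if k in data and not isinstance(data[k], t)
    )
    extra = sorted(k for k in data if k not in required)
    if missing or wrong:
        return (
            f"schema_mismatch: missing={missing} "
            f"wrong_type={wrong} unexpected={extra}"
        )
    return "ok"
```

A specific category like this is what lets the follow-up prompt tell the agent exactly what to fix, rather than repeating a generic validation error.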

Replace single-shot _handle_schema_output() with _handle_schema_with_retry()
that retries up to schema_max_retries times (default 2) when JSON validation
fails. Each retry:
  - Diagnoses the specific failure via diagnose_output_failure()
  - Sends a follow-up prompt to the agent with error context
  - For Claude: passes resume=session_id to continue the conversation
  - For CLI providers: fresh call with the follow-up prompt
  - Accumulates cost, turns, and messages across all attempts

This activates the previously dead-code build_followup_prompt() from _schema.py
and adds resume_session_id support to the Claude Code provider.
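
The retry behaviour described in this commit message could look roughly like the sketch below. It is not the actual _handle_schema_with_retry(): the `Attempt` and `Outcome` types, the `call` callback signature, and the follow-up prompt wording are hypothetical. Only the overall shape (initial call plus up to schema_max_retries retries, session resume where supported, accumulated cost and turns) comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Attempt:
    """Hypothetical result of one provider call."""
    output: Optional[dict]            # validated output, or None on failure
    diagnosis: str = ""               # why validation failed, if it did
    cost_usd: float = 0.0
    turns: int = 0
    session_id: Optional[str] = None


@dataclass
class Outcome:
    """Hypothetical aggregate over all attempts."""
    output: Optional[dict] = None
    attempts: int = 0
    cost_usd: float = 0.0
    turns: int = 0
    prompts: List[str] = field(default_factory=list)


def run_with_schema_retry(
    call: Callable[[str, Optional[str]], Attempt],
    prompt: str,
    schema_max_retries: int = 2,
    supports_resume: bool = False,
) -> Outcome:
    """Call the provider, retrying with a diagnostic follow-up on failure."""
    outcome = Outcome()
    session_id: Optional[str] = None
    current_prompt = prompt
    for _ in range(schema_max_retries + 1):  # initial call + retries
        # Only providers with session continuity receive a resume id;
        # others get a fresh call carrying the follow-up prompt.
        attempt = call(current_prompt, session_id if supports_resume else None)
        outcome.attempts += 1
        outcome.cost_usd += attempt.cost_usd
        outcome.turns += attempt.turns
        outcome.prompts.append(current_prompt)
        if attempt.output is not None:
            outcome.output = attempt.output
            break
        session_id = attempt.session_id
        current_prompt = (
            f"The previous output was rejected: {attempt.diagnosis}. "
            "Rewrite the JSON output file so that it matches the schema."
        )
    return outcome
```

If every attempt fails, `outcome.output` stays `None` while the accumulated metrics still reflect all calls, so the caller can report the total cost alongside the final error.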

Standalone script exercising the harness with 5 escalating schema levels:
  - simple (2 fields), medium (lists + optionals), complex (13 nested fields),
    deeply_nested (recursive TreeNode), massive (>4K tokens, file-based path)
Tested live with both claude-code and codex providers — all levels pass.
Includes manual retry test mode (--retry-test) to exercise the new retry loop.

- Rewrite opencode.py: auto-managed serve+attach pattern to bypass
  opencode v1.2.10-v1.2.16 'Session not found' bug
- Add project_dir field to HarnessConfig (types.py) so coding agents
  explore the target repo instead of a temp working directory
- Add output file placement inside project_dir (runner) so sandboxed
  Write tool can reach the output JSON
- Pass server_url to OpenCodeProvider via factory
- Clean up debug prints from runner and claude provider
- Verified working with openrouter/moonshotai/kimi-k2.5 model
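
The output placement rule from this commit (write the output JSON inside project_dir so a sandboxed Write tool can reach it) can be sketched as below. The function name, the `.harness_output` directory name, and the file naming scheme are hypothetical; the runner's real layout may differ.

```python
import tempfile
import uuid
from pathlib import Path
from typing import Optional


def choose_output_path(
    project_dir: Optional[str], scratch_dir: Optional[str] = None
) -> Path:
    """Pick where the agent should write its JSON output."""
    if project_dir:
        # Keep the file under the project so a tool restricted to that
        # directory can still write it. The subdirectory name is made up.
        base = Path(project_dir) / ".harness_output"
    else:
        # No project configured: fall back to a scratch/temp location.
        base = Path(scratch_dir or tempfile.gettempdir())
    base.mkdir(parents=True, exist_ok=True)
    return base / f"output_{uuid.uuid4().hex}.json"
```

A unique file name per call avoids collisions when several harness calls target the same project concurrently.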

github-actions bot (Contributor) commented Mar 4, 2026

Performance

| SDK | Memory | Δ | Latency | Δ | Tests | Status |
| --- | --- | --- | --- | --- | --- | --- |
| Python | 9.3 KB | +4% | 0.39 µs | +11% | | |

✓ No regressions detected

santoshkumarradha (Member, Author) commented

Tracking issue for simplifying the OpenCode serve+attach workaround once upstream fixes headless mode: #221

Tests now pass server_url to skip auto-serve lifecycle in CI where
opencode binary is not installed. Asserts updated to match --attach
command structure.

santoshkumarradha requested review from a team and AbirAbbas as code owners March 4, 2026 15:50

opencode run --attach loses auto-approve because the serve process
treats attached sessions as interactive, causing permission prompts
to hang forever when the model tries to write files.
Development

Successfully merging this pull request may close these issues.

harness: surface provider error when schema output file is missing