
Copilot CLI should retry on transient CAPIError 400 Bad Request during agentic workflow execution #25313

@pholleran

Description


Summary

When running the Copilot CLI inside a gh-aw agentic workflow, the CLI fails with a non-retried CAPIError: 400 Bad Request from the Copilot inference API (CAPI) mid-session. The CLI (or the gh-aw execution wrapper) should implement retry logic for transient 400 errors that occur after the session has already completed successful turns.

Reproduction

  • Workflow: repo-assist.lock.yml (Repo Assist)
  • Repository: octodemo/octocat_supply-glorious-meme
  • Failed run: https://github.com/octodemo/octocat_supply-glorious-meme/actions/runs/24136873717/job/70427441392
  • Request ID: C818:3ED713:19D401B:1C446B7:69D653CA
  • Trigger: workflow_dispatch on main branch
  • CLI invocation: /usr/local/bin/copilot --add-dir ... --prompt "$(cat /tmp/gh-aw/aw-prompts/prompt.txt)"
  • Model: claude-sonnet-4.6 (444.6k input, 3.2k output, 373.6k cached)
  • Date: 2026-04-08

Error

Execution failed: CAPIError: 400 400 Bad Request
 (Request ID: C818:3ED713:19D401B:1C446B7:69D653CA)

The CLI had been running normally for about 3.5 minutes, completing 8+ tool calls (reading files, listing issues, exploring the codebase), when CAPI returned this 400 error. The CLI then exited with code 1, which failed the workflow.

Expected Behavior

A 400 that occurs mid-conversation (after multiple successful inference turns) is likely transient rather than a genuinely malformed request. The CLI should:

  1. Detect CAPIError: 400 during an active session (after at least one successful turn)
  2. Retry the failed inference request with exponential backoff (e.g., 3-5 attempts)
  3. Only fail the session if retries are exhausted
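
The three steps above could look roughly like the sketch below. It is illustrative only and not taken from the Copilot CLI source: `CAPIError` here is a hypothetical stand-in class with a `status` attribute, and `call_with_retry`, `send_request`, `successful_turns`, `max_attempts`, and `base_delay` are names assumed for this example. The real CLI may structure its errors and session state differently.

```python
import random
import time


class CAPIError(Exception):
    """Hypothetical stand-in for the CLI's error type; carries an HTTP status."""

    def __init__(self, status, message=""):
        super().__init__(f"{status} {message}")
        self.status = status


def call_with_retry(send_request, successful_turns, max_attempts=4,
                    base_delay=1.0, sleep=time.sleep):
    """Call send_request(), retrying a mid-session 400 with exponential backoff.

    A 400 is treated as retryable only if the session has already completed
    at least one successful turn (an assumption taken from this proposal).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send_request()
        except CAPIError as err:
            retryable = err.status == 400 and successful_turns >= 1
            # Give up on non-retryable errors or once attempts are exhausted.
            if not retryable or attempt == max_attempts:
                raise
            # Exponential backoff (1s, 2s, 4s, ...) plus up to 50% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

A 400 on the very first turn is re-raised immediately, since at that point a genuinely malformed request is the more plausible cause. The `sleep` parameter exists only so the backoff can be replaced in tests.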

Context
