[Bug]: Double compaction race — preemptive-compaction and anthropic-context-window-limit-recovery fire independently, causing "Auto Compact Failed" #1752

@ChoKhoOu

Bug Description

Session compaction fires twice in succession, then displays "Auto Compact Failed — All recovery attempts failed." Additionally, the preemptive-compaction hook (78% threshold) appears to never trigger — compaction only happens when the hard limit is actually hit via API error.

Steps to Reproduce

  1. Work in a long session until context approaches the configured limit
  2. Observe that no compaction happens at ~78% of limit — the preemptive threshold never fires
  3. Context continues growing until it hits the actual limit
  4. Provider returns a token limit error → anthropic-context-window-limit-recovery fires → compacts [1st time]
  5. After first compaction, context may still be too large → another API error → another compact attempt [2nd time]
  6. Retry strategy exhausts maxAttempts: 2 → "Auto Compact Failed"

Root Cause: Two Separate Bugs

Bug A: preemptive-compaction is effectively non-functional

The hook runs on tool.execute.after and reads lastAssistant.tokens to calculate usage ratio. However, this data may be stale due to a timing issue:

Timeline:
1. LLM generates tool_use blocks (step begins)
2. Tool executes
3. tool.execute.after fires ← hook reads tokens HERE
4. Tool result sent back to LLM
5. LLM processes response
6. finish-step fires → tokens are OVERWRITTEN on assistant message (processor.ts line 244)

At step 3, the tokens on the current assistant message still reflect a previous step's data (or are 0 for the first tool call in a new message), because finish-step has not fired yet. The hook therefore reads stale or zero token counts, computes a usageRatio that is too low (0 in the worst case), and returns early without compacting.

Additionally, the hook has a silent catch {} block (line 87) that swallows ALL errors — if the API call fails or returns unexpected data, the hook silently does nothing with no logging.
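One way to make such failures visible is to route them through a small wrapper that logs instead of swallowing. This is a minimal sketch, not the project's code: `describeError`, `runHookLogged`, and the `Logger` type are hypothetical names introduced for illustration.

```typescript
// Hypothetical helpers for illustration; not part of oh-my-opencode.
type Logger = (message: string) => void

// Turn any thrown value into a single readable log line.
function describeError(hookName: string, err: unknown): string {
  const detail = err instanceof Error ? `${err.name}: ${err.message}` : String(err)
  return `[${hookName}] hook failed: ${detail}`
}

// Run a hook body; on failure, log the error and continue so the
// session is not interrupted by a broken hook.
async function runHookLogged(
  hookName: string,
  body: () => Promise<void>,
  log: Logger = console.error,
): Promise<void> {
  try {
    await body()
  } catch (err) {
    log(describeError(hookName, err))
  }
}
```

The hook keeps its "never crash the session" behavior, but a failed API call or an unexpected response shape now leaves a trace.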

Relevant code (src/hooks/preemptive-compaction.ts:69-73):

const lastTokens = lastAssistant.tokens
const totalInputTokens = (lastTokens?.input ?? 0) + (lastTokens?.cache?.read ?? 0)
// If tokens is undefined, totalInputTokens = 0 and usageRatio = 0 / 200000 = 0.
// If tokens is from a previous step, totalInputTokens is stale and understated.
// Either way usageRatio can stay below 0.78, so the hook returns early.
const usageRatio = totalInputTokens / actualLimit
if (usageRatio < PREEMPTIVE_COMPACTION_THRESHOLD) return
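To make the arithmetic concrete, the check can be isolated into a pure function. This is a sketch only: `shouldCompact` is a hypothetical name, the 0.78 threshold comes from this report, and the 200000 limit is the example value from the comment above.

```typescript
// Threshold taken from this report; the real constant lives in the hook.
const PREEMPTIVE_COMPACTION_THRESHOLD = 0.78

// Hypothetical pure version of the usage-ratio check.
function shouldCompact(totalInputTokens: number, actualLimit: number): boolean {
  if (actualLimit <= 0) return false // avoid dividing by a missing or zero limit
  return totalInputTokens / actualLimit >= PREEMPTIVE_COMPACTION_THRESHOLD
}

// Zero tokens (finish-step has not run yet): ratio 0, no compaction.
console.log(shouldCompact(0, 200000))
// Stale count from an earlier step: ratio 0.2, no compaction.
console.log(shouldCompact(40000, 200000))
// Up-to-date count: ratio 0.8, compaction would trigger.
console.log(shouldCompact(160000, 200000))
```

With a 200000-token limit, the threshold corresponds to 156000 input tokens, so any stale reading below that value suppresses compaction regardless of the real context size.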

Bug B: anthropic-context-window-limit-recovery double-fires on the same error

When a token limit error occurs, two event handlers within the same hook can both trigger executeCompact():

  1. session.error handler (lines 48-88): Adds to pendingCompact + calls executeCompact() after a 300ms setTimeout
  2. session.idle handler (lines 111-147): Checks pendingCompact → calls executeCompact() again

The session goes idle shortly after the error (because the error stopped processing). If the session.idle handler fires before the 300ms setTimeout callback, it sees pendingCompact has the session and calls executeCompact(). Then the setTimeout also fires. The compactionInProgress guard (executor.ts line 25) may prevent true parallel execution, but the retry counter increments for both paths, exhausting maxAttempts: 2 twice as fast.
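The interleaving described above can be modeled without real timers. The following is a simplified simulation, not the hook's code: the identifiers mirror the names in this report, `executeCompact` is reduced to a counter increment, and the 100ms idle time is an assumed value chosen to land before the 300ms timer.

```typescript
// Simplified model; real handlers do far more than bump a counter.
const MAX_ATTEMPTS = 2

interface Sim {
  pendingCompact: Set<string>
  attempts: Map<string, number>
}
type TimedEvent = { at: number; run: () => void }

// Stand-in for the real executeCompact: only counts attempts.
function executeCompact(sim: Sim, sessionID: string): void {
  sim.attempts.set(sessionID, (sim.attempts.get(sessionID) ?? 0) + 1)
}

function simulateOneError(): number {
  const sim: Sim = { pendingCompact: new Set(), attempts: new Map() }
  const sessionID = "s1"
  const events: TimedEvent[] = []

  // session.error at t=0: mark pending and schedule the delayed compact.
  sim.pendingCompact.add(sessionID)
  events.push({ at: 300, run: () => executeCompact(sim, sessionID) })

  // session.idle at t=100 (assumed): pendingCompact is still set, so it compacts too.
  events.push({
    at: 100,
    run: () => {
      if (sim.pendingCompact.has(sessionID)) executeCompact(sim, sessionID)
    },
  })

  // Replay events in time order.
  events.sort((a, b) => a.at - b.at)
  for (const event of events) event.run()
  return sim.attempts.get(sessionID) ?? 0
}
```

In this model `simulateOneError()` returns 2, which already equals `MAX_ATTEMPTS`: a single token-limit error consumes the whole retry budget.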

Combined effect:

1. preemptive-compaction NEVER fires (token data timing issue)
2. Context grows unchecked to the hard limit
3. API returns token limit error
4. session.error handler → pendingCompact + setTimeout(executeCompact, 300ms)
5. session goes idle → session.idle handler → sees pendingCompact → executeCompact() [1st compact]
6. 300ms later → setTimeout fires → executeCompact() [2nd compact or retry]
7. If compaction doesn't reduce context enough → next API call fails again → repeat
8. maxAttempts (2) exhausted → "Auto Compact Failed"

Expected Behavior

  1. preemptive-compaction should reliably trigger at 78% of the context limit
  2. Only ONE compaction should execute per token-limit error, not two
  3. After a successful compaction, the error-recovery path should detect it and skip

Suggested Fix

For Bug A (preemptive-compaction not triggering):

  • Read token data from StepFinishPart parts on the last assistant message instead of the message-level tokens (which may be stale at tool.execute.after time)
  • OR move the check to a different event (e.g., message.updated or session.idle) where tokens are guaranteed to be up-to-date
  • Add logging in the catch block instead of silently swallowing errors
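The first option could look roughly like this. It is a sketch under assumptions: the part shapes below are simplified stand-ins for whatever the SDK exposes, the `"step-finish"` discriminant is assumed, and `latestStepInputTokens` is a hypothetical helper name.

```typescript
// Assumed, simplified shapes; the real SDK types may differ.
interface TokenUsage {
  input: number
  output: number
  cache: { read: number; write: number }
}
interface StepFinishPart {
  type: "step-finish"
  tokens: TokenUsage
}
interface OtherPart {
  type: "text" | "tool"
}
type Part = StepFinishPart | OtherPart

// Return input + cache-read tokens from the most recent finished step,
// or undefined if no step has finished yet.
function latestStepInputTokens(parts: Part[]): number | undefined {
  for (const part of [...parts].reverse()) {
    if (part.type === "step-finish") {
      return part.tokens.input + part.tokens.cache.read
    }
  }
  return undefined
}
```

Returning `undefined` rather than 0 when no step has finished lets the caller tell "no data yet" apart from "context is empty", so it can skip the check explicitly instead of computing a ratio of 0.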

For Bug B (double compaction):

  • Deduplicate between session.error and session.idle paths — if executeCompact was already called/scheduled from session.error, the session.idle handler should skip
  • OR remove the session.error → setTimeout(executeCompact) path entirely and only compact on session.idle (which already has the summary === true guard at line 120)
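The deduplication option can be sketched as a per-session claim that both handlers must win before calling executeCompact. `CompactionGate`, `tryClaim`, and `release` are hypothetical names, not existing project APIs.

```typescript
// Hypothetical guard shared by the session.error and session.idle paths.
class CompactionGate {
  private readonly claimed = new Set<string>()

  // Returns true only for the first caller per session until release().
  tryClaim(sessionID: string): boolean {
    if (this.claimed.has(sessionID)) return false
    this.claimed.add(sessionID)
    return true
  }

  // Call once the compaction attempt has settled (success or failure).
  release(sessionID: string): void {
    this.claimed.delete(sessionID)
  }
}

const gate = new CompactionGate()
console.log(gate.tryClaim("s1")) // session.error path wins the claim
console.log(gate.tryClaim("s1")) // session.idle path is told to skip
gate.release("s1")
console.log(gate.tryClaim("s1")) // a later, separate error can compact again
```

Each handler would check `tryClaim` before touching the retry counter, so one error costs at most one attempt. Releasing in a `finally` block after the compaction settles keeps a failed attempt from blocking later ones.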

Environment

  • oh-my-opencode: v3.4.0+ (all versions with both hooks enabled)
  • Model: Any provider (Anthropic, OpenAI, Google, xAI — the error parser uses generic keyword matching, not provider-specific)
  • OpenCode: latest
