Feature Request: Custom compaction threshold to trigger earlier #10017

@openAgi2

Problem

Currently, auto-compaction triggers only when the session's token count exceeds the model's usable context (the count > usable check in compaction.ts):

export async function isOverflow(input: { tokens; model }) {
  // ... (config check elided; context and output are derived from input.model.limit)
  const count = input.tokens.input + input.tokens.cache.read + input.tokens.output
  const usable = input.model.limit.input || context - output
  return count > usable  // Only triggers at the limit
}

For models with large context windows (e.g., Claude Opus/Sonnet with 200K), this means:

  • Users must accumulate ~200K tokens before auto-compaction kicks in
  • By that point, API responses are already extremely slow (10+ minutes of waiting)
  • The "auto" compaction effectively becomes useless for preventing slowdowns

Proposed Solution

Add a configurable threshold option to trigger compaction earlier:

{
  "compaction": {
    "auto": true,
    "prune": true,
    "threshold": 50000  // Trigger compaction when the total token count (input + cache read + output) exceeds this value
  }
}
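
A minimal sketch of how the new option might be typed and validated, assuming the config is read as an untyped object. The CompactionConfig interface and parseCompactionConfig function are hypothetical names for illustration, not part of the existing codebase:

```typescript
// Hypothetical shape of the "compaction" config block.
// `threshold` is the addition proposed in this issue.
interface CompactionConfig {
  auto?: boolean
  prune?: boolean
  threshold?: number
}

// Validate an untyped config value (hypothetical helper).
// Throws on a malformed threshold so that a typo cannot silently
// disable compaction or make it fire on every turn.
function parseCompactionConfig(raw: unknown): CompactionConfig {
  if (typeof raw !== "object" || raw === null) return {}
  const obj = raw as Record<string, unknown>
  const result: CompactionConfig = {}

  const auto = obj["auto"]
  if (typeof auto === "boolean") result.auto = auto

  const prune = obj["prune"]
  if (typeof prune === "boolean") result.prune = prune

  const threshold = obj["threshold"]
  if (threshold !== undefined) {
    if (typeof threshold !== "number" || !Number.isInteger(threshold) || threshold <= 0) {
      throw new Error("compaction.threshold must be a positive integer token count")
    }
    result.threshold = threshold
  }
  return result
}
```

Leaving threshold unset keeps today's behavior, so the option would be backward compatible.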

The implementation would be straightforward: modify isOverflow() to check the custom threshold before falling back to the model limit:

export async function isOverflow(input: { tokens; model }) {
  const config = await Config.get()
  if (config.compaction?.auto === false) return false
  
  const count = input.tokens.input + input.tokens.cache.read + input.tokens.output
  
  // Check custom threshold first
  if (config.compaction?.threshold && count > config.compaction.threshold) {
    return true
  }
  
  // Fall back to model limit check
  const context = input.model.limit.context
  if (context === 0) return false
  const output = Math.min(input.model.limit.output, SessionPrompt.OUTPUT_TOKEN_MAX) || SessionPrompt.OUTPUT_TOKEN_MAX
  const usable = input.model.limit.input || context - output
  return count > usable
}
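
To make the decision logic easy to test in isolation, the same check can be sketched as a pure function with no dependency on Config or SessionPrompt. The names shouldCompact, TokenUsage, and ModelLimits are hypothetical, and the OUTPUT_TOKEN_MAX value below is an assumed placeholder for SessionPrompt.OUTPUT_TOKEN_MAX, not a confirmed constant:

```typescript
interface TokenUsage {
  input: number
  output: number
  cache: { read: number }
}

interface ModelLimits {
  context: number
  output: number
  input?: number
}

// Assumed placeholder for SessionPrompt.OUTPUT_TOKEN_MAX.
const OUTPUT_TOKEN_MAX = 32_000

// Pure version of the proposed check (hypothetical helper).
function shouldCompact(
  tokens: TokenUsage,
  limit: ModelLimits,
  compaction: { auto?: boolean; threshold?: number } = {},
): boolean {
  // Auto-compaction explicitly disabled: never trigger.
  if (compaction.auto === false) return false

  const count = tokens.input + tokens.cache.read + tokens.output

  // Custom threshold takes priority; unset or 0 means "no custom threshold".
  if (compaction.threshold && count > compaction.threshold) return true

  // Fall back to the model limit check.
  if (limit.context === 0) return false
  const output = Math.min(limit.output, OUTPUT_TOKEN_MAX) || OUTPUT_TOKEN_MAX
  const usable = limit.input || limit.context - output
  return count > usable
}
```

Under these assumptions, a session at 55,000 tokens on a 200K-context model does not trigger compaction by default (the usable limit works out to 168,000), but does trigger once threshold is set to 50000.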

Use Case

After 20-30 conversation turns with many tool calls, sessions become painfully slow. Users currently must:

  1. Notice the slowdown (often too late)
  2. Manually run /compact

With a configurable threshold, users could set a reasonable limit (e.g., 50K-80K tokens) to maintain responsive sessions automatically.

Alternatives Considered

  • Manual /compact: Works but requires user vigilance; easy to forget until it's too late
  • Lower the percentage of context used: Less flexible than an absolute threshold
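
As a rough illustration of why a percentage is less predictable than an absolute value, the sketch below resolves the same fractional setting against two context window sizes. The percentToTokens helper is hypothetical, and the 1M-token window is an assumed example rather than a specific model:

```typescript
// Resolve a fractional setting (e.g. 0.25 = 25%) to an absolute token count
// for a given context window. Hypothetical helper for illustration only.
function percentToTokens(fraction: number, contextWindow: number): number {
  return Math.floor(fraction * contextWindow)
}

// The same 25% setting yields very different trigger points per model.
const on200k = percentToTokens(0.25, 200_000)   // 50000
const on1m = percentToTokens(0.25, 1_000_000)   // 250000
```

Since the slowdown described above depends on the absolute number of tokens sent per request, a fixed threshold such as 50000 behaves the same when the user switches models, while a percentage would need to be retuned.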

Additional Context

Related code: packages/opencode/src/session/compaction.ts
