
Feature Request: Configurable Context Compaction Threshold #11314

@WietRob

Description


OpenCode Feature Request: Configurable Context Compaction Threshold

Submitted by: @roberto_schmidt
Date: 2026-01-30
Priority: High
Category: Core Functionality / Configuration


Summary

Currently, OpenCode triggers context compaction at a hardcoded 75% of a model's context window. For long-context models (Gemini, Claude, GPT-4) this is too late: they begin losing coherence well before the 75% mark, so users suffer severe performance degradation before compaction ever runs.

Request: Expose the compaction threshold as a user-configurable setting in opencode.json.


Problem Statement

Current Behavior

  • Compaction triggers at exactly 75% of the model's advertised context window
  • This is hardcoded in the OpenCode binary and cannot be changed
  • Example: Gemini with a 1M-token context window triggers compaction at ~786k tokens (see the sketch below)
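
For illustration, the current behavior as a minimal TypeScript sketch (shouldCompact and COMPACTION_THRESHOLD are illustrative names, not OpenCode internals):

const COMPACTION_THRESHOLD = 0.75; // fixed in the binary today

function shouldCompact(usedTokens: number, contextWindow: number): boolean {
  // Compaction fires only once usage crosses 75% of the advertised window
  return usedTokens >= contextWindow * COMPACTION_THRESHOLD;
}

// Gemini 1.5 Pro: 1,048,576-token window -> compaction only at ~786k tokens
shouldCompact(786_432, 1_048_576); // true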

Impact

  • Gemini models: Performance degradation starts at ~30% (300k tokens), but compaction doesn't trigger until 75% (786k)
  • Claude models: Quality drops significantly after ~50% of context
  • Result: Users experience 2-3x slower responses, hallucinations, and poor code quality before automatic compaction kicks in

Proposed Solution

Add a compaction configuration section to opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "compaction": {
    "threshold": 0.40,
    "strategy": "summarize",
    "preserveRecentMessages": 10,
    "preserveSystemPrompt": true
  },
  "provider": {
    ...
  }
}
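
With threshold set to 0.40, for example, the Gemini case above would compact at roughly 419k tokens (0.40 × 1,048,576) instead of ~786k, keeping the model inside its high-coherence range.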

Configuration Options

Option                   Type               Default       Description
threshold                number (0.0-1.0)   0.75          Fraction of the context window at which to trigger compaction
strategy                 string             "summarize"   Compaction strategy: "summarize", "truncate", "archive"
preserveRecentMessages   number             10            Number of recent messages to always preserve
preserveSystemPrompt     boolean            true          Always preserve the system prompt
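
For reference, one possible TypeScript shape for this block (a sketch mirroring the table above, not an official schema):

interface CompactionConfig {
  threshold?: number;                              // 0.0-1.0, default 0.75
  strategy?: "summarize" | "truncate" | "archive"; // default "summarize"
  preserveRecentMessages?: number;                 // default 10
  preserveSystemPrompt?: boolean;                  // default true
}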

Per-Model Overrides

Allow model-specific thresholds:

{
  "provider": {
    "google": {
      "models": {
        "gemini-1.5-pro": {
          "limit": { "context": 1048576 },
          "compaction": {
            "threshold": 0.30
          }
        }
      }
    }
  }
}
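
Resolution order is left to the implementers; the sketch below assumes a per-model value takes precedence over the global block and falls back to today's 0.75 default (effectiveThreshold is a hypothetical helper, not OpenCode source):

function effectiveThreshold(
  globalCfg?: { threshold?: number },
  modelCfg?: { threshold?: number },
): number {
  // Assumed precedence: per-model override > global setting > built-in default
  return modelCfg?.threshold ?? globalCfg?.threshold ?? 0.75;
}

effectiveThreshold({ threshold: 0.40 }, { threshold: 0.30 }); // 0.30 (model override wins)
effectiveThreshold({ threshold: 0.40 });                      // 0.40 (global setting)
effectiveThreshold();                                         // 0.75 (current default)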

Use Cases

  1. VRP Optimization Work (CuraOps): Need compaction at 30% for Gemini to maintain constraint tracking accuracy
  2. Long Document Analysis: Users working with 500k+ token documents need early compaction to preserve coherence
  3. Multi-Session Coding: 8+ hour coding sessions accumulate context that degrades model performance

Alternative Solutions Considered

  1. Manual compaction: Users can trigger session.compact manually, but this interrupts workflow and requires constant monitoring
  2. Smaller context windows: Setting limit.context to a lower value doesn't work, because OpenCode uses the model's actual window, not the configured value
  3. Different models: Switching to shorter-context models sacrifices capability

Implementation Notes

The compaction logic already exists in the OpenCode binary. This feature request is about exposing the hardcoded 0.75 value as a configuration parameter.

Relevant SDK types found:

compaction?: {
  auto?: boolean;     // Already exists
  prune?: boolean;    // Already exists
  threshold?: number; // REQUESTED: 0.0-1.0
}
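
Presumably, with auto enabled the new threshold field would simply replace the hardcoded 0.75 in the auto-compaction check; everything else could stay as it is.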

Benefits

  1. Improved productivity: Models stay in high-performance zone longer
  2. Better code quality: Less context degradation = fewer bugs
  3. User control: Power users can tune for their specific workflows
  4. Backward compatible: Default remains 0.75, no breaking changes

Related Issues

  • Context window management for long-running sessions
  • Performance degradation in multi-step workflows
  • Memory/Neo4j integration for persistent context

Contact: roberto.schmidt@curaops.de
Willing to test: Yes, happy to validate beta implementations
