feat: improve compaction summary quality #66

@platypusrex

Description

Problem

The current summarizer produces thin, low-quality summaries. In testing, a 142-message conversation produced a 474-char summary. The summarizer input is truncated crudely (tool results to 200 chars, 60k total cap, drop-the-middle strategy).

With observational memory carrying the structured facts, the summary's job becomes simpler — just provide narrative coherence. But even for that reduced role, the current implementation has issues:

  • Summarizing at 80% capacity means the input is already huge and may exceed the summarizer's own context window
  • Single-pass summarization of 100+ messages loses important narrative threads
  • No quality validation of the summary output
  • No option to use a different (faster/cheaper) model for summarization

Improvements

1. Lower compaction threshold to 60%

Summarize earlier when the input is smaller and more manageable. A 60k token conversation produces better summaries than a 190k one. The trade-off is more frequent compaction, but each compaction is higher quality.
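The threshold change itself is small. A minimal sketch (function and constant names are illustrative, not the project's actual identifiers):

```typescript
// Hypothetical sketch: trigger compaction at 60% of the context window
// instead of 80%. Assumes a token count is already available.
const COMPACTION_THRESHOLD = 0.6; // previously 0.8

function shouldCompact(usedTokens: number, contextWindow: number): boolean {
  return usedTokens >= contextWindow * COMPACTION_THRESHOLD;
}
```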

2. Aider-style split-head/preserve-tail

Instead of summarizing the entire conversation, split at ~50% and only summarize the older head. Recent messages stay verbatim. This preserves immediate context while compressing the past.

[system prompt] + [summary of older half] + [working memory block] + [recent messages verbatim]

Aider recurses up to depth=3 for very long conversations (summary of summaries).
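The split step might look like the following sketch. The `Message` shape and the `summarize` callback are assumptions for illustration, not the actual `compaction.ts` API:

```typescript
interface Message {
  role: string;
  content: string;
}

// Split the conversation: the older head gets summarized, the recent
// tail is preserved verbatim (Aider-style, ~50% split point).
function splitForCompaction(
  messages: Message[],
  ratio = 0.5,
): { head: Message[]; tail: Message[] } {
  const cut = Math.floor(messages.length * ratio);
  return { head: messages.slice(0, cut), tail: messages.slice(cut) };
}

// Compact by replacing the head with a single summary message.
// `summarize` is an assumed helper that calls the summarizer model.
async function compact(
  messages: Message[],
  summarize: (msgs: Message[]) => Promise<string>,
): Promise<Message[]> {
  const { head, tail } = splitForCompaction(messages);
  const summary = await summarize(head);
  return [
    { role: "user", content: `Summary of earlier conversation:\n${summary}` },
    ...tail,
  ];
}
```

Recursion to depth 3 would apply `compact` again when the summarized result still exceeds the threshold.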

3. Configurable summarizer model

New config option: compaction.model — defaults to the main model but can be set to a faster/cheaper model. Aider uses weak_model for summarization. For Ollama, this could be a smaller local model that handles summarization well (e.g., qwen2.5:7b even if the main model is glm-5:cloud).
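Resolution of the summarizer model could be as simple as this sketch (the config shape is assumed; only the `model` field is part of this proposal):

```typescript
// Hypothetical config shape for the compaction section; `threshold`
// is assumed here for illustration.
interface CompactionConfig {
  threshold: number;
  model?: string; // e.g. "qwen2.5:7b" for a small local Ollama model
}

// Fall back to the main model when compaction.model is unset.
function resolveSummarizerModel(cfg: CompactionConfig, mainModel: string): string {
  return cfg.model ?? mainModel;
}
```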

4. Summary quality validation

After summarization, check that the summary mentions key entities from the conversation:

  • File paths that were modified
  • The current task/goal
  • Recent errors or blockers

If the summary fails validation (missing key entities), fall back to keeping more recent messages verbatim rather than trusting the bad summary.
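A hedged sketch of the entity check: given a list of key entities extracted from the conversation (how they are extracted is out of scope here), verify the summary mentions each one:

```typescript
// Illustrative validation: the summary must mention every key entity
// (modified file paths, current goal, recent errors). Case-insensitive
// substring matching is an assumed heuristic, not a settled design.
function validateSummary(summary: string, keyEntities: string[]): boolean {
  const lower = summary.toLowerCase();
  const missing = keyEntities.filter((e) => !lower.includes(e.toLowerCase()));
  return missing.length === 0;
}
```

On a `false` result, the compactor would keep more recent messages verbatim instead of trusting the summary.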

Key files

  • src/agent/compaction.ts — summarizeConversation(), buildSummarizerInput()
  • src/config/schema.ts — CompactionSchema (add model field)

Research references

  • docs/context-compaction-research.md — Aider, Cline, opencode comparison
  • Aider: split at 50%, summarize head, preserve tail, recursive up to depth=3, uses weak model
  • Cline: 9-section structured summary prompt, file read deduplication (30%+ savings)
  • opencode: whole-conversation single-pass (same as our current approach)
