Problem
The current summarizer produces thin, low-quality summaries. In testing, a 142-message conversation produced only a 474-character summary. The summarizer input is also truncated crudely: tool results capped at 200 chars, a 60k total cap, and a drop-the-middle strategy.
With observational memory carrying the structured facts, the summary's job becomes simpler: it only needs to provide narrative coherence. But even for that reduced role, the current implementation has issues:
- Summarizing at 80% capacity means the input is already huge and may exceed the summarizer's own context window
- Single-pass summarization of 100+ messages loses important narrative threads
- No quality validation of the summary output
- No option to use a different (faster/cheaper) model for summarization
Improvements
1. Lower compaction threshold to 60%
Summarize earlier, while the input is smaller and more manageable. A 60k-token conversation produces better summaries than a 190k-token one. The trade-off is more frequent compaction, but each compaction is higher quality.
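The check itself is trivial; a minimal sketch, assuming a hypothetical `shouldCompact` helper (these names are illustrative, not the actual `compaction.ts` API):

```typescript
// Lowered from 0.8: trigger compaction at 60% of the context window.
const COMPACTION_THRESHOLD = 0.6;

function shouldCompact(tokensUsed: number, contextWindow: number): boolean {
  return tokensUsed / contextWindow >= COMPACTION_THRESHOLD;
}

// With a 200k window, compaction now triggers at 120k tokens instead of 160k.
console.log(shouldCompact(130_000, 200_000)); // true
console.log(shouldCompact(110_000, 200_000)); // false
```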
2. Aider-style split-head/preserve-tail
Instead of summarizing the entire conversation, split at ~50% and only summarize the older head. Recent messages stay verbatim. This preserves immediate context while compressing the past.
[system prompt] + [summary of older half] + [working memory block] + [recent messages verbatim]
Aider recurses up to depth=3 for very long conversations (summary of summaries).
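The split step can be sketched as below; the `Message` type and function names are assumptions for illustration, and summarization of the head happens elsewhere:

```typescript
interface Message {
  role: string;
  content: string;
}

// Split the conversation: the older head gets summarized, the recent tail
// is preserved verbatim. Default cut point is 50%, matching Aider.
function splitForCompaction(
  messages: Message[],
  headRatio = 0.5,
): { head: Message[]; tail: Message[] } {
  const cut = Math.floor(messages.length * headRatio);
  return { head: messages.slice(0, cut), tail: messages.slice(cut) };
}

// The compacted prompt is then assembled as:
// [system prompt] + summarize(head) + [working memory block] + tail (verbatim)
```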
3. Configurable summarizer model
New config option: compaction.model — defaults to the main model but can be set to a faster/cheaper model. Aider uses weak_model for summarization. For Ollama, this could be a smaller local model that handles summarization well (e.g., qwen2.5:7b even if the main model is glm-5:cloud).
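A minimal sketch of the resolution logic; only the `model` field is proposed in this issue, so the surrounding config shape is an assumption about `CompactionSchema`:

```typescript
interface CompactionConfig {
  threshold: number; // e.g. 0.6
  model?: string;    // summarizer model; undefined means "use the main model"
}

// Fall back to the main model when no summarizer model is configured.
function resolveSummarizerModel(cfg: CompactionConfig, mainModel: string): string {
  return cfg.model ?? mainModel;
}

console.log(resolveSummarizerModel({ threshold: 0.6 }, "glm-5:cloud")); // "glm-5:cloud"
console.log(
  resolveSummarizerModel({ threshold: 0.6, model: "qwen2.5:7b" }, "glm-5:cloud"),
); // "qwen2.5:7b"
```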
4. Summary quality validation
After summarization, check that the summary mentions key entities from the conversation:
- File paths that were modified
- The current task/goal
- Recent errors or blockers
If the summary fails validation (missing key entities), fall back to keeping more recent messages verbatim rather than trusting the bad summary.
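One way the validation could look, as a sketch: entity extraction here is deliberately naive (a regex for file paths only), and the real heuristics for tasks and errors are an open design choice.

```typescript
// Pull file-path-looking tokens out of the conversation text.
function extractFilePaths(text: string): string[] {
  return text.match(/\b[\w./-]+\.(ts|js|md|json)\b/g) ?? [];
}

// Require every file path mentioned in the conversation to appear in the
// summary; a miss means the summary dropped a key entity.
function validateSummary(summary: string, conversation: string): boolean {
  const paths = new Set(extractFilePaths(conversation));
  for (const p of paths) {
    if (!summary.includes(p)) return false;
  }
  return true;
}
```

On failure, the caller would discard the summary and keep more recent messages verbatim instead.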
Key files
- src/agent/compaction.ts — summarizeConversation(), buildSummarizerInput()
- src/config/schema.ts — CompactionSchema (add model field)
Research references
docs/context-compaction-research.md — Aider, Cline, opencode comparison
- Aider: split at 50%, summarize head, preserve tail, recursive up to depth=3, uses weak model
- Cline: 9-section structured summary prompt, file read deduplication (30%+ savings)
- opencode: whole-conversation single-pass (same as our current approach)
Related
- Follows from #61, "fix: redesign compaction — never alter chat history, fix token accuracy, use subagent for summarization" (compaction redesign)
- Benefits from observational memory (structured facts reduce pressure on summary quality)