Agent Response Boxes

Metacognitive “Response Boxes” for AI coding agents: make hidden reasoning visible, improve collaboration, and enable optional cross-session learning via a local event store.


Supported agents: Claude Code • OpenCode • Windsurf (Cascade) • Cursor

Quick install

User-level install (recommended):

curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash

Project-level rules install only:

curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash -s -- --project


v0.7.2 highlights

  • Tier A parity fix: Cursor + Windsurf collectors now correctly extract box fields from canonical **Field:** value formatting (and accept **Field**: value)
  • Build-time outputs: repo ships a committed outputs/ tree generated by CACE (stable installs, offline-friendly)
  • Rename (pending): repository is being renamed from claude-response-boxes to agent-response-boxes (installer includes a temporary raw URL fallback)

Agent Compatibility

| Feature | Claude Code | OpenCode | Windsurf | Cursor |
| --- | --- | --- | --- | --- |
| Box Taxonomy (12 types) | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| Automatic Collection | ✅ Hook | ✅ Plugin | ✅ Hook | ✅ Hook |
| Automatic Injection | ✅ Hook | ✅ Plugin | ⚠️ Manual (workflow/skill) | ⚠️ Manual (skill) |
| /analyze-boxes Skill | ✅ Native | ✅ Native | ⚠️ Reuse | ⚠️ Reuse |
| Cross-Session Learning | ✅ Full | ✅ Full | ⚠️ Partial | ⚠️ Partial |

Support Tiers:

  • Full (Claude Code, OpenCode): Automatic collection + injection + analysis
  • Enhanced (Windsurf): Automatic collection + manual injection workflow
  • Basic (Cursor): Automatic collection + manual injection skill

See Cross-Agent Compatibility Guide for detailed information.


What Are Response Boxes?

Response boxes are structured annotations that make AI reasoning transparent. Every significant choice, assumption, and judgment is explicitly documented using a consistent format.

⚖️ Choice ───────────────────────────────────────
**Selected:** Zod for schema validation
**Alternatives:** Yup, io-ts, manual validation
**Reasoning:** Better TypeScript inference, smaller bundle size
────────────────────────────────────────────────

🏁 Completion ───────────────────────────────────
**Request:** Add input validation to login form
**Completed:** Email + password validation with Zod schema
**Confidence:** 9/10
**Gaps:** No server-side validation added
**Improve:** Should have asked about existing validation patterns
────────────────────────────────────────────────

Why Use Response Boxes?

1. Transparency

Hidden reasoning leads to misalignment. When Claude chooses between libraries, makes assumptions about requirements, or decides on an approach, those decisions are now visible and reviewable.

2. Self-Reflection

This is within-session metacognition: Claude is forced to audit its own work in the same thread, while it can still correct course.

The completion box forces Claude to reassess its work before finishing: "Did I actually address the request? What gaps remain? How could I have done better?"

3. Cross-Session Learning

This is cross-session self-learning: high-value boxes (corrected assumptions, validated choices, warnings that proved right) are persisted as an event log and used as durable training signal.

Boxes become evidence. /analyze-boxes synthesizes them into reusable learnings and higher-level meta-learnings, which are then injected at the start of future sessions so Claude can avoid repeating the same mistakes.

4. Anti-Sycophancy

Anti-sycophancy is handled through a sophisticated internal protocol based on research (ELEPHANT framework, SMART, self-blinding studies). It operates during response generation to ensure honest, direct responses without unnecessary validation or praise. See rules/anti-sycophancy.md for the full protocol.


System Architecture

The system operates in three layers:

┌─────────────────────────────────────────────────────────────────────────────┐
│                         RESPONSE BOX SYSTEM                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  LAYER 1: PROMPT GUIDANCE                                                    │
│  ├── outputs/claude/.claude/output-styles/response-box.md  Active during sessions   │
│  ├── outputs/claude/.claude/rules/response-boxes.md        Complete specification   │
│  └── outputs/claude/.claude/config/claude-md-snippet.md    Minimal CLAUDE.md snippet│
│                                                                              │
│  LAYER 2: DATA PERSISTENCE (Hooks)                                           │
│  ├── SessionStart: inject-context.sh    Load prior learnings                │
│  └── SessionEnd: session-processor.sh  Collect and emit box events          │
│                                                                              │
│  LAYER 3: ANALYTICS                                                          │
│  └── outputs/claude/.claude/skills/analyze-boxes/SKILL.md  AI-powered analysis skill │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘

Metacognition Loop

Within-session (metacognition): During a single conversation, boxes act as local state. Claude uses them to surface assumptions/decisions as they happen and to self-correct before moving on. When applying something learned earlier in the same session, it records that application with a 🔄 Reflection box.

Cross-session (self-learning): Between conversations, Claude can't “remember” unless the system persists and reinjects context. This system does that by turning prior work into a three-tier knowledge model:

  • Boxes — Raw, turn-level evidence (what was chosen/assumed/warned).
  • Learnings — Synthesized patterns extracted from many boxes.
  • Meta-learnings — Higher-level principles that synthesize multiple learnings.

At session start, high-value boxes plus the most relevant learnings/meta-learnings are projected from the event store and injected as context, enabling Claude to apply past corrections and preferences immediately.
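
A quick way to see how your own event store breaks down across these tiers is to tally event types directly. This is a minimal sketch assuming only that jq is installed and that legacy lines without an .event field count as boxes (the compatibility rule described under Analytics Compatibility below):

# Tally event types in the shared store; legacy lines without .event are
# counted as BoxCreated, mirroring the compatibility rule described later
jq -r '.event // "BoxCreated"' ~/.response-boxes/analytics/boxes.jsonl | sort | uniq -c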


Box Types

Inline Boxes (at point of relevance)

| Box | When | Fields |
| --- | --- | --- |
| ⚖️ Choice | Selected between 2+ options | Selected, Alternatives, Reasoning |
| 🎯 Decision | Made a judgment call | What, Reasoning |
| 💭 Assumption | Filled unstated requirement | What, Basis |
| ⚠️ Concern | Potential risk to flag | Issue, Impact, Mitigation |
| 🚨 Warning | Serious risk | Risk, Likelihood, Consequence |
| 📊 Confidence | Uncertainty <90% | Claim, Level (X/10), Basis |
| ↩️ Pushback | Disagree with direction | Position, Reasoning |
| 💡 Suggestion | Optional improvement | Idea, Benefit |
| 🔄 Reflection | Applied prior learning | Prior, Learning, Application |

End Boxes (max 3, in order)

| Box | When | Fields |
| --- | --- | --- |
| 📋 Follow Ups | Next steps exist | Immediate, Consider, Related |
| 🏁 Completion | Task being completed | Request, Completed, Confidence, Gaps, Improve |
| ✅ Quality | Code was written | Rating (X/10), Justification |

Installation

Quick Install (User-Level)

curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash

This installs:

  • Claude Code (full mode):
    • Output style + rules
    • SessionStart/SessionEnd hooks for cross-session learning
    • Analysis skill (/analyze-boxes)
    • CLAUDE.md snippet with pre-response checklist
  • Shared local event store at ~/.response-boxes/analytics/boxes.jsonl (created on first use)

You can optionally install integrations for other agents using flags (see below).

Project-Level Install

curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash -s -- --project

Installs rules to project .claude/ directory only. Hooks, skills, and analytics remain user-level.

Installer Options

# Preview changes without modifying files
curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash -s -- --dry-run

# Overwrite managed files
curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash -s -- --force

# Remove legacy v3 artifacts (if present)
curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash -s -- --cleanup-legacy

Optional agent integrations

These are opt-in installs on top of the Claude Code baseline:

# OpenCode (plugin)
curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash -s -- --install-opencode

# Windsurf basic mode (rules only)
curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash -s -- --install-windsurf-basic

# Windsurf enhanced mode (collection hook + workflow + rules)
curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash -s -- --install-windsurf

# Cursor basic mode (project-level rules only)
curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash -s -- --project --install-cursor-basic

# Cursor enhanced mode (collection hook + manual context skill)
curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh | bash -s -- --install-cursor

Installer Coverage

  • Detection: Reports what is already installed (user-level and project-level)
  • Safety: Backs up files before modifying them
  • Scope-aware: --project installs/uninstalls only project rules; hooks/analytics remain user-level
  • Idempotent: Re-running install is safe; unchanged files are skipped

Activate

/output-style response-box

Or set as default in ~/.claude/settings.json:

{
  "outputStyle": "response-box"
}

Dependencies

  • jq — Required for hooks (JSON processing)
  • bash — Hooks are bash scripts
  • git — Optional, for repository context in box metadata
  • bun — Optional, for running OpenCode plugin tests (contributors only)

What Gets Installed

User-Level (~/.claude/ + ~/.response-boxes/)

~/.claude/
├── output-styles/
│   └── response-box.md           # Active output style
├── rules/
│   ├── response-boxes.md         # Complete specification (12 box types)
│   └── anti-sycophancy.md        # Research-backed anti-sycophancy protocol
├── hooks/
│   ├── inject-context.sh         # SessionStart: load prior boxes
│   └── session-processor.sh      # SessionEnd: collect and emit box events
└── skills/
    └── analyze-boxes/
        └── SKILL.md              # /analyze-boxes skill

~/.response-boxes/
└── analytics/
    └── boxes.jsonl               # Event store (created on first use)

Project-Level (.claude/)

.claude/
├── rules/
│   ├── response-boxes.md         # Full specification
│   └── anti-sycophancy.md        # Anti-sycophancy protocol
└── CLAUDE.md                     # Updated with snippet

Cross-Session Learning

How It Works

  1. Collection — At session end, session-processor.sh parses the transcript for response boxes and appends BoxCreated events to the event store.

  2. Synthesis — When you run /analyze-boxes, Claude proposes and (with your approval) appends learning events (e.g. LearningCreated, EvidenceLinked) and can link learnings into meta-learnings (e.g. LearningLinked).

  3. Injection — At session start, inject-context.sh loads relevant boxes and learnings via projection from the event store and injects them as context. If new boxes exist since the last analysis run, it also injects a one-line reminder to run /analyze-boxes.
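
All three steps read or append events in the same append-only JSONL file. For intuition, the sketch below prints one hand-written event in that style; only the event and schema_version keys are taken from this README, and the remaining field names are invented, so the real events written by the hooks will differ.

# Illustration only: print one hypothetical BoxCreated event line.
# box_type, summary, and created_at are invented field names for this sketch.
jq -cn '{event: "BoxCreated", schema_version: 1,
         box_type: "Assumption", summary: "Assumed TypeScript",
         created_at: (now | todate)}'
# The SessionEnd hook appends lines like this to ~/.response-boxes/analytics/boxes.jsonl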

Automation vs Manual Steps

  • Automated
    • SessionEnd collects boxes and appends BoxCreated events
    • SessionStart injects projected learnings/boxes and the “unanalyzed” reminder
  • Manual
    • You run /analyze-boxes and approve any changes written to the event store

Analytics Compatibility

  • Legacy support: Older boxes.jsonl lines missing .event are treated as BoxCreated.
  • Versioning: New BoxCreated events include schema_version.
  • Guardrail: If the event store contains a newer schema version than the hooks support, the hook injects a clear “update required” message instead of producing incorrect context.
  • Recovery: If the event store becomes corrupted or is not JSONL, the hooks will inject a diagnostic message. Back up ~/.response-boxes/analytics/boxes.jsonl and repair/reset it to restore collection/injection.
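
For example, the highest schema version present in the store can be inspected with a one-liner. This sketch treats legacy lines as version 1, which mirrors the legacy handling above but is not guaranteed to match the hooks' own logic:

# Highest schema_version found in the event store (legacy lines counted as 1 here)
jq -r '.schema_version // 1' ~/.response-boxes/analytics/boxes.jsonl | sort -n | tail -1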

Scoring

Boxes are scored by type, prioritizing actionable learnings:

| Tier | Types | Score |
| --- | --- | --- |
| High | Reflection, Warning, Pushback, Assumption | 80-90 |
| Medium | Choice, Completion, Concern, Confidence, Decision | 55-70 |
| Low | Suggestion, Quality, FollowUps | 35-50 |
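
The exact ranking formula is internal to the hooks, but combining a base score with the weekly decay factor (BOX_RECENCY_DECAY, see Configuration) presumably works roughly like the sketch below. The multiplicative per-week form is an assumption, not the documented formula:

# Sketch: effective score of an 85-point box that is 4 weeks old,
# assuming the weekly decay is applied multiplicatively
awk -v base=85 -v decay=0.95 -v weeks=4 'BEGIN { printf "%.1f\n", base * decay^weeks }'
# prints 69.2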

What Gets Injected

At session start, you may see:

PRIOR SESSION LEARNINGS (high-value boxes from previous sessions):

- Assumption: Assumed "TypeScript" (USER CORRECTED) [github.com/user/repo]
- Choice: Chose Zod over Yup (USER PREFERRED ALTERNATIVE) [github.com/user/repo]
- Warning: DELETE endpoint has no authentication [github.com/org/api]

Review these before responding. Apply relevant learnings using 🔄 Reflection box.

Manual Analysis

Run AI-powered analysis in Claude Code:

/analyze-boxes

This will:

  • Identify patterns across recent boxes
  • Propose learnings with confidence scoring
  • Link evidence (boxes) to learnings
  • Append approved events back into the event store

Manual Gaps / Limitations

  • Analysis is nondeterministic
    • /analyze-boxes is AI-driven pattern recognition. Results can differ across runs.
    • Always review proposed events before approving writes to the event store.
  • Hooks do not auto-run analysis
    • The SessionStart hook only injects a reminder when new boxes exist.
    • You still run /analyze-boxes manually.

Pre-Response Checklist

Before completing any substantive response:

[ ] Selected between alternatives?      → ⚖️ Choice
[ ] Made a judgment call?               → 🎯 Decision
[ ] Filled unstated requirement?        → 💭 Assumption
[ ] Completing a task?                  → 🏁 Completion

When to Use Each Box

Always Required

  • 🏁 Completion — Every task completion

Use When Applicable

  • ⚖️ Choice — Actively chose between viable alternatives
  • 🎯 Decision — Made judgment without comparing options
  • 💭 Assumption — Filled in unstated requirements
  • ⚠️ Concern — Identified potential issue

Use When Needed

  • 📊 Confidence — Meaningful uncertainty (<90%)
  • ↩️ Pushback — Genuine disagreement with direction
  • 💡 Suggestion — Optional improvement not requested
  • 🚨 Warning — Serious risk requiring attention
  • 🔄 Reflection — Applying learning from prior turn/session
  • ✅ Quality — Significant code was written
  • 📋 Follow Ups — Clear next steps exist

Skip Boxes For

  • Simple confirmations ("Done.")
  • Single-action completions under 300 characters
  • File reads without analysis

Configuration

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| BOX_INJECT_LEARNINGS | 3 | Max learnings to inject at start |
| BOX_INJECT_BOXES | 5 | Max boxes to inject at start |
| BOX_INJECT_DISABLED | false | Set to "true" to disable hook-based injection |
| RESPONSE_BOXES_DISABLED | false | Set to "true" to disable all adapters (hooks, plugins) |
| BOX_RECENCY_DECAY | 0.95 | Weekly decay factor |
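
For example, to widen injection or slow the decay for a single shell session, export overrides before launching the agent. The variable names come from the table above; how each hook reads them is up to the scripts:

# Per-shell overrides; unset them (or open a new shell) to return to defaults
export BOX_INJECT_LEARNINGS=5
export BOX_INJECT_BOXES=10
export BOX_RECENCY_DECAY=0.98
# export BOX_INJECT_DISABLED=true   # uncomment to disable injection entirely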

Settings

Hook registration in ~/.claude/settings.json:

{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/inject-context.sh"
          }
        ]
      }
    ],
    "SessionEnd": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/session-processor.sh"
          }
        ]
      }
    ]
  }
}

Best Practices: Integrating with Claude Code

To get maximum benefit, integrate response boxes across output style, rules, CLAUDE.md, hooks, and skills so the metacognition loop is end-to-end.

1. Use the Response Box output style

  • Goal: Make boxes the default response format so reasoning is consistently captured.
  • Action: Set output style to response-box (either per-session with /output-style response-box or as a default in ~/.claude/settings.json).

2. Keep the spec authoritative (rules)

  • Goal: Avoid drift between “how we write boxes” and “how we parse/analyze them”.
  • Action: Treat the Response Box spec (response-boxes.md, installed at ~/.claude/rules/response-boxes.md or project .claude/rules/response-boxes.md) as the single source of truth for:
    • Box formats/fields (so hooks can parse reliably)
    • When boxes are required (so the event store has consistent signal)

3. Wire CLAUDE.md so every session starts with the right mindset

  • Goal: Make “review prior boxes + apply learnings” a first-class pre-response habit.
  • Action: Add the snippet (agents/claude-code/config/claude-md-snippet.md) to your project’s Claude instructions (commonly .claude/CLAUDE.md, or CLAUDE.md depending on how you run Claude Code).
  • Recommendation: Keep project-specific constraints adjacent to the snippet, for example:
    • Dependency/package manager expectations
    • Build/test commands
    • Coding standards (TypeScript strict, formatting, etc.)

4. Turn cross-session learning on (hooks)

  • Goal: Persist evidence automatically and reinject the most relevant context.
  • Action: Ensure both hooks are registered:
    • SessionEnd collects boxes into the event store (BoxCreated)
    • SessionStart injects recent high-value boxes + top learnings/meta-learnings

If you use other SessionStart/SessionEnd automations, keep hook ordering intentional:

  • SessionStart: run context injection early so subsequent steps see the injected learnings
  • SessionEnd: run box collection reliably so analysis has complete evidence

5. Make skills “box-aware”

If you build additional Claude Code skills (beyond /analyze-boxes), design them to consume and produce the same signals:

  • Inputs (read-only)

    • Read injected context (patterns + notable boxes)
    • Read ~/.response-boxes/analytics/boxes.jsonl if the skill needs cross-session history
  • Outputs (append-only)

    • Prefer emitting new events (e.g. LearningCreated, EvidenceLinked, LearningLinked) over rewriting prior ones
    • Keep the user-in-the-loop for writes: propose, then ask for approval
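
For the read-only side, a custom skill can pull cross-session history straight from the store. A minimal sketch, assuming only the .event field and the BoxCreated fallback described under Analytics Compatibility (no particular payload fields):

# Read-only: the 20 most recent box events from the shared store (no writes)
jq -c 'select((.event // "BoxCreated") == "BoxCreated")' \
  ~/.response-boxes/analytics/boxes.jsonl | tail -n 20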

6. Protect parser compatibility (avoid format drift)

  • Goal: Keep collection reliable; small format changes can silently reduce recall.
  • Action: Preserve the standard box header and delimiter format (emoji + type + 45 dashes).
  • Action: If you add new box types or fields, update the parsing/analysis components in lockstep (SessionEnd parser and /analyze-boxes expectations).
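
For instance, header-shaped lines can be spotted with a pattern along these lines. This is purely illustrative: the real matching lives in the SessionEnd parser and may be stricter about the emoji set, spacing, or the 45-dash run, and response.md here is a hypothetical saved transcript:

# Illustrative only; the real parser may differ. Requires a UTF-8 locale.
grep -En '(⚖️|🎯|💭|⚠️|🚨|📊|↩️|💡|🔄|📋|🏁|✅) [A-Za-z ]+ ─+' response.md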

7. Operational workflow that compounds learning

  • During a session

    • Use boxes at the point of relevance (assumptions, choices, concerns)
    • When a prior correction or preference applies, start with 🔄 Reflection
  • Between sessions

    • When SessionStart injects the “unanalyzed boxes” reminder, run /analyze-boxes
    • Treat injected patterns as constraints to follow, not suggestions to ignore

Cross-Agent Architecture

Response Boxes supports multiple AI coding agents through a shared event store architecture. Each agent has adapters that emit and consume the same event types.

Core invariants (all agents)

  • Single event store: ~/.response-boxes/analytics/boxes.jsonl is the canonical append-only JSONL log for all agents.
  • Human-in-the-loop writes: only BoxCreated is emitted automatically; higher-level learning events are created via explicit /analyze-boxes runs.
  • Same box/learning schema: adapters for other agents emit the same event shapes so inject-context.sh and /analyze-boxes stay authoritative.

Claude Code

  • Hooks: SessionStart injects context, SessionEnd collects boxes
  • Skill: /analyze-boxes for AI-powered pattern analysis
  • Output Style: Native response-box format
  • Status: Full support (reference implementation)

OpenCode

  • Plugin: Handles collection via message.updated and injection via chat.system.transform
  • Skill: /analyze-boxes (native skill distribution)
  • Instructions: Static guidance via response-boxes.md
  • Status: Full support

Windsurf

  • Hook: post_cascade_response for automatic collection
  • Workflow: /response-boxes-start for manual injection
  • Rules: Always-on box taxonomy guidance
  • Status: Enhanced support (automatic collection, manual injection)

Cursor

  • Hook: afterAgentResponse for automatic collection (observation-only)
  • Skill: /response-boxes-context for manual injection
  • Rules: .cursor/rules/response-boxes.mdc
  • Status: Basic support (automatic collection, manual injection)

Optional OpenSkills support

For agents that do not have native skills discovery, the Agent Skills / OpenSkills ecosystem can be used to install SKILL.md files and advertise them via AGENTS.md. For Cursor/Windsurf/OpenCode, OpenSkills is treated as optional; native skills and plugins remain the primary integration path.

OpenCode Setup (Optional, Experimental)

You can connect OpenCode to the same Response Boxes event store used by Claude Code so both agents share boxes and learnings.

1. Prerequisites

  • OpenCode installed and configured on your machine.
  • Agent Response Boxes installed at user scope in full mode (hooks + /analyze-boxes), so the event store exists and analysis can synthesize LearningCreated events.

2. Install the OpenCode plugin

Run the installer with the OpenCode flag (user-level only):

# User-level, full mode (default)
curl -sSL https://raw.githubusercontent.com/AIntelligentTech/agent-response-boxes/main/install.sh \
  | bash -s -- --install-opencode

Notes:

  • The plugin is installed to:
    • ~/.config/opencode/plugin/response-boxes.plugin.ts
  • The plugin is only installed in full mode:
    • If you pass --basic, the installer will skip hooks, skills, analytics, and the OpenCode plugin even when --install-opencode is set.
  • The plugin writes and reads from the same event store as Claude Code:
    • ~/.response-boxes/analytics/boxes.jsonl
  • To temporarily disable the plugin (and other adapters), set:
    • RESPONSE_BOXES_DISABLED=true in your environment before launching OpenCode.

3. Make Response Boxes part of your OpenCode instructions

The plugin handles collection and dynamic context injection, but you still need to teach OpenCode how to write boxes.

Follow OpenCode’s documentation for configuring agent instructions (e.g. AGENTS.md or opencode.json) and include:

  • The Response Boxes rules/spec:
    • outputs/claude/.claude/rules/response-boxes.md (or your project .claude/rules/response-boxes.md)
  • Your Claude configuration snippet:
    • The same content installed from outputs/claude/.claude/config/claude-md-snippet.md into CLAUDE.md

The exact syntax depends on how you configure agents in OpenCode; treat these files as always-on instructions for your Claude-like coding agent.

4. Usage in OpenCode

Once installed and configured:

  1. Open a project in OpenCode and start a coding session.
  2. The agent writes its responses using the Response Box format (Choice, Completion, Assumption, etc.), just as it would in Claude Code.
  3. The OpenCode plugin will:
    • Parse your assistant responses for boxes.
    • Append BoxCreated events into ~/.response-boxes/analytics/boxes.jsonl with context indicating source: "opencode_plugin".
  4. When LearningCreated events exist (typically from running /analyze-boxes in Claude Code), the plugin will:
    • Project Patterns and Recent notable boxes from the event store.
    • Inject a short summary into the system prompt via experimental.chat.system.transform once per OpenCode session.

Analysis and learning synthesis still happens via /analyze-boxes in Claude Code. OpenCode then consumes those learnings and contributes new boxes back into the shared event store.
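
To confirm that OpenCode sessions are actually landing in the shared store, a rough check is to look for the plugin's source marker. This only assumes the event lines mention opencode_plugin somewhere, not any specific field path:

# Count events contributed by the OpenCode plugin, then peek at the newest one
grep -c 'opencode_plugin' ~/.response-boxes/analytics/boxes.jsonl
grep 'opencode_plugin' ~/.response-boxes/analytics/boxes.jsonl | tail -n 1 | jq .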


Troubleshooting

Boxes not being collected

Symptoms: boxes.jsonl not updating after responses

Check:

  1. Hook/plugin is installed: ls ~/.claude/hooks/ or check plugin config
  2. Hook permissions: chmod +x ~/.claude/hooks/*.sh
  3. jq is available: which jq
  4. Hook is registered in ~/.claude/settings.json

Debug:

export DEBUG_RESPONSE_BOXES=1
# Run agent and check stderr

Learnings not injecting

Symptoms: No cross-session context appearing at session start

Check:

  1. Event store has content: wc -l ~/.response-boxes/analytics/boxes.jsonl
  2. Injection hook is registered
  3. BOX_INJECT_DISABLED is not set to true

Debug:

# Manually test injection
~/.claude/hooks/inject-context.sh < /dev/null

Skill not found

Symptoms: /analyze-boxes not recognized

Check:

  1. Skill is installed: ls ~/.claude/skills/analyze-boxes/
  2. Skill file has correct YAML frontmatter
  3. Agent supports skills (Cursor requires Nightly)

Event store corruption

Symptoms: Errors during injection or analysis

Fix:

# Backup and reset
cp ~/.response-boxes/analytics/boxes.jsonl ~/.response-boxes/analytics/boxes.jsonl.bak
echo "" > ~/.response-boxes/analytics/boxes.jsonl

For more troubleshooting, see the Cross-Agent Compatibility Guide.


Documentation

  • Architecture — Technical architecture, design decisions, and data structures
  • Cross-Agent Compatibility — Detailed agent capability matrices and integration patterns
  • Rules — Complete box specifications and usage guidelines (12 box types)
  • Anti-Sycophancy — Research-backed protocol for preventing sycophantic behavior
  • Released artifacts — Installable scaffolding used by the installer
  • Security — Security policy and data handling
  • Support — How to get help or report issues

Contributing

Contributions welcome. Please see CONTRIBUTING.md.

  1. Fork the repository
  2. Create a feature branch
  3. Commit changes
  4. Open a Pull Request

License

MIT — Use freely, attribution appreciated.


Made with care by AIntelligentTech
