feat: observational memory — programmatic extraction + observation block#70

Merged
platypusrex merged 7 commits into main from feat/observational-memory on Feb 23, 2026

@platypusrex

Summary

Adds observational memory (Phase 0) — programmatic extraction of structured observations from tool calls, stored in SQLite, injected into the system prompt. Observations survive compaction because they live in a separate table, not in message history.

Closes #65. Follow-up LLM-based extraction tracked in #69.

What it does

After every tool call, pure-function extractors produce typed observations from the tool name, args, and result. These are stored in SQLite and formatted into an <observations> block injected into the system prompt before each LLM call.

Extracted observations:

| Tool | Type | Importance |
| --- | --- | --- |
| edit_file | file_modified | 7 |
| write_file | file_created | 7 |
| read_file | file_read | 3 |
| run_command (success) | command_run | 4 |
| run_command (failure) | command_error | 8 |
| glob / grep | search_performed | 3 |
| todo_write | todo_updated | 5 |
| task | task_delegated | 5 |
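
As a rough illustration of the mapping in the table above, a pure-function extractor could look like the sketch below. The type names and importance values come from the table; the `Observation` fields, the `ToolEvent` shape, the `extractObservation` name, and the argument keys (`path`, `command`, `pattern`) are assumptions for illustration, not the actual contents of src/memory/extractors.ts.

```typescript
// Hypothetical shapes: the PR names the observation types, importance values,
// and the `source` field, but not the remaining fields or the signature.
type ObservationType =
  | "file_modified" | "file_created" | "file_read"
  | "command_run" | "command_error"
  | "search_performed" | "todo_updated" | "task_delegated";

interface Observation {
  type: ObservationType;
  importance: number;
  summary: string;
  source: "programmatic" | "llm";
}

interface ToolEvent {
  toolName: string;
  args: Record<string, unknown>;
  exitCode?: number; // only meaningful for run_command (assumed)
}

function observe(type: ObservationType, importance: number, summary: string): Observation {
  return { type, importance, summary, source: "programmatic" };
}

// Pure function: same input always yields the same observation, no I/O.
function extractObservation(event: ToolEvent): Observation | null {
  const { toolName, args } = event;
  switch (toolName) {
    case "edit_file":
      return observe("file_modified", 7, String(args["path"]));
    case "write_file":
      return observe("file_created", 7, String(args["path"]));
    case "read_file":
      return observe("file_read", 3, String(args["path"]));
    case "run_command": {
      const exitCode = event.exitCode ?? 0;
      const summary = `${String(args["command"])} -> exit ${exitCode}`;
      return exitCode === 0
        ? observe("command_run", 4, summary)
        : observe("command_error", 8, summary);
    }
    case "glob":
    case "grep":
      return observe("search_performed", 3, `${toolName} ${JSON.stringify(args["pattern"])}`);
    case "todo_write":
      return observe("todo_updated", 5, "todo list updated");
    case "task":
      return observe("task_delegated", 5, "task delegated");
    default:
      return null; // unknown tools produce no observation
  }
}
```

Keeping the extractor free of side effects is what makes it easy to unit test without a database.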

Observation block in system prompt:

<observations>
## Modified Files
- src/agent/index.ts (modified x 3)

## Errors
- Failed: bun test -> exit 1: TypeError...

## Commands
- bun check:types -> exit 0

## Searches
- grep "onToolResult" -> 3 matches
</observations>
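
A block like the one above could be assembled roughly as follows. This is a minimal sketch, not the code in src/memory/working-memory.ts: the `Obs` shape and the `buildObservationBlock` name are hypothetical, and always printing the `(modified x N)` suffix (even for N = 1) is an assumption.

```typescript
// Hypothetical input shape for illustration only.
interface Obs {
  type: "file_modified" | "command_error" | "command_run" | "search_performed";
  summary: string;
}

function buildObservationBlock(observations: Obs[]): string | null {
  const modified = new Map<string, number>(); // path -> edit count (dedup)
  const errors: string[] = [];
  const commands: string[] = [];
  const searches: string[] = [];

  for (const o of observations) {
    if (o.type === "file_modified") {
      modified.set(o.summary, (modified.get(o.summary) ?? 0) + 1);
    } else if (o.type === "command_error") {
      errors.push(`- ${o.summary}`);
    } else if (o.type === "command_run") {
      commands.push(`- ${o.summary}`);
    } else {
      searches.push(`- ${o.summary}`);
    }
  }

  const fileLines: string[] = [];
  modified.forEach((count, path) => fileLines.push(`- ${path} (modified x ${count})`));

  // Only emit sections that have at least one entry.
  const sections: string[] = [];
  const add = (title: string, lines: string[]): void => {
    if (lines.length > 0) sections.push([`## ${title}`, ...lines].join("\n"));
  };
  add("Modified Files", fileLines);
  add("Errors", errors);
  add("Commands", commands);
  add("Searches", searches);

  // Returning null lets the prompt layer skip the block entirely.
  if (sections.length === 0) return null;
  return `<observations>\n${sections.join("\n\n")}\n</observations>`;
}
```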

Key design decisions:

  • SystemPromptContext.observationBlock field (not string concatenation) — prompt content belongs in the prompt layer
  • source: "programmatic" | "llm" on the Observation type, which future-proofs for LLM extraction (tracked in #69)
  • 2000 token budget (not 800) — target models have 200K-1M+ context windows
  • try/catch around observation storage — never crashes the agent loop
  • Args captured at onToolCall time via Map, not looked up from store at result time
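
The last two decisions can be sketched together. The example below is an assumption-laden illustration rather than the code in use-agent-submit.ts: `createObservationHooks`, the callback signatures, and the injected `save` and `logError` functions are hypothetical; only the `argsByIndex` Map, the `onToolCall` capture point, and the try/catch guard come from the PR.

```typescript
type ToolArgs = Record<string, unknown>;

// `save` stands in for "extract + store in SQLite"; `logError` for debug logging.
function createObservationHooks(
  save: (toolName: string, args: ToolArgs, result: unknown) => void,
  logError: (err: unknown) => void,
) {
  // Args are captured when the call is issued, keyed by tool-call index.
  const argsByIndex = new Map<number, ToolArgs>();

  return {
    onToolCall(index: number, args: ToolArgs): void {
      argsByIndex.set(index, args);
    },
    onToolResult(index: number, toolName: string, result: unknown): void {
      const args = argsByIndex.get(index); // O(1) lookup, no store scan
      argsByIndex.delete(index);
      if (args === undefined) return;
      try {
        save(toolName, args, result);
      } catch (err) {
        // Observational memory is non-critical: log and keep the agent loop alive.
        logError(err);
      }
    },
  };
}
```

Deleting the entry after use keeps the Map from growing across a long session.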

Files changed

New (6):

  • src/memory/types.ts — Observation type definitions
  • src/memory/store.ts — SQLite persistence layer
  • src/memory/extractors.ts — Programmatic extractors for 8 tool types
  • src/memory/working-memory.ts — Observation block builder with dedup + token budget
  • tests/test-memory.ts — 37 unit tests
  • tests/test-memory-integration.ts — 7 integration tests (full pipeline)
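
The PR does not spell out how the block builder enforces its token budget. One plausible approach, shown here purely as a sketch, is to rank candidate lines by importance and keep as many as fit. The `RankedLine` shape, the `fitToBudget` name, and the four-characters-per-token estimate are all assumptions.

```typescript
// Hypothetical shape: one rendered line plus the importance of its observation.
interface RankedLine {
  text: string;
  importance: number;
}

// Crude estimate (assumption): roughly 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function fitToBudget(lines: RankedLine[], budgetTokens: number = 2000): RankedLine[] {
  // Rank by importance (highest first), breaking ties by original order.
  const ranked = lines
    .map((line, index) => ({ line, index }))
    .sort((a, b) => b.line.importance - a.line.importance || a.index - b.index);

  const kept: Array<{ line: RankedLine; index: number }> = [];
  let used = 0;
  for (const entry of ranked) {
    const cost = estimateTokens(entry.line.text);
    if (used + cost > budgetTokens) continue; // skip lines that would overflow
    used += cost;
    kept.push(entry);
  }

  // Restore original ordering so the rendered block stays stable.
  return kept.sort((a, b) => a.index - b.index).map((entry) => entry.line);
}
```

Under this scheme a low-importance file_read line is dropped before a command_error line when space runs out.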

Modified (7):

  • src/session/migrations.ts: Migration v5 (observations table)
  • src/session/db.ts: setDatabaseForTesting for test injection
  • src/agent/prompts/shared.ts: observationBlock on SystemPromptContext
  • src/agent/prompts/build.ts + plan.ts: Include observation block
  • src/agent/index.ts: observationBlock on RunAgentArgs, debug logging
  • src/tui/hooks/use-agent-submit.ts: Extract observations + build block

Testing

  • 44 tests, 130 assertions — all passing
  • Unit tests: extractors (13), store (10), block builder (10), edge cases (4)
  • Integration tests: full pipeline extraction-to-prompt (7) covering block placement, dedup, session isolation, read omission, denied/blocked handling, both build + plan modes
  • Manual test: confirmed observations stored in SQLite and system prompt grew by expected amount on second turn

Out of scope

…ractors, builder, tests

Implements Phase 0 of observational memory (issue #65):
- src/memory/types.ts: Observation type with source field for LLM fast-follow
- src/memory/store.ts: SQLite-backed observation persistence
- src/memory/extractors.ts: Programmatic extractors for 8 tool types
- src/memory/working-memory.ts: Observation block builder with dedup + token budget
- src/session/migrations.ts: Migration v5 adds observations table
- src/session/db.ts: setDatabaseForTesting helper for test injection
- tests/test-memory.ts: 37 tests covering extractors, store, and builder
…ompt

Integration layer:
- SystemPromptContext gains observationBlock field (shared.ts)
- Build and plan mode prompts include observation block after env block
- RunAgentArgs accepts observationBlock, passes to SystemPromptContext
- use-agent-submit extracts observations after each tool result
- use-agent-submit builds observation block before each runAgent call
- Fixed lint: removed unused ObservationType import, replaced non-null assertion
…bservation storage

W1: Use argsByIndex map to capture tool args at onToolCall time instead
of fragile store.getPendingDisplayMessages() lookup. O(1) vs O(n), no
coupling to store internals.

W2: Wrap observation extraction + storage in try/catch so SQLite errors
cannot crash the agent loop. Observational memory is non-critical.
Logs observation block char count to debug.log when present,
making it visible during manual testing.
…mpt pipeline

7 integration tests proving observations flow through the complete pipeline:
extractors → SQLite store → block builder → SystemPromptContext → system prompt.

Validates block placement, deduplication, session isolation, read omission,
denied/blocked handling, and both build + plan mode prompts.
@changeset-bot

changeset-bot bot commented Feb 23, 2026

🦋 Changeset detected

Latest commit: 0a0e8a5

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| olliecode | Minor |

Apply MAX_ERRORS=10 cap at render time so error-heavy sessions don't
blow up the observation block. Fix blank line in build/plan prompts
when observationBlock is null. Document staleness limitation in plan doc.
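
The two fixes in this commit can be illustrated with a short sketch. `MAX_ERRORS = 10` and the null `observationBlock` case come from the commit message; the helper names `renderErrors` and `joinPromptSections`, and the choice to keep the most recent errors rather than the oldest, are assumptions.

```typescript
const MAX_ERRORS = 10;

// Cap applied at render time: stored errors are untouched, only the block shrinks.
// Keeping the most recent entries is an assumption, not confirmed by the PR.
function renderErrors(errors: string[]): string[] {
  return errors.slice(-MAX_ERRORS).map((e) => `- ${e}`);
}

// Drop null/empty sections before joining so a missing observation block
// does not leave a stray blank line in the prompt.
function joinPromptSections(sections: Array<string | null>): string {
  return sections
    .filter((s): s is string => s !== null && s.length > 0)
    .join("\n\n");
}
```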
@platypusrex platypusrex merged commit 3e03601 into main Feb 23, 2026
1 check passed