feat: observational memory — programmatic extraction + observation block by platypusrex · Pull Request #70 · ollielabs/olliecode

platypusrex · 2026-02-23T05:13:05Z

Summary

Adds observational memory (Phase 0) — programmatic extraction of structured observations from tool calls, stored in SQLite, injected into the system prompt. Observations survive compaction because they live in a separate table, not in message history.

Closes #65. Follow-up LLM-based extraction tracked in #69.

What it does

After every tool call, pure-function extractors produce typed observations from the tool name, args, and result. These are stored in SQLite and formatted into an <observations> block injected into the system prompt before each LLM call.

Extracted observations:

Tool	Type	Importance
`edit_file`	`file_modified`	7
`write_file`	`file_created`	7
`read_file`	`file_read`	3
`run_command` (success)	`command_run`	4
`run_command` (failure)	`command_error`	8
`glob` / `grep`	`search_performed`	3
`todo_write`	`todo_updated`	5
`task`	`task_delegated`	5
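
A rough sketch of how the mapping in this table could be written as a pure function. The `ToolResult` shape, the `summary` field, and the name `extractObservation` are assumptions for illustration; the actual definitions are in `src/memory/types.ts` and `src/memory/extractors.ts` and may differ.

```typescript
// Hypothetical shapes for illustration; the real definitions live in src/memory/types.ts.
type ObservationType =
  | "file_modified" | "file_created" | "file_read" | "command_run"
  | "command_error" | "search_performed" | "todo_updated" | "task_delegated";

interface Observation {
  type: ObservationType;
  importance: number;
  summary: string; // assumed free-text field
  source: "programmatic" | "llm";
}

// Assumed result shape: a success flag plus raw text output.
interface ToolResult {
  ok: boolean;
  output: string;
}

// Tools whose observation type does not depend on the result.
const STATIC_RULES = new Map<string, { type: ObservationType; importance: number }>([
  ["edit_file", { type: "file_modified", importance: 7 }],
  ["write_file", { type: "file_created", importance: 7 }],
  ["read_file", { type: "file_read", importance: 3 }],
  ["glob", { type: "search_performed", importance: 3 }],
  ["grep", { type: "search_performed", importance: 3 }],
  ["todo_write", { type: "todo_updated", importance: 5 }],
  ["task", { type: "task_delegated", importance: 5 }],
]);

// Pure function: same inputs always give the same observation, no I/O.
function extractObservation(
  tool: string,
  args: Record<string, unknown>,
  result: ToolResult,
): Observation | null {
  const summary = `${tool} ${JSON.stringify(args)}`;
  if (tool === "run_command") {
    // run_command is the only tool in the table whose type depends on success.
    return result.ok
      ? { type: "command_run", importance: 4, summary, source: "programmatic" }
      : { type: "command_error", importance: 8, summary, source: "programmatic" };
  }
  const rule = STATIC_RULES.get(tool);
  if (rule === undefined) return null; // unknown tools produce no observation
  return { type: rule.type, importance: rule.importance, summary, source: "programmatic" };
}
```

Keeping the extractor free of side effects means it can be unit tested without a database, which matches the split between extractor tests and store tests listed under Testing.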

Observation block in system prompt:

<observations>
## Modified Files
- src/agent/index.ts (modified x 3)

## Errors
- Failed: bun test -> exit 1: TypeError...

## Commands
- bun check:types -> exit 0

## Searches
- grep "onToolResult" -> 3 matches
</observations>
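
The block above could be produced by a builder along these lines. This is a sketch under assumptions, not the code in `src/memory/working-memory.ts`: the `Obs` shape, the 4-characters-per-token estimate, the importance-first ordering, and the name `buildObservationBlock` are all hypothetical.

```typescript
// Hypothetical observation shape for this sketch.
interface Obs {
  type: string;
  summary: string;
  importance: number;
}

// Section order mirrors the sample block; types not listed (e.g. file_read) are omitted.
const SECTIONS: ReadonlyArray<readonly [string, string]> = [
  ["file_modified", "Modified Files"],
  ["command_error", "Errors"],
  ["command_run", "Commands"],
  ["search_performed", "Searches"],
];

// Assumption: a crude estimate of roughly 4 characters per token.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function buildObservationBlock(observations: Obs[], tokenBudget = 2000): string | null {
  // Dedup: collapse repeats of the same (type, summary) pair into one entry with a count.
  const deduped = new Map<string, { obs: Obs; count: number }>();
  for (const obs of observations) {
    const key = `${obs.type}\n${obs.summary}`;
    const entry = deduped.get(key);
    if (entry) entry.count += 1;
    else deduped.set(key, { obs, count: 1 });
  }

  // Spend the budget on the most important observations first.
  const ranked = Array.from(deduped.values()).sort((a, b) => b.obs.importance - a.obs.importance);
  const linesByType = new Map<string, string[]>();
  let used = 0;
  for (const { obs, count } of ranked) {
    if (!SECTIONS.some(([type]) => type === obs.type)) continue;
    const suffix = obs.type === "file_modified" && count > 1 ? ` (modified x ${count})` : "";
    const line = `- ${obs.summary}${suffix}`;
    const cost = estimateTokens(line);
    if (used + cost > tokenBudget) break;
    used += cost;
    linesByType.set(obs.type, [...(linesByType.get(obs.type) ?? []), line]);
  }

  // Render sections in a fixed order; return null when nothing fits or nothing qualifies.
  const rendered: string[] = [];
  for (const [type, title] of SECTIONS) {
    const lines = linesByType.get(type);
    if (lines) rendered.push(`## ${title}\n${lines.join("\n")}`);
  }
  return rendered.length > 0 ? `<observations>\n${rendered.join("\n\n")}\n</observations>` : null;
}
```

Returning `null` rather than an empty string lets the prompt layer skip the block entirely when there is nothing to report.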

Key design decisions:

SystemPromptContext.observationBlock field (not string concatenation) — prompt content belongs in the prompt layer
source: "programmatic" | "llm" on the Observation type, which future-proofs for LLM extraction (#69)
2000 token budget (not 800) — target models have 200K-1M+ context windows
try/catch around observation storage — never crashes the agent loop
Args captured at onToolCall time via Map, not looked up from store at result time
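
The last two decisions can be sketched together. The names `onToolCall`, `onToolResult`, and `argsByIndex` come from this PR; the `ObservationSink` interface, `createToolHooks`, and the hook signatures are hypothetical stand-ins for whatever `src/tui/hooks/use-agent-submit.ts` actually does.

```typescript
type ToolArgs = Record<string, unknown>;

// Hypothetical storage interface; stands in for extraction plus the SQLite store.
interface ObservationSink {
  save(tool: string, args: ToolArgs, result: string): void;
}

function createToolHooks(sink: ObservationSink, log: (msg: string) => void) {
  // Args are remembered when the call starts, keyed by tool-call index,
  // so the result handler does not need to search any message store.
  const argsByIndex = new Map<number, ToolArgs>();

  return {
    onToolCall(index: number, args: ToolArgs): void {
      argsByIndex.set(index, args);
    },
    onToolResult(index: number, tool: string, result: string): void {
      const args = argsByIndex.get(index);
      argsByIndex.delete(index); // each entry is used once
      if (args === undefined) return;
      try {
        sink.save(tool, args, result);
      } catch (err) {
        // Observational memory is non-critical: log and carry on.
        log(`observation storage failed: ${String(err)}`);
      }
    },
  };
}
```

With this shape, a storage error ends up in the log and the agent loop continues with the next tool call.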

Files changed

New (6):

src/memory/types.ts — Observation type definitions
src/memory/store.ts — SQLite persistence layer
src/memory/extractors.ts — Programmatic extractors for 8 tool types
src/memory/working-memory.ts — Observation block builder with dedup + token budget
tests/test-memory.ts — 37 unit tests
tests/test-memory-integration.ts — 7 integration tests (full pipeline)

Modified (7):

src/session/migrations.ts — Migration v5: observations table
src/session/db.ts — setDatabaseForTesting for test injection
src/agent/prompts/shared.ts — observationBlock on SystemPromptContext
src/agent/prompts/build.ts + plan.ts — Include observation block
src/agent/index.ts — observationBlock on RunAgentArgs, debug logging
src/tui/hooks/use-agent-submit.ts — Extract observations + build block

Testing

44 tests, 130 assertions — all passing
Unit tests: extractors (13), store (10), block builder (10), edge cases (4)
Integration tests: full pipeline extraction-to-prompt (7) covering block placement, dedup, session isolation, read omission, denied/blocked handling, both build + plan modes
Manual test: confirmed observations stored in SQLite and system prompt grew by expected amount on second turn

Out of scope

LLM-based extraction (tracked in #69)
Staleness detection / cross-session persistence
FTS5 / semantic retrieval

…ractors, builder, tests

Implements Phase 0 of observational memory (issue #65):
- src/memory/types.ts: Observation type with source field for LLM fast-follow
- src/memory/store.ts: SQLite-backed observation persistence
- src/memory/extractors.ts: Programmatic extractors for 8 tool types
- src/memory/working-memory.ts: Observation block builder with dedup + token budget
- src/session/migrations.ts: Migration v5 adds observations table
- src/session/db.ts: setDatabaseForTesting helper for test injection
- tests/test-memory.ts: 37 tests covering extractors, store, and builder

…ompt

Integration layer:
- SystemPromptContext gains observationBlock field (shared.ts)
- Build and plan mode prompts include observation block after env block
- RunAgentArgs accepts observationBlock, passes to SystemPromptContext
- use-agent-submit extracts observations after each tool result
- use-agent-submit builds observation block before each runAgent call
- Fixed lint: removed unused ObservationType import, replaced non-null assertion

…bservation storage

W1: Use argsByIndex map to capture tool args at onToolCall time instead of fragile store.getPendingDisplayMessages() lookup. O(1) vs O(n), no coupling to store internals.

W2: Wrap observation extraction + storage in try/catch so SQLite errors cannot crash the agent loop. Observational memory is non-critical.

…mpt pipeline

7 integration tests proving observations flow through the complete pipeline: extractors → SQLite store → block builder → SystemPromptContext → system prompt. Validates block placement, deduplication, session isolation, read omission, denied/blocked handling, and both build + plan mode prompts.

changeset-bot · 2026-02-23T05:13:10Z

🦋 Changeset detected

Latest commit: 0a0e8a5

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

Name	Type
olliecode	Minor

platypusrex added 6 commits February 22, 2026 22:50

feat(memory): add debug logging when observation block is injected

3959be4

Logs observation block char count to debug.log when present, making it visible during manual testing.

chore: add changeset for observational memory feature

df8c28d

fix(memory): cap errors at 10, eliminate blank line when no observations

0a0e8a5

Apply MAX_ERRORS=10 cap at render time so error-heavy sessions don't blow up the observation block. Fix blank line in build/plan prompts when observationBlock is null. Document staleness limitation in plan doc.
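
Both fixes in this commit are small enough to sketch. `MAX_ERRORS` is the constant named above; `capErrors`, `joinPromptSections`, and the choice to keep the most recent errors are assumptions for illustration, since the commit message does not say which ten are kept.

```typescript
const MAX_ERRORS = 10;

// Cap at render time: stored errors are untouched, only the rendered list is trimmed.
// Assumption: the most recent entries are the ones worth keeping.
function capErrors(errorLines: string[]): string[] {
  return errorLines.slice(-MAX_ERRORS);
}

// Join prompt sections, skipping null or empty ones so a missing
// observation block does not leave a stray blank line in the prompt.
function joinPromptSections(sections: Array<string | null>): string {
  return sections
    .filter((section): section is string => section !== null && section.length > 0)
    .join("\n\n");
}
```

For example, `joinPromptSections(["base", null, "tail"])` yields the two sections separated by a single blank line, with no gap where the observation block would have been.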

platypusrex merged commit 3e03601 into main Feb 23, 2026
1 check passed

This was referenced Feb 26, 2026

feat: Observational Memory v2 — Observer/Reflector context management #69

Open

chore(memory): Phase 4 — Remove programmatic extraction and evaluate compaction removal #74

Open

platypusrex merged 7 commits into main from feat/observational-memory
