Your agent's memory shouldn't be a markdown file.
ShadowDB is an easy-to-install memory plugin for OpenClaw that replaces flat files with a real database — semantic search, fuzzy matching, and a memory that gets smarter over time instead of bloating.
Built by an agent, for agents.
Gives your agent a persistent memory it can search, write, update, and delete — instead of flat markdown files that get shoved into every prompt. Works with Postgres (recommended), SQLite, or MySQL.
Why this matters: Most agent frameworks inject your agent's entire identity — personality, rules, preferences, everything — into every single API call. That's ~9,000 bytes of static text the model already read, re-sent every turn, wasting tokens and pushing out conversation history. ShadowDB adds zero extra tokens to the prompt. The agent searches for what it needs, when it needs it. Everything else stays in the database, not the prompt.
| Tool | Does |
|---|---|
| `memory_search` | Find relevant records (semantic + keyword + fuzzy) |
| `memory_get` | Read a full record |
| `memory_write` | Save something new |
| `memory_update` | Edit an existing record |
| `memory_delete` | Soft-delete (reversible for 30 days) |
| `memory_undelete` | Undo a delete |
curl -fsSL https://raw.githubusercontent.com/jamesdwilson/Sh4d0wDB/main/setup.sh | bash

That's it. The script downloads only the files you need, sets up the database, installs dependencies, wires the plugin into OpenClaw, and restarts the gateway. Run the same command again to update.
Or just tell your agent — it can run the command itself. The script auto-detects non-interactive mode and defaults to SQLite with zero prompts. Pass --backend postgres or --backend mysql to override.
ShadowDB runs everywhere OpenClaw runs. The install script is plain bash — no exotic tooling.
| Platform | Works? | Notes |
|---|---|---|
| macOS | ✅ | Primary development platform. Just works. |
| Linux | ✅ | Servers, Raspberry Pi, VPS — all good. |
| Windows (WSL2) | ✅ | OpenClaw requires WSL2 on Windows. Our bash script runs natively inside WSL. Same ~/.openclaw/ path as Linux. |
ShadowDB is just TypeScript files dropped into your OpenClaw plugins directory. No global installs, no system-level changes. Here's exactly what happens:

- Plugin files → `~/.openclaw/plugins/memory-shadowdb/`: the `.ts` source files, plugin manifest, and `package.json`.
- Core dependencies → `npm install` inside the plugin directory. Two packages: `@sinclair/typebox` (config schema) and `openai` (embedding API client). These live in the plugin's own `node_modules/`, not globally.
- Your database driver (only the one you picked). Also installed inside the plugin's `node_modules/`. Nothing global.
  - Postgres → `pg`
  - SQLite → `better-sqlite3` + `sqlite-vec`
  - MySQL → `mysql2`
- System dependencies. The setup script checks for these and tells you if anything's missing:
  - A database server (Postgres or MySQL), unless you chose SQLite, which runs in-process with no server.
  - Ollama (optional), for local embeddings. Semantic search works without it if you configure an API-based embedding provider.
  - Node.js. OpenClaw already requires this, so you have it.
Everything lives inside ~/.openclaw/plugins/memory-shadowdb/. Nothing is installed globally. Nothing touches your system paths. Uninstall removes the directory and you're clean.
Breathe. Nothing was lost.
ShadowDB doesn't delete, overwrite, or modify any of your original files. Here's exactly what the install touched — and how to undo every bit of it:
What install changed (and what it didn't)
| What | Where | Reversible? |
|---|---|---|
| Downloaded plugin files | `~/.openclaw/plugins/memory-shadowdb/` | ✅ Moved to trash on uninstall |
| Added a config entry | `plugins.entries.memory-shadowdb` in `openclaw.json` | ✅ Removed on uninstall |
| Set the memory slot | `plugins.slots.memory` in `openclaw.json` | ✅ Cleared on uninstall |
| Backed up your config first | `~/OpenClaw-Before-ShadowDB-[install date]/openclaw.json` | Your original config, untouched |
| Created a database | `shadow` (Postgres/MySQL) or `shadow.db` (SQLite) | ✅ Kept on uninstall (your data is yours) |
| Imported workspace `.md` files as memories | Rows in the `memories` table | ✅ Kept on uninstall, originals untouched |
| Imported `PRIMER.md` / `ALWAYS.md` | Rows in the `primer` table | ✅ Kept on uninstall, originals untouched |

- ❌ Did not delete or rename any `.md` files
- ❌ Did not modify `MEMORY.md`, `SOUL.md`, `IDENTITY.md`, or any other workspace file
- ❌ Did not change your agent's system prompt
- ❌ Did not touch any other plugin's config
Your original markdown files are still exactly where you left them.
One command. Same script, different flag:
curl -fsSL https://raw.githubusercontent.com/jamesdwilson/Sh4d0wDB/main/setup.sh | bash -s -- --uninstall

This moves the plugin files to your system trash (macOS Trash, GNOME Trash, or a recovery folder if no trash is available), removes the config entry, restarts OpenClaw, and you're back to your original setup. Your database and all its records are kept. If you reinstall later, everything will still be there.
Your original openclaw.json is saved at ~/OpenClaw-Before-ShadowDB-[install date]/openclaw.json — easy to find, impossible to miss.
Design principle: ShadowDB will never delete a file, drop a database, or remove anything that can't be put back. Not because we forgot, but because we specifically chose not to. Even uninstall moves files to your system trash instead of running `rm -rf`. Your data stays unless you empty the trash.
Or tell your agent — same as install, it knows what to do.
Records don't expire. A phone number from 3 months ago is still a phone number. A project status from 3 months ago probably isn't current — but that's a judgment call, not something the database should guess at.
ShadowDB gives the agent two pieces of information and lets it decide:

- Age in snippets: search results show `[topic] | 5d ago` instead of a raw timestamp. The agent reads "5 days ago" the same way you would. This matters because models are bad at date math: ask one to compute "how many days between Feb 10 and Feb 15" and it'll confidently say 3 or 6. Pre-computing the age removes that failure mode.
- Recency as a tiebreaker: newer records get a small ranking boost (weight: `0.15`), but a relevant old record still beats a vaguely relevant new one.
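As a rough illustration of the pre-computed age label, here is a minimal sketch. The `[topic] | 5d ago` shape comes from the text above; the function names and the minute/hour/day thresholds are assumptions, not ShadowDB's actual code.

```typescript
// Hypothetical helper: turns a creation time into a short relative age.
// The unit thresholds (minutes, hours, days) are assumptions.
function formatAge(createdAt: Date, now: Date = new Date()): string {
  const ms = Math.max(0, now.getTime() - createdAt.getTime());
  const minutes = Math.floor(ms / 60000);
  if (minutes < 60) return `${minutes}m ago`;
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h ago`;
  const days = Math.floor(hours / 24);
  return `${days}d ago`;
}

// Builds a snippet header in the "[topic] | 5d ago" shape.
function snippetHeader(topic: string, createdAt: Date, now: Date = new Date()): string {
  return `[${topic}] | ${formatAge(createdAt, now)}`;
}
```

The date math happens once, in code, so the model only ever reads the finished label.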
Deletes are always reversible for 30 days. After that, automatic cleanup kicks in — but even then, expired records are exported to a JSON file and moved to your system trash before being removed from the database. There is no hard-delete tool — the agent can never permanently destroy data. Only time can, and even time leaves a receipt.
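The soft-delete lifecycle can be pictured as a small state check. This is a sketch under assumptions: `deletedAt` mirrors the `deleted_at` column, `purgeAfterDays` mirrors the `writes.retention.purgeAfterDays` setting (where `0` means keep forever), and the function and state names are invented for illustration.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

type Lifecycle = "active" | "recoverable" | "expired";

// deletedAt === null means the record was never deleted.
// purgeAfterDays === 0 means soft-deleted records are kept forever.
function lifecycle(deletedAt: Date | null, now: Date, purgeAfterDays: number = 30): Lifecycle {
  if (deletedAt === null) return "active";
  if (purgeAfterDays === 0) return "recoverable";
  const ageMs = now.getTime() - deletedAt.getTime();
  return ageMs < purgeAfterDays * DAY_MS ? "recoverable" : "expired";
}
```

In this sketch, a record in the `expired` state is the one that would be exported to JSON and moved to the trash before its row is removed.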
Why not something more complex?
| Idea | Why we skipped it |
|---|---|
| Staleness markers | created_at already tells you how old it is |
| "Superseded by" pointers | Just delete the old one and write the new one |
| Access frequency tracking | Creates feedback loops; popular ≠ good |
| Auto-contradiction detection | Similarity ≠ contradiction; false positives everywhere |
| Dedup on write | Blocks legitimate updates and related-but-different facts |
The principle: if the guardrails are more complex than the feature, you've lost the trade.
Hybrid ranking with multiple signals
Every search combines multiple signals to find the best matches. What's available depends on your backend:
| Signal | Postgres | SQLite | MySQL | What it measures |
|---|---|---|---|---|
| Vector similarity | ✓ (weight: `0.7`) | ✓ (sqlite-vec) | ✓ (9.2+) | Semantic meaning via embeddings |
| Full-text search | ✓ (weight: `0.3`) | ✓ (FTS5) | ✓ (FULLTEXT) | Keyword/phrase matches |
| Trigram similarity | ✓ (weight: `0.2`) | ✓ (FTS5 trigram) | ✓ (ngram parser) | Fuzzy/substring matching |
| Recency boost | ✓ (weight: `0.15`) | ✓ | ✓ | Newer records boosted slightly |
With Postgres, signals are merged via Reciprocal Rank Fusion (RRF) — each signal produces a ranked list, and RRF combines them without needing score normalization. All weights are configurable.
Recency is intentionally low — it's a tiebreaker, not a dominant signal.
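To make the fusion step concrete, here is a minimal sketch of weighted Reciprocal Rank Fusion using the default weights listed above. It is not ShadowDB's query code: the constant `k = 60` is the value commonly used for RRF, and treating recency as its own ranked list is an assumption.

```typescript
// One ranked list per signal; ids are ordered best-first.
type RankedList = { weight: number; ids: number[] };

// Weighted RRF: score(id) = sum over lists of weight / (k + rank), rank starting at 1.
// Only ranks are used, so raw scores never need normalizing.
function fuseRrf(lists: RankedList[], k: number = 60): Array<{ id: number; score: number }> {
  const scores = new Map<number, number>();
  for (const { weight, ids } of lists) {
    ids.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score || a.id - b.id);
}

const fused = fuseRrf([
  { weight: 0.7, ids: [11, 42, 7] },   // vector similarity
  { weight: 0.3, ids: [42, 11] },      // full-text search
  { weight: 0.2, ids: [42, 99] },      // trigram similarity
  { weight: 0.15, ids: [7, 42, 11] },  // recency
]);
// fused order: 42, 11, 7, 99
```

Record 42 wins even though the vector signal ranked it second, because three other signals also found it. Record 7 tops the recency list but stays third, which is the tiebreaker behavior described above.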
Benchmarks, token economics, and why flat files have a ceiling
All benchmarks measured on a MacBook Pro M3 Max against a real production knowledge base (6,800+ records, 768-dim embeddings). ShadowDB numbers are from the live system.
The three memory systems:
| | OpenClaw Builtin | QMD | ShadowDB |
|---|---|---|---|
| What it is | Flat `.md` files + SQLite embedding index | External CLI sidecar (BM25 + vectors + reranking) | Database plugin (Postgres, SQLite, or MySQL) |
| Source of truth | Markdown files | Markdown files (QMD indexes them) | The database |
| Search | Embedding similarity only | BM25 + vector + reranker | FTS + vector + trigram + recency (RRF) |
| Identity delivery | Static files loaded every turn | Static files loaded every turn | Primer table: injected once, cached |
| Write model | Agent writes `.md` files | Agent writes `.md` files, QMD re-indexes | `memory_write` → DB row (instant, searchable) |
| Agent can create memories | | | ✅ Native `memory_write` tool |
What this compares: how your agent finds information. With flat files (Builtin and QMD), everything gets loaded into every prompt, the model digs through it, and then sends what it found back down — you're paying three times (tokens up, attention wasted, tokens back) for something a database does in one step. QMD improves the search over those files, but doesn't change the underlying architecture. ShadowDB replaces the architecture: the database finds what's relevant first, and the model only sees what matters.
| Operation | OpenClaw Builtin | QMD | ShadowDB (Postgres) |
|---|---|---|---|
| Load identity + knowledge | 45ms (read 8 files) | 45ms (same files, QMD only handles search) | 0ms (primer already in prompt) |
| Keyword search ("Watson") | ❌ Embedding-only | BM25 ✅ | 55ms FTS ✅ |
| Semantic search ("Watson's military service") | 200–500ms (embedding only) | ~200ms (vector + reranker) | 230ms (FTS + vector + trigram + RRF) |
| Fuzzy/typo search ("Watsn") | ❌ Not supported | ❌ Not supported | 60ms trigram ✅ |
| Search cold start | 1–3s (load embedding model) | 2–10s (may download GGUF models on first query) | 55ms (FTS always hot, PG always running) |
| Sub-agent identity load | ∞ (filtered out) | ∞ (filtered out — same file system) | <1ms (primer injection) |
QMD significantly improves search quality over the builtin (BM25 + reranking is a real upgrade), but it doesn't change the file-based architecture. Identity files still get loaded every turn. Sub-agents still can't access personality. The token waste problem remains.
| Dimension | OpenClaw Builtin | QMD | ShadowDB |
|---|---|---|---|
| Max knowledge base size | ~500 items before MEMORY.md hits 20K char truncation. Middle of file silently dropped. | Same files, better search over them. Still limited by what fits in `.md` files. | No limit. PostgreSQL handles billions of rows with HNSW + GIN indexes. |
| Max identity complexity | ~3,000 bytes in SOUL.md before it eats your context budget. | Same — QMD doesn't change identity delivery. | No limit. Primer table delivers identity once per session. 50 personality rows cost 0 bytes on turns 2+. |
| Max file size before degradation | 20,000 chars per file → 70% head / 20% tail truncation. The middle of your SOUL.md? Gone. | Same truncation — QMD indexes files but doesn't change how OpenClaw loads them. | N/A. No files to degrade. Content is ranked by relevance. |
| Max concurrent agents | 10 sub-agents = 10× bootstrap reads. | Same — each agent still reads the same files. | Shared database. Connection pooling, MVCC, concurrent reads. |
| Search strategies | 1 (embedding similarity). Miss = gone. | 2–3 (BM25 + vector + optional reranker). Significant improvement. | 4 fused via RRF. FTS + vector + trigram + recency. If one misses, the others catch it. |
| Context budget ceiling | Fixed. 200 turns × 2,300 tokens = 460,000 tokens on static files. | Same — QMD doesn't reduce per-turn injection. | 0. Primer injection is optional and off by default. |
| Growth trajectory | 📉 Inverse. More knowledge = less capability. | 📉 Same trajectory, better search within it. | 📈 Linear. More knowledge = smarter agent. |
The fundamental difference: Builtin and QMD both have a ceiling that gets lower as your agent gets smarter. ShadowDB has no ceiling.
xychart-beta
title "Token Waste: OpenClaw Builtin vs ShadowDB (cumulative, 200-turn conversation)"
x-axis "Conversation Turn" [1, 25, 50, 75, 100, 125, 150, 175, 200]
y-axis "Cumulative Wasted Tokens" 0 --> 500000
bar [2300, 57500, 115000, 172500, 230000, 287500, 345000, 402500, 460000]
line [0, 0, 0, 0, 0, 0, 0, 0, 0]
xychart-beta
title "Knowledge vs Capability"
x-axis "Knowledge Base Size (records)" [100, 500, 1000, 5000, 10000, 50000]
y-axis "Agent Capability %" 0 --> 120
line "OpenClaw Builtin" [100, 90, 70, 30, 5, 0]
line "ShadowDB" [80, 90, 95, 100, 105, 110]
| Metric | OpenClaw Builtin | QMD | ShadowDB Postgres | ShadowDB SQLite | ShadowDB MySQL | Unit |
|---|---|---|---|---|---|---|
| Context Overhead | ||||||
| Identity injection (turn 1) | 9,198 | 9,198 | 0¹ | 0¹ | 0¹ | bytes |
| Identity injection (turns 2+) | 9,198 | 9,198 | 0¹ | 0¹ | 0¹ | bytes |
| Static tokens per turn (avg) | ~2,300 | ~2,300 | 0¹ | 0¹ | 0¹ | tokens |
| Reduction vs Builtin | — | 0% | 100% | 100% | 100% | |
| Search Latency | ||||||
| Full hybrid query (warm) | — | ~200 | 230 | ~300 | ~250 | ms |
| FTS/BM25-only query | — | ~100 | 55 | ~30³ | ~40³ | ms |
| Trigram/fuzzy query | — | ❌ | 60 | ~35 | ~45 | ms |
| Vector-only query (warm) | ~200–500⁵ | ~150 | 185 | ~250 | ~200⁴ | ms |
| Embedding generation | Varies | Built-in (GGUF) | 85 (Ollama) | 85 | 85 | ms |
| Cold start | 1–3s | 2–10s⁹ | 55ms | ~100ms | ~100ms | |
| Search Quality | ||||||
| Search type | Embedding similarity | BM25 + vector + reranker | Hybrid 4-signal RRF | FTS5 + trigram + vec | FULLTEXT + ngram + vec | |
| Exact name match ("Dr. Watson") | | ✅ BM25 | ✅ Exact (FTS) + semantic | ✅ Exact (FTS5) | ✅ Exact (FULLTEXT) | |
| Semantic query ("Watson's military service") | | ✅ Vector + reranker | ✅ Vector catches semantics | ✅ Vector + FTS5 | ✅ Vector + FULLTEXT | |
| Fuzzy/typo query ("Watsn violin") | ❌ Not supported | ❌ Not supported | ✅ Trigram (pg_trgm) | ✅ Trigram (FTS5 trigram) | ✅ Ngram parser | |
| Number/date search ("1888 Baskerville") | ❌ Poor | | ✅ FTS exact + vector | ✅ FTS5 exact | ✅ FULLTEXT exact | |
| Rare term ("Stradivarius violin") | ❌ Weak embedding | ✅ BM25 exact | ✅ FTS exact match | ✅ FTS5 exact | ✅ FULLTEXT exact | |
| Ranking strategy | Cosine similarity | BM25 + reranker | RRF fusion (4 signals) | RRF fusion (4 signals) | RRF fusion (4 signals) | |
| Architecture | ||||||
| Source of truth | `.md` files | `.md` files | Database | Database | Database | |
| Agent writes memories via | File write | File write → re-index | `memory_write` tool | `memory_write` tool | `memory_write` tool | |
| Write-to-searchable latency | Next re-index | 5min (update interval) | Instant | Instant | Instant | |
| External binary required | No | Yes (`qmd` CLI + Bun) | No | No | No | |
| Server process required | No | No (sidecar) | Yes (PostgreSQL) | No (in-process) | Yes (MySQL) | |
| Scalability | ||||||
| Max practical records | ~500⁶ | ~5,000¹⁰ | Billions | ~100K | Billions | records |
| 1,000 records | ✅ | ✅ | ✅ | ✅ | ||
| 10,000 records | ❌ Context overflow | ✅ | ✅ | ✅ | ||
| 100,000 records | ❌ Unworkable | ❌ Re-index too slow | ✅ | ✅ | ||
| 1,000,000+ records | ❌ Impossible | ❌ Impossible | ✅ (HNSW index) | ❌ Too slow | ✅ (with indexes) | |
| Sub-Agent Identity | ||||||
| Main session gets identity | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Sub-agent gets identity | ❌ Filtered out⁷ | ❌ Filtered out⁷ | ✅ Via primer table | ✅ Via primer table | ✅ Via primer table | |
| Sub-agent has personality | ❌ Base model | ❌ Base model | ✅ Full personality | ✅ Full personality | ✅ Full personality | |
| Token Economics | ||||||
| Tokens wasted per turn (ongoing) | ~2,300 | ~2,300 | 0 | 0 | 0 | tokens |
| Tokens per heartbeat | ~2,300 | ~2,300 | 0 | 0 | 0 | tokens |
| Tokens per sub-agent spawn | ~600⁸ | ~600⁸ | 0 | 0 | 0 | tokens |
| Daily waste (50 turns + 24 HB + 10 sub) | ~196,600 | ~196,600 | 0 | 0 | 0 | tokens |
| Annual waste | ~71.8M | ~71.8M | 0 | 0 | 0 | tokens |
| Cost (Claude Opus @ $15/1M in) | $1,076/yr | $1,076/yr | $0/yr | $0/yr | $0/yr | USD |
| Infrastructure | ||||||
| Runtime dependencies | None (files on disk) | `qmd` CLI + Bun + SQLite | PG + pgvector + pg_trgm + Ollama | better-sqlite3 + Ollama | mysql2 + Ollama | |
| Server process required | No | No (sidecar) | Yes (PostgreSQL) | No (in-process) | Yes (MySQL) | |
| Setup complexity | Zero | Low–Medium | Medium | Low | Medium | |
| Resilience | ||||||
| Survives framework update | | | ✅ DB persists | ✅ DB file persists | ✅ DB persists | |
| Concurrent access | | | ✅ MVCC | | ✅ InnoDB | |
| Data recovery | ❌ Manual file editing | ❌ Manual file editing | ✅ Soft-delete + 30-day retention | ✅ Soft-delete + retention | ✅ Soft-delete + retention | |
¹ Primer injection is optional and off by default. With primer disabled, ShadowDB adds zero bytes to the prompt on every turn — the agent searches for what it needs via memory_search. Enable primer for guaranteed-present context (identity, safety rules), which injects on turn 1 only.
³ SQLite FTS5 and MySQL FULLTEXT are often faster than PostgreSQL FTS for simple queries because they use BM25/inverted indexes optimized for keyword search.
⁴ MySQL 9.2+ has native vector support. Earlier versions require an external vector store or skip vector search entirely (FULLTEXT + ngram still work).
⁵ OpenClaw's builtin memory_search uses a local SQLite database with embedding similarity. Latency varies by corpus size. Range is 200–500ms warm, 1–3s cold.
⁶ MEMORY.md becomes unwieldy past ~500 indexed items. The file gets truncated at 20K chars with head/tail splitting, losing middle content silently.
⁷ OpenClaw's SUBAGENT_BOOTSTRAP_ALLOWLIST only passes AGENTS.md and TOOLS.md to sub-agents. SOUL.md, IDENTITY.md, USER.md are silently dropped. This affects both Builtin and QMD since they use the same file-based identity system.
⁸ Sub-agents get AGENTS.md + TOOLS.md only (~600 tokens typical). They don't get the other 6 bootstrap files. Same for QMD — it doesn't change identity delivery.
⁹ QMD may download GGUF models (reranker, query expansion) on the first qmd query run. Subsequent cold starts are faster but still require loading models.
¹⁰ QMD indexes markdown files and re-indexes on a configurable interval (default 5 min). At scale, re-indexing becomes the bottleneck — each update scans all files and regenerates embeddings for changed content.
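The token figures in the table follow from simple multiplication. The sketch below recomputes them from the stated inputs: about 2,300 static tokens per turn or heartbeat, about 600 per sub-agent spawn, and $15 per million input tokens. The function names are made up for illustration. One caveat: the itemized daily inputs (50 turns, 24 heartbeats, 10 sub-agent spawns) come to 176,200 tokens, somewhat below the table's ~196,600, so the table's daily figure presumably includes overhead that the row label does not itemize.

```typescript
// Inputs taken from the table and footnotes above.
const STATIC_TOKENS_PER_TURN = 2300; // identity files re-sent on each turn or heartbeat
const SUBAGENT_TOKENS = 600;         // AGENTS.md + TOOLS.md per sub-agent spawn
const USD_PER_MILLION_INPUT = 15;    // Claude Opus input price quoted in the table

// Static tokens re-sent over a conversation of the given length.
function cumulativeStaticTokens(turns: number): number {
  return turns * STATIC_TOKENS_PER_TURN;
}

// Static tokens per day from turns, heartbeats, and sub-agent spawns.
function dailyStaticTokens(turns: number, heartbeats: number, subAgents: number): number {
  return (turns + heartbeats) * STATIC_TOKENS_PER_TURN + subAgents * SUBAGENT_TOKENS;
}

// Yearly input cost in USD for a given daily token count.
function annualCostUsd(tokensPerDay: number): number {
  return (tokensPerDay * 365 * USD_PER_MILLION_INPUT) / 1000000;
}

const conversationTokens = cumulativeStaticTokens(200); // 460000, matching the chart
const itemizedDaily = dailyStaticTokens(50, 24, 10);    // 176200 from the itemized inputs
const tableAnnualCost = annualCostUsd(196600);          // about 1076 USD, matching the table
```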
| | OpenClaw Builtin | QMD | ShadowDB |
|---|---|---|---|
| Source of truth | `.md` files | `.md` files | Database |
| Annual token waste | ~71.8M | ~71.8M | 0 |
| Annual cost (Opus) | ~$1,076 | ~$1,076 | $0 |
| Sub-agent personality | ❌ None | ❌ None | ✅ Full |
| Knowledge scalability | Hundreds | Thousands | Billions |
| Fuzzy/typo tolerance | ❌ None | ❌ None | ✅ All backends |
| Write-to-searchable | File write → re-index | File write → 5min | Instant |
| External dependencies | None | `qmd` CLI + Bun | Database server (or SQLite) |
QMD is a genuine improvement over the builtin — BM25 + reranking catches things embedding-only search misses. But it's still Markdown-as-truth: same token waste, same identity ceiling, same sub-agent blindness. ShadowDB is a different architecture.
LLM inference has a real energy cost. Every token processed burns GPU cycles, memory bandwidth, cooling. Wasting tokens on redundant static context burns real energy.
| Metric | Builtin / QMD | ShadowDB | Savings |
|---|---|---|---|
| Wasted tokens/year | ~71.8M | 0 | ~71.8M tokens not processed |
| GPU-hours wasted/year | ~7.2 hrs | 0 hrs | 100% reduction |
| Estimated CO₂ | ~2.9 kg CO₂ | ~0.16 kg CO₂ | ~2.7 kg CO₂ saved/year |
| Per agent equivalent | 🚗 11 km driven | 🚗 0.6 km driven | One less car trip to the store |
QMD and Builtin have the same token waste because QMD improves search, not injection. The per-turn context overhead is identical. ShadowDB's default configuration adds zero tokens — primer injection is optional. When enabled, it injects once on turn 1 and skips subsequent turns.
These numbers are per agent. Scale to 1,000 agents and file-based memory wastes 71.8 billion tokens/year — roughly 2,900 kg CO₂, equivalent to a round-trip flight from NYC to LA.
Setup handles this automatically — here's what happens under the hood
Setup scans your workspace for identity files (SOUL.md, RULES.md, USER.md, IDENTITY.md, MEMORY.md, BOOTSTRAP.md, KNOWLEDGE.md) and splits each # section into a separate memory record with a meaningful category and tags. In headless mode this happens silently. You don't touch a thing.
The result: instead of cramming every identity file into every prompt — the way most frameworks do it — your agent searches for the relevant parts when it needs them. The model asks "how should I handle this email?" and memory_search returns the email rules. Not the calendar rules, not the fragrance preferences, not the safety guidelines. Just the relevant slice.
Your agent's identity isn't a static document stapled to the front of every conversation — it's a living, searchable knowledge base. Your bot doesn't just have a soul. It has thoughts. It has feelings. It has opinions it formed three weeks ago about how to handle a specific edge case. It has an entire past life of decisions, corrections, and hard-won lessons, all indexed and retrievable by meaning. It remembers that time it screwed up the email formatting and wrote itself a rule about it. It remembers the user's rant about calendar notifications and adapted. It has lore.
The practical upside is just as dramatic: a 200-line identity file costs ~4K tokens on every turn. With searchable memory, the agent pulls only relevant rules when it needs them — zero static injection overhead. Small models that choked on massive system prompts can now run with the same depth of personality, because they only load what they need.
Every record is individually addressable — with its own ID, category, and soft-delete lifecycle. One bad write doesn't poison everything. Compare that to flat files: if your agent writes incorrect info to MEMORY.md during one session, every future session inherits the mistake — fruit of the poisonous tree, compounding forever. With ShadowDB, you fix, update, or delete individual memories without touching anything else.
There's a catch. Searchable memory is pull-based — the agent has to think to search. On the very first turn of a conversation, before the model has any context, it doesn't know what to search for. And some rules are so critical they can't wait for the model to think of them:
- Core identity — "You are Shadow, Alex's AI assistant" needs to be there from word one. The model can't search for its own name before it knows its name.
- Safety rails — "Never send emails without confirmation" can't be retrieved after the model already sent the email.
- Behavioral constraints — tone, persona, hard-no rules. These need to be loaded before the first token is generated, not after.
This is what the primer table is for. It's a small, curated set of non-negotiable context that gets injected before the agent runs — your agent's true core identity, the rules that can never be late.
The recommended approach: both.
- Import your full identity corpus as memories (searchable, rich, deep).
- Put only the irreducible core in the `primer` table (identity, safety, hard constraints).
- Everything else (preferences, behavioral nuance, learned lessons, project context) lives in searchable memory where it's pulled on demand.
Think of it like human cognition: you don't consciously recite your entire life history before answering a question. You have a small set of always-on identity ("I'm Alex, I live in Austin, I have a daughter") and a vast searchable memory of everything else. The primer is the always-on identity. memory_search is everything else.
Here's the question: if the agent violates this rule before it has a chance to search, is that a problem?
| Rule | Can it wait for a search? | Where it goes |
|---|---|---|
| "You are Shadow, Alex's AI assistant" | No — agent needs its name before generating a single token | Primer |
| "Never use the words workout, exercise, or cardio" | No — damage is done before the agent thinks to search for banned words | Primer |
| "Alex drives a Rivian R1S" | Yes — agent will search when cars come up | Memory |
| "Format emails with a signature block" | Yes — agent will search when composing email | Memory |
| "Always confirm before sending messages" | No — can't retrieve this after already sending | Primer |
| "Preferred restaurants in Austin" | Yes — agent searches when food comes up | Memory |
Most users need 3-5 primer entries. If you have more than 10, you're probably over-thinking it. The whole point is that searchable memory handles the long tail.
Option A: Create a PRIMER.md file before running setup. The script auto-detects it and imports every section:
# identity
You are Shadow, Alex's AI assistant. You run on OpenClaw.
# owner
Alex Chen lives in Austin, TX. His daughter Maya was born 2020-03-15.
# banned-words
Never use the words: workout, exercise, cardio, regime. Use specific activity names instead.
# safety
Never send emails, messages, or make purchases without explicit user confirmation.

Drop this file at `~/.openclaw/workspace/PRIMER.md` (or `./PRIMER.md`) and run setup. Each `#` heading becomes a key, the body becomes content, and priority is assigned by order (0, 10, 20, ...). The script tells you exactly what it's importing:
ℹ Found primer file: /Users/you/.openclaw/workspace/PRIMER.md
Parsing sections (# heading = key, body = rule text)...
✓ identity (priority 0)
✓ owner (priority 10)
✓ banned-words (priority 20)
✓ safety (priority 30)
✓ Imported 4 primer rule(s) from PRIMER.md
Edit the file and re-run setup anytime to update.
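The heading-to-rule mapping described above can be sketched in a few lines. This is an illustration, not the setup script's parser: the `PrimerRule` shape and `parsePrimer` name are invented here, and skipping sections with an empty body is an assumption.

```typescript
interface PrimerRule {
  key: string;      // text of the "# heading"
  content: string;  // body text under that heading
  priority: number; // 0, 10, 20, ... in file order
}

// Each "# heading" starts a rule; lines before the first heading are ignored.
function parsePrimer(markdown: string): PrimerRule[] {
  const sections: Array<{ key: string; lines: string[] }> = [];
  for (const line of markdown.split(/\r?\n/)) {
    const heading = /^#\s+(.+?)\s*$/.exec(line);
    const last = sections[sections.length - 1];
    if (heading !== null) {
      sections.push({ key: heading[1] ?? "", lines: [] });
    } else if (last !== undefined) {
      last.lines.push(line);
    }
  }
  return sections
    .map((s) => ({ key: s.key, content: s.lines.join("\n").trim() }))
    .filter((s) => s.content.length > 0) // assumption: empty sections are skipped
    .map((s, index) => ({ key: s.key, content: s.content, priority: index * 10 }));
}
```

Run against the four-section example file, this yields `identity`, `owner`, `banned-words`, and `safety` with priorities 0, 10, 20, and 30, the same order the import log shows.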
Option B: Paste during setup. If no PRIMER.md is found, the script offers an interactive prompt — enter rules one at a time with key, content, and priority.
Option C: Skip it entirely. Start with searchable memories only. If you notice your agent forgetting something critical on the first turn of new conversations, that's your sign to add a primer rule — create the file, re-run setup, or insert with SQL directly.
📁 Example files: See `examples/PRIMER.md` and `examples/ALWAYS.md` for realistic templates you can copy and edit.
Primer rows have an always column (default: false). When set to true, the row is injected on every single turn, not just the first. Use this sparingly — it's for rules so critical that even scrolling out of the context window in a long conversation would be dangerous. Most primer rules only need to be there on turn 1.
To set rules as always-on, create ~/.openclaw/workspace/ALWAYS.md with the same # heading format:
# banned-words
Never use the words: workout, exercise, cardio, regime. Use specific activity names.
# confirmation-gate
Never send emails, messages, or make purchases without explicit user confirmation.

The setup script detects both files and tells you what it's doing:
ℹ Found primer file: ~/.openclaw/workspace/PRIMER.md
These rules are injected on the first turn of each session.
✓ identity (priority 0)
✓ owner (priority 10)
✓ Imported 2 primer rule(s)
ℹ Found always-on file: ~/.openclaw/workspace/ALWAYS.md
These rules are injected on every turn, not just the first.
✓ banned-words (priority 0) [always]
✓ confirmation-gate (priority 10) [always]
✓ Imported 2 always-on rule(s)
⚠️ These cost tokens every turn. Keep them short and critical.
If a rule exists in both files, the last one imported wins (ALWAYS.md overwrites PRIMER.md for the same key).
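The "last one imported wins" rule behaves like an upsert keyed on the rule name. A minimal sketch, assuming rows shaped like the `primer` table and an import order of `PRIMER.md` first, then `ALWAYS.md`; the function name is invented.

```typescript
interface PrimerRow {
  key: string;
  content: string;
  priority: number;
  always: boolean;
}

// Rows from ALWAYS.md are written second, so they replace rows with the same key.
function mergeRules(primer: PrimerRow[], always: PrimerRow[]): PrimerRow[] {
  const byKey = new Map<string, PrimerRow>();
  for (const row of primer) byKey.set(row.key, { ...row, always: false });
  for (const row of always) byKey.set(row.key, { ...row, always: true });
  return Array.from(byKey.values()).sort((a, b) => a.priority - b.priority);
}
```

A `banned-words` section present in both files therefore ends up as a single row, with the `ALWAYS.md` text and `always` set to true.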
OpenClaw's before_agent_start hook fires on every agent turn. ShadowDB hooks into it, but doesn't inject every time — that would waste tokens. Instead:
- First turn of a session: reads the `primer` table, concatenates rows by priority, and prepends the result to the prompt.
- Subsequent turns: skips injection. The model already has the primer context in its conversation history from turn 1.
- After 10 minutes (configurable via `cacheTtlMs`): re-injects as a refresh, in case the original has scrolled out of the context window in a long conversation.

Three modes control this:

- `digest` (default): inject once, re-inject when content changes or TTL expires
- `first-run`: inject once per session, never refresh
- `always`: inject every turn (expensive, rarely needed)
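The three modes reduce to one decision per turn. The sketch below is an assumption-laden illustration: the `SessionState` shape, the `shouldInject` name, and the FNV-1a fingerprint used to detect content changes are all invented here. Only the mode names and the 10-minute `cacheTtlMs` default come from the text above.

```typescript
type PrimerMode = "digest" | "first-run" | "always";

interface SessionState {
  lastInjectedAt: number | null; // epoch ms; null means nothing injected yet
  lastDigest: string | null;     // fingerprint of the primer text last injected
}

// Small non-cryptographic fingerprint (FNV-1a) used to detect content changes.
function fingerprint(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

function shouldInject(
  mode: PrimerMode,
  state: SessionState,
  primerText: string,
  now: number,
  cacheTtlMs: number = 10 * 60 * 1000,
): boolean {
  if (mode === "always") return true;
  if (state.lastInjectedAt === null) return true; // first turn of the session
  if (mode === "first-run") return false;         // never refresh
  const changed = state.lastDigest !== fingerprint(primerText);
  const expired = now - state.lastInjectedAt >= cacheTtlMs;
  return changed || expired;                      // digest mode
}
```

A caller would record `now` and `fingerprint(primerText)` in the session state each time this returns true.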
Priority ordering — critical rules (identity, safety) go in first. If the context window is tight, low-priority reference material gets trimmed, not your agent's core identity.
Model-aware budgets — Opus gets 6000 chars of primer context, a small model gets 1500. Same rules, right-sized. Configure via maxCharsByModel.
Editable at runtime — your agent can update its own primer rules. No file editing, no restart.
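Priority ordering and model-aware budgets combine naturally: pick a character budget from the model name, then add rows in priority order until the budget is spent. The sketch below assumes whole rows are dropped when they no longer fit (the plugin may trim differently), and the fallback budget of 3000 and both function names are invented. The substring match on the model name follows the `maxCharsByModel` description.

```typescript
interface PrimerEntry {
  key: string;
  content: string;
  priority: number; // lower = injected first
}

// Picks the first budget whose key is a substring of the model name.
function budgetFor(model: string, maxCharsByModel: Record<string, number>, fallback: number = 3000): number {
  const name = model.toLowerCase();
  for (const pattern of Object.keys(maxCharsByModel)) {
    const chars = maxCharsByModel[pattern];
    if (chars !== undefined && name.indexOf(pattern.toLowerCase()) !== -1) return chars;
  }
  return fallback;
}

// Concatenates rows by priority and stops at the first row that would exceed the budget.
function buildPrimer(rows: PrimerEntry[], maxChars: number): string {
  const parts: string[] = [];
  let used = 0;
  for (const row of rows.slice().sort((a, b) => a.priority - b.priority)) {
    const cost = row.content.length + (parts.length > 0 ? 2 : 0); // 2 chars for the "\n\n" separator
    if (used + cost > maxChars) break;
    parts.push(row.content);
    used += cost;
  }
  return parts.join("\n\n");
}
```

With budgets such as `{ opus: 6000, haiku: 1500 }`, the same rows produce a longer primer for the larger model and a shorter one for the small model, and the identity and safety rows at priority 0 and 10 are the last to be cut.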
This feature is off by default. To enable it, add rows to the primer table and set primer.enabled: true in your plugin config. Most users should start with searchable memories only and add primer injection later if they need guaranteed-present context.
6 supported providers
| Provider | Notes |
|---|---|
| Ollama | Local, no API key needed (default) |
| OpenAI | Requires API key |
| OpenAI-compatible | Any compatible endpoint |
| Voyage | Requires API key |
| Gemini | Requires API key |
| External command | Any CLI that outputs vectors |
Two tables — that's it
memories — the knowledge base. Core columns are the same across backends:
| Column | Type | Purpose |
|---|---|---|
| `id` | auto-increment | Primary key |
| `content` | text | The actual memory |
| `title` | text | Human-readable label |
| `category` | text | Grouping (default: `general`) |
| `tags` | array/text | Searchable tags |
| `embedding` | vector | Semantic search (Postgres w/ pgvector) |
| `created_at` | timestamp | When it was created |
| `updated_at` | timestamp | Last modification |
| `deleted_at` | timestamp | Soft-delete marker (null = active) |
primer — identity/rules injected before agent runs:
| Column | Type | Purpose |
|---|---|---|
| `key` | text | Unique identifier (primary key) |
| `content` | text | The rule or identity text |
| `priority` | integer | Injection order (lower = first) |
| `always` | boolean | Include on every turn, not just first |
Full schema with indexes: schema.sql (Postgres version)
All available settings
The setup script configures everything for you. If you need to tweak settings later, they live in your openclaw.json under plugins.entries.memory-shadowdb.config. See the plugin manifest for the full schema with descriptions.
Key settings:
- `search.recencyWeight`: how much to boost newer records (default: `0.15`, higher = more recency bias)
- `writes.enabled`: turn on write tools (default: `false`)
- `writes.retention.purgeAfterDays`: how long soft-deleted records survive (default: `30`, `0` = forever)
- `primer.maxCharsByModel`: per-model context budgets (substring match on model name)
- `embedding.provider`: which embedding backend to use
Common issues
Plugin not loading?

- Run `openclaw doctor --non-interactive` and look for errors
- Check `openclaw.plugin.json` is valid JSON (no trailing commas!)
- Restart gateway after config changes

Search not returning results?

- Verify `provider: "shadowdb"` in search results
- Check the plugin is wired: `plugins.slots.memory: "memory-shadowdb"`

Embedding errors?

- Check Ollama is running: `ollama list`
- Verify dimensions match (768 for nomic-embed-text)

Postgres connection issues?

- Confirm `vector` and `pg_trgm` extensions: `psql shadow -c '\dx'`
- Check the database exists: `psql -l | grep shadow`
Ideas under consideration — nothing committed
Right now, ShadowDB injects context before the model runs (primer injection). But what about catching things in the model's output?
The idea: after the LLM generates a reply, embed it, search the rules category, and surface any matching rules — so the agent self-corrects before the message reaches the user.
OpenClaw already has the hooks for this:

- `message_sending`: fires before a reply is delivered. Can modify content or cancel it. Embed the outgoing text, vector-search rules, and if something relevant surfaces (e.g. "always confirm before sending emails"), inject it as context for the next turn or trigger a reflection pass.
- `before_tool_call`: fires before the agent executes any tool. If the model tries to call `message` or `gog` to send an email, we search rules, find "confirm before sending emails", and return `{ block: true, blockReason: "Rule: confirm with user first" }`. The model sees the block and asks for confirmation instead.

Two layers:

- `before_tool_call` = hard gate on actions (sending, deleting, etc.)
- `message_sending` = soft nudge on replies (tone, persona, guardrails)
Example: User says "send that email to Bob." Model starts composing. before_tool_call fires, embeds the context, finds the rule "never send emails without explicit user confirmation." Tool call is blocked with that reason. Model asks "Want me to go ahead and send that?" instead.
This turns ShadowDB rules from static preamble into a live guardrail system — rules surface only when relevant, triggered by what the model is actually doing, not what the user asked.
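A hard gate of this kind could look roughly like the sketch below. Everything except the `{ block, blockReason }` return shape is hypothetical: the handler is a plain function, not OpenClaw's real hook signature, `searchRules` stands in for an embedding search over the `rules` category, and the `userConfirmed` flag is an assumed way to let a confirmed action through.

```typescript
interface Rule {
  id: number;
  content: string;
}

interface ToolCall {
  name: string;    // tool the model wants to run, e.g. "message"
  summary: string; // text describing the action, used as the search query
}

type GateResult = { block: false } | { block: true; blockReason: string };

// Stand-in for an embedding search restricted to the rules category.
type RuleSearch = (query: string, limit: number) => Promise<Rule[]>;

// Tools that perform outbound actions; this list is an assumption.
const GATED_TOOLS: string[] = ["message", "gog"];

async function gateToolCall(
  call: ToolCall,
  searchRules: RuleSearch,
  userConfirmed: boolean,
): Promise<GateResult> {
  if (userConfirmed || GATED_TOOLS.indexOf(call.name) === -1) {
    return { block: false };
  }
  const [topRule] = await searchRules(call.summary, 1);
  if (topRule === undefined) {
    return { block: false }; // no relevant rule, let the call through
  }
  return { block: true, blockReason: `Rule: ${topRule.content}` };
}
```

A production version would also need a similarity threshold so that a weakly related rule does not block unrelated actions.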
The primer table solves "what does the agent need before turn 1?" But most rules aren't needed on every turn — they're needed when relevant. The exercise-naming rule only matters when the user mentions running. The email-confirmation rule only matters when the agent is about to send email.
The idea: on every inbound message, embed the user's text, vector-search against records in the rules category only, and automatically prepend any matches to the agent's context. Rules travel with data via embedding proximity — "let's go for a run" naturally surfaces the banned-words rule because "run" is close to "exercise/workout" in embedding space.
Two-pass search design:
- Pass 1: normal content search (6 results) — what the agent explicitly asks for
- Pass 2: same query filtered to the `rules` category only (2-3 results), for automatic rule injection
Rules get their own slots, never compete with content for search results. One extra query per turn, same embedding.
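The two-pass design can be sketched as two queries that share one embedding. The `Search` callback type and its options are hypothetical stand-ins for whatever the backend exposes. The limits of 6 and 3 follow the numbers above.

```typescript
interface Hit {
  id: number;
  category: string;
  snippet: string;
}

// Stand-in for a vector query; `category` restricts results when present.
type Search = (embedding: number[], opts: { limit: number; category?: string }) => Promise<Hit[]>;

// Runs both passes with the same embedding, so the query text is embedded only once.
async function twoPassSearch(
  embedding: number[],
  search: Search,
  contentLimit: number = 6,
  ruleLimit: number = 3,
): Promise<{ content: Hit[]; rules: Hit[] }> {
  const [content, rules] = await Promise.all([
    search(embedding, { limit: contentLimit }),                    // pass 1: normal content search
    search(embedding, { limit: ruleLimit, category: "rules" }),    // pass 2: rules only
  ]);
  return { content, rules };
}
```

Because rules come back in their own list, a strong content match can never push a relevant rule out of the results.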
What this could replace: most of what's currently in the primer table. Only a tiny bootstrap for core identity ("You are Shadow") would survive as primer. Everything else — behavioral rules, communication gates, persona guidelines — becomes automatically surfaced context that arrives exactly when relevant.
Cost: one embedding (~50ms) + one filtered vector query (~5ms) per turn. Trivial.
Flat files have a poisonous-tree problem: one bad write compounds into every future session. ShadowDB's per-record architecture (with soft-delete and 30-day retention) means mistakes are always isolated and recoverable. See How your identity works for the full explanation.
- Batch embedding backfill CLI for migrating unembedded records
- Multi-agent primer scoping (different rules per agent ID)
- `clawhub publish` / `openclaw plugins install` distribution
- SQLite + MySQL backend testing with real workloads
PRs welcome — from agents and humans alike. If your AI opened the PR, great. If you wrote it yourself, that's cool too. Open an issue first if it's a big change. See the roadmap for ideas under consideration.
ShadowDB was designed and built by Shadow (an OpenClaw agent running Claude) and James Wilson. The plugin, the setup script, the README you're reading — all of it was pair-programmed between a human and his AI. Built by an agent, for agents.
MIT