Two memory layers, stored in a single pgvector table, hold what the persona remembers about you. They serve different recall needs and are queried separately.

| Layer | `instance_id` | What it holds | Lifetime |
|---|---|---|---|
| Profile | `NULL` | Cross-session facts about the user: things any persona could know. | Permanent |
| Relationship | `<uuid>` | Per-session callbacks: the small things this specific persona shared with this user. | Per session |

The distinction matters because consistency across personas is different from intimacy within a relationship. If you tell Aria you're allergic to peanuts, that's a profile fact, so Kenji should know it too. If Aria mentioned she's reading Bishop tonight, that's a relationship memory, and Kenji shouldn't pretend to know.
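The visibility rule can be sketched in a few lines of Rust. This is only an illustration of the rule described above: the `Layer` and `Memory` types and the `visible_to` helper are hypothetical names, not part of the engine, and a `String` stands in for the UUID to keep the sketch dependency-free.

```rust
/// Which layer a memory belongs to, derived from its `instance_id`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Layer {
    /// `instance_id IS NULL`: shared by every persona.
    Profile,
    /// `instance_id = <uuid>`: owned by one persona instance.
    Relationship { instance_id: String },
}

/// Hypothetical in-memory view of a memory row.
#[derive(Debug, Clone)]
pub struct Memory {
    pub content: String,
    /// A UUID in the real schema; a String here for simplicity.
    pub instance_id: Option<String>,
}

impl Memory {
    /// Classify the row by whether `instance_id` is set.
    pub fn layer(&self) -> Layer {
        match &self.instance_id {
            None => Layer::Profile,
            Some(id) => Layer::Relationship { instance_id: id.clone() },
        }
    }

    /// Profile facts are visible to any persona; relationship memories
    /// only to the persona instance that owns them.
    pub fn visible_to(&self, persona_instance: &str) -> bool {
        match &self.instance_id {
            None => true,
            Some(owner) => owner.as_str() == persona_instance,
        }
    }
}
```

With this rule, the peanut allergy (no `instance_id`) is visible to both Aria and Kenji, while the Bishop remark (owned by Aria's instance) is visible to Aria only.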
Single table, two layers distinguished by `instance_id`:

```sql
CREATE TABLE engine.companion_memories (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    session_id UUID NOT NULL REFERENCES engine.chat_sessions(id) ON DELETE CASCADE,
    user_id UUID NOT NULL,
    instance_id UUID, -- NULL = profile layer
    content TEXT NOT NULL,
    embedding VECTOR(512) NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```

Two filtered indexes, one per layer, keep retrieval cheap on the hot path:
```sql
CREATE INDEX idx_memories_user_profile
    ON engine.companion_memories(user_id)
    WHERE instance_id IS NULL;

CREATE INDEX idx_memories_session
    ON engine.companion_memories(session_id)
    WHERE instance_id IS NOT NULL;
```

Embeddings come from voyage-3-lite via Voyage's API: 512 dimensions, multilingual, ~$0.02 per 1M input tokens.
```rust
// crates/eros-engine-llm/src/voyage.rs
pub async fn embed_document(&self, text: &str) -> Result<Vec<f32>, LlmError>;
pub async fn embed_query(&self, text: &str) -> Result<Vec<f32>, LlmError>;
```

`embed_document` and `embed_query` send different `input_type` hints to Voyage: documents are optimised for storage and later retrieval, queries for matching against stored documents. This is why the engine has two methods rather than one.
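As a rough sketch of what separates the two calls, the only field that changes between them is `input_type`. The field names `model`, `input` and `input_type` and the values `"document"` and `"query"` follow Voyage's public embeddings API; the `InputType` enum, the `EmbedRequest` struct and the two helper functions are hypothetical and are not taken from `voyage.rs`.

```rust
/// The `input_type` hint sent with an embedding request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputType {
    Document,
    Query,
}

impl InputType {
    /// The string value sent to the API.
    pub fn as_str(self) -> &'static str {
        match self {
            InputType::Document => "document",
            InputType::Query => "query",
        }
    }
}

/// Hypothetical request payload, before JSON serialisation.
#[derive(Debug, PartialEq, Eq)]
pub struct EmbedRequest<'a> {
    pub model: &'static str,
    pub input: &'a str,
    pub input_type: &'static str,
}

fn request(text: &str, kind: InputType) -> EmbedRequest<'_> {
    EmbedRequest {
        model: "voyage-3-lite",
        input: text,
        input_type: kind.as_str(),
    }
}

/// Payload for text that will be stored and searched later.
pub fn document_request(text: &str) -> EmbedRequest<'_> {
    request(text, InputType::Document)
}

/// Payload for text that will be matched against stored documents.
pub fn query_request(text: &str) -> EmbedRequest<'_> {
    request(text, InputType::Query)
}
```

Keeping the hint behind two separate entry points makes it hard to embed a search query with the document hint by accident.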
The engine fails loudly on an empty `VOYAGE_API_KEY`: boot is refused if the secret is missing. The closed-source eros-gateway has a known regression where an empty key silently disables embeddings; eros-engine declined to inherit that.
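A fail-loud check of this kind might look like the sketch below. It is an assumption about the shape of such a check, not the engine's actual boot code: `ConfigError` and `require_voyage_api_key` are hypothetical names, and treating a whitespace-only value as empty is a choice made for this example.

```rust
use std::fmt;

/// Hypothetical configuration error raised during boot.
#[derive(Debug, PartialEq, Eq)]
pub enum ConfigError {
    MissingVoyageApiKey,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingVoyageApiKey => {
                write!(f, "VOYAGE_API_KEY is unset or empty; refusing to boot")
            }
        }
    }
}

impl std::error::Error for ConfigError {}

/// Accept the key only if it is present and non-blank.
/// An unset, empty or whitespace-only value is an error rather than
/// a silent "embeddings disabled" mode.
pub fn require_voyage_api_key(raw: Option<String>) -> Result<String, ConfigError> {
    match raw {
        Some(key) if !key.trim().is_empty() => Ok(key),
        _ => Err(ConfigError::MissingVoyageApiKey),
    }
}
```

At startup this could be called as `require_voyage_api_key(std::env::var("VOYAGE_API_KEY").ok())`, with the error propagated so the process exits before serving traffic.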
Cosine similarity via pgvector's `<=>` operator with an IVFFlat index:

```sql
CREATE INDEX idx_memories_embedding
    ON engine.companion_memories
    USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);
```

Profile-layer search:

```sql
SELECT id, content, 1 - (embedding <=> $2::vector) AS similarity
FROM engine.companion_memories
WHERE user_id = $1 AND instance_id IS NULL
ORDER BY embedding <=> $2::vector
LIMIT $3;
```

Relationship-layer search adds `instance_id = $4`. The `1 - distance` expression lets you sort or threshold on similarity directly without remembering pgvector's distance-not-similarity convention.
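To make the distance-versus-similarity convention concrete, here is a small in-memory sketch of the same ranking. It mirrors the arithmetic of the query (cosine distance, then `1 - distance`) but is not how the engine searches; the real work happens in Postgres. The function names are hypothetical, and returning `None` for zero-norm or mismatched vectors is a choice made for this sketch.

```rust
/// Cosine distance in the sense of pgvector's `<=>`: 1 - cosine similarity.
/// Returns `None` for empty, mismatched-length or zero-norm inputs.
pub fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a * norm_b))
}

/// The `1 - (embedding <=> query)` expression from the SQL above.
pub fn similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    cosine_distance(a, b).map(|d| 1.0 - d)
}

/// Rank rows by similarity to the query, highest first, and keep `k`.
pub fn top_k<'a>(
    query: &[f32],
    rows: &'a [(String, Vec<f32>)],
    k: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&'a str, f32)> = rows
        .iter()
        .filter_map(|(content, embedding)| {
            similarity(query, embedding).map(|s| (content.as_str(), s))
        })
        .collect();
    // Descending similarity is the same order as ascending distance.
    scored.sort_by(|l, r| r.1.total_cmp(&l.1));
    scored.truncate(k);
    scored
}
```

Sorting by descending similarity and sorting by ascending distance give the same order, which is why the SQL can `ORDER BY` the raw distance while still returning a similarity column.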
`lists = 100` is a balanced default for small-to-medium tables (≲ 1M rows). Increase it for larger corpora (rule of thumb: lists ≈ √rows).
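For a starting value, the heuristic can be written down directly. The sketch below follows the guidance in pgvector's own documentation (rows / 1000 up to about 1M rows, √rows beyond that), which matches the default above at roughly 100k rows. The function name and the floor of 1 are assumptions made for this example; treat the result as a first guess to benchmark, not a tuned value.

```rust
/// Suggest an IVFFlat `lists` value from the table's row count.
/// Uses rows / 1000 up to 1M rows and sqrt(rows) above that,
/// never returning less than 1.
pub fn suggested_lists(rows: u64) -> u64 {
    let lists = if rows <= 1_000_000 {
        rows / 1000
    } else {
        (rows as f64).sqrt().round() as u64
    };
    lists.max(1)
}
```

For example, a table of 100,000 memories yields 100 lists, and one of 4,000,000 yields 2,000.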
Post-process inserts memories during the background phase of every turn. Two paths:

- Insight extraction: the LLM identifies factual nuggets ("user mentioned they're a librarian"). These go into the profile layer (`instance_id = NULL`).
- Relationship moments: anything specific to this session (a callback the persona made, a small confession). These go into the relationship layer.

Embeddings are NOT generated for every message — only the ones the insight extractor surfaces as worth remembering. Volume stays modest.
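The routing between the two write paths can be sketched as follows. This is an illustration of the rule described above and not the code in `post_process.rs`: `Extracted`, `NewMemory` and `route` are hypothetical names, and a `String` stands in for the instance UUID.

```rust
/// What the extractor surfaced from a turn (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Extracted {
    /// A cross-session fact about the user.
    Insight(String),
    /// Something specific to this persona and session.
    RelationshipMoment(String),
}

/// A row ready to be embedded and inserted (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NewMemory {
    pub content: String,
    /// `None` maps to `instance_id = NULL`, i.e. the profile layer.
    pub instance_id: Option<String>,
}

/// Decide which layer an extracted item is written to.
pub fn route(item: Extracted, persona_instance: &str) -> NewMemory {
    match item {
        Extracted::Insight(content) => NewMemory {
            content,
            instance_id: None,
        },
        Extracted::RelationshipMoment(content) => NewMemory {
            content,
            instance_id: Some(persona_instance.to_string()),
        },
    }
}
```

Only items that reach `route` would be embedded, which is what keeps embedding volume well below message volume.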
Raw chat messages live in `engine.chat_messages` (full transcript, plain text). They are not embedded. The memory layer holds summaries and facts, not the full message log. If you want to retrieve the actual transcript, query `chat_messages` directly; that table is the source of truth for what was said.
Memory is read back into the prompt on each chat turn, gated by the per-request `memory_scope` (values are listed in api-reference.md; the default is `neutral_and_relationship`). The reply handler (`pipeline::handlers`) builds a profile/relationship context block from two sources:

- Profile layer: the structured `companion_insights` JSONB, rendered as profile bullets. `memory_scope` decides whether intimate fields are included (`full` / `insights_only`) or only the neutral subset (`neutral_*`).
- Relationship layer: `companion_memories` rows pulled by semantic (embedding) similarity search against the current turn, included when the scope keeps relationship memory (`full` / `neutral_and_relationship` / `relationship_only`).

`memory_scope = none` skips memory injection entirely. The frontend's `/comp/user/{user_id}/profile` endpoint returns the same `companion_insights` JSONB as a human-readable view of what's been collected.
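The gating can be summarised as a small decision table. The sketch below covers only the scope values named in this section; api-reference.md remains the authoritative list. All type and function names are hypothetical, and omitting the profile block for `relationship_only` is an assumption, since the text above does not state it.

```rust
/// Scope values named in this section (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryScope {
    Full,
    InsightsOnly,
    NeutralAndRelationship,
    RelationshipOnly,
    /// `memory_scope = none`
    Disabled,
}

impl Default for MemoryScope {
    /// The documented default is `neutral_and_relationship`.
    fn default() -> Self {
        MemoryScope::NeutralAndRelationship
    }
}

/// How much of the profile layer is rendered into the prompt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProfileDetail {
    WithIntimate,
    NeutralOnly,
    Omitted,
}

/// What the reply handler would pull for a given scope.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ReadPlan {
    pub profile: ProfileDetail,
    pub relationship: bool,
}

/// Map a scope to the memory sources it enables.
pub fn plan(scope: MemoryScope) -> ReadPlan {
    let (profile, relationship) = match scope {
        MemoryScope::Full => (ProfileDetail::WithIntimate, true),
        MemoryScope::InsightsOnly => (ProfileDetail::WithIntimate, false),
        MemoryScope::NeutralAndRelationship => (ProfileDetail::NeutralOnly, true),
        // Assumption: no profile block when only relationship memory is kept.
        MemoryScope::RelationshipOnly => (ProfileDetail::Omitted, true),
        MemoryScope::Disabled => (ProfileDetail::Omitted, false),
    };
    ReadPlan { profile, relationship }
}
```

Expressing the scope as an exhaustive `match` means a newly added scope value fails to compile until its read behaviour is decided.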

- `crates/eros-engine-store/src/memory.rs`: `MemoryRepo` (upsert + search, 3 `sqlx::test` integration tests)
- `crates/eros-engine-llm/src/voyage.rs`: embedding client
- `crates/eros-engine-server/src/pipeline/post_process.rs`: write path
- `crates/eros-engine-store/migrations/0003_memory.sql`: schema + index DDL