What you're trying to do
gbrain extracts entity candidates from the live conversation context and prefetches matching pages into the prompt window without the agent having to ask. Vouch's kb.context is caller-driven — the agent has to know to query. For a long-running session, that means context drift the agent can't self-correct.
Add a server-side reflex: scan the calling tool's recent params (the last N query / task / text strings), extract entity candidates by FTS-matching them against entities/, and attach the top-K matched entities, with their claim counts and top claims, to _meta.vouch_salience on the next read response. Zero LLM calls: pure substring + FTS5.
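As a rough illustration of the matching step, here is a minimal sketch using Python's built-in sqlite3 with FTS5. The table name entity_fts, the (entity_id, name) layout, and both function names are assumptions made for the example, not Vouch's actual schema or API. It counts how many of the recent caller strings hit each entity and returns the top_k by hit count.

```python
import re
import sqlite3
from collections import Counter


def build_index(entities):
    """Create an in-memory FTS5 index over (entity_id, name) pairs.

    Hypothetical schema for illustration; the real entities/ store may differ.
    """
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE VIRTUAL TABLE entity_fts USING fts5(entity_id UNINDEXED, name)"
    )
    db.executemany("INSERT INTO entity_fts VALUES (?, ?)", entities)
    return db


def salient_entities(db, recent_strings, top_k=3):
    """Return up to top_k entity ids, ranked by how many recent strings hit them."""
    hits = Counter()
    for text in recent_strings:
        # Plain tokenisation, no LLM involved.
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        if not tokens:
            continue
        # Quote every token so FTS5 treats it as a literal string, not syntax.
        match = " OR ".join(f'"{tok}"' for tok in sorted(tokens))
        rows = db.execute(
            "SELECT entity_id FROM entity_fts WHERE entity_fts MATCH ?", (match,)
        )
        for (entity_id,) in rows:
            hits[entity_id] += 1
    return [entity_id for entity_id, _ in hits.most_common(top_k)]
```

Counting hits per caller string rather than per token keeps one long query from dominating the ranking; whether that is the right weighting is an open design choice.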
Suggested shape
# Configurable in .vouch/config.yaml
retrieval:
  reflex:
    enabled: true
    window: 8   # last 8 caller queries
    top_k: 3    # surface up to 3 salient entities
- New src/vouch/salience.py keeps a per-session ring buffer of recent caller strings (in-memory, never on disk).
- Every read-side response from a session that has a buffer gets _meta.vouch_salience: [{entity_id, claim_count, top_claim_id}].
- Resets on kb.session_end or after 30 minutes of inactivity.
- Disabled in stateless callers (no session_id) and opt-out via config.
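The buffer behaviour described in the list above could be sketched roughly as follows. The class name SessionBuffers, its method names, and the injectable clock are all hypothetical, chosen only to make the example testable; the window and idle values mirror the numbers in this issue.

```python
import time
from collections import deque


class SessionBuffers:
    """In-memory, per-session ring buffers of recent caller strings (sketch)."""

    def __init__(self, window=8, idle_seconds=30 * 60, clock=time.monotonic):
        self._window = window
        self._idle = idle_seconds
        self._clock = clock
        self._buffers = {}  # session_id -> (deque of strings, last-seen time)

    def record(self, session_id, text):
        """Remember one caller string; stateless callers are ignored."""
        if session_id is None:
            return
        now = self._clock()
        entry = self._buffers.get(session_id)
        if entry is None or now - entry[1] > self._idle:
            buf = deque(maxlen=self._window)  # fresh buffer after inactivity
        else:
            buf = entry[0]
        buf.append(text)  # deque(maxlen=...) drops the oldest item itself
        self._buffers[session_id] = (buf, now)

    def recent(self, session_id):
        """Return buffered strings, or [] if the session is unknown or expired."""
        entry = self._buffers.get(session_id)
        if entry is None or self._clock() - entry[1] > self._idle:
            self._buffers.pop(session_id, None)
            return []
        return list(entry[0])

    def end(self, session_id):
        """Drop the buffer, e.g. when kb.session_end is called."""
        self._buffers.pop(session_id, None)
```

Using deque(maxlen=window) gives the ring-buffer eviction for free, and nothing here touches disk, which matches the in-memory requirement.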
Acceptance
- A session that searches "jwt" three times in a row gets _meta.vouch_salience highlighting the jwt entity on the next call.
- Salience computation runs in <50ms p95 on a 1000-entity KB.
- Disabling via config removes the field entirely.
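To make the last criterion concrete, here is one possible shape for the attach step. The function name attach_salience and the plain-dict response are assumptions for illustration only. The idea is that the field is either present with the salience list or absent altogether, never present as null.

```python
def attach_salience(response, session_id, salience, enabled=True):
    """Return a copy of a read response with _meta.vouch_salience set or removed.

    Hypothetical helper: `response` is a plain dict and `salience` is a list of
    {entity_id, claim_count, top_claim_id} dicts computed elsewhere.
    """
    out = dict(response)
    meta = dict(out.get("_meta", {}))
    meta.pop("vouch_salience", None)  # never leak a stale value
    if enabled and session_id is not None:
        meta["vouch_salience"] = list(salience)
    if meta:
        out["_meta"] = meta
    else:
        out.pop("_meta", None)  # no empty _meta when nothing is attached
    return out
```

Returning a copy rather than mutating the response keeps the reflex easy to bolt onto existing read handlers without side effects.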
Out of scope
- LLM-mediated entity extraction — substring + FTS only in v1.
- Cross-session salience — per-session only.