Description
Summary
When generating a reply, run a filtered vector search on conversation-memory to retrieve relevant older context and include hits in the prompt.
Acceptance Criteria
- Chat flow performs vector search for current user message (and optionally last 1-2 turns)
- Search filters by `conversation_id` and `user_id` (no cross-convo leakage)
- Top-K and min_score configurable (default `top_k=8`, `min_score=0.0`)
- Retrieved snippets de-duplicated against the recent-messages window (don't re-insert the same content)
- Prompt template includes a framed "Retrieved memory" section
- Unit/integration test verifies retrieval + inclusion in prompt (mock vector store)
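The filtered search with configurable `top_k`/`min_score` could look like the minimal sketch below. The `RagConfig` dataclass, the `search_memory` helper, and the vector store's `search` signature are illustrative assumptions, not the project's actual API:

```python
from dataclasses import dataclass


@dataclass
class RagConfig:
    top_k: int = 8          # default per acceptance criteria
    min_score: float = 0.0  # hits scoring below this are dropped


def search_memory(vector_store, query, conversation_id, user_id,
                  config=RagConfig()):
    """Filtered vector search over conversation memory.

    Filters on both conversation_id and user_id to prevent
    cross-conversation leakage, then applies the min_score cutoff.
    """
    hits = vector_store.search(
        query=query,
        filter={"conversation_id": conversation_id, "user_id": user_id},
        top_k=config.top_k,
    )
    return [h for h in hits if h["score"] >= config.min_score]
```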
Implementation Notes
Modify the prompt-building code path in `AIService.chat()` (where `recent_messages = conversation.messages[-10:]`) to:
- Call `rag.search(query=user_message, filter={"conversation_id": id})`
- Dedupe hits against `recent_messages` (by `message_id`)
- Insert retrieved hits in a "Retrieved memory" block before the user message
Keep the last-N message window unchanged (memory is augmentation, not replacement).
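Putting those steps together, a sketch of the augmented prompt assembly. The `build_prompt` helper and the dict shapes for messages and hits are assumptions; the real structures inside `AIService.chat()` may differ:

```python
def build_prompt(recent_messages, retrieved_hits, user_message):
    """Assemble the chat prompt: the last-N message window is kept
    unchanged, and retrieved memory is framed in its own block
    inserted before the user message (augmentation, not replacement).
    """
    recent_ids = {m["message_id"] for m in recent_messages}
    # De-dupe: skip hits already present in the recent-messages window
    memory = [h for h in retrieved_hits if h["message_id"] not in recent_ids]

    parts = [f'{m["role"]}: {m["content"]}' for m in recent_messages]
    if memory:
        snippets = "\n".join(f"- {h['text']}" for h in memory)
        parts.append(f"Retrieved memory (older context from this conversation):\n{snippets}")
    parts.append(f"user: {user_message}")
    return "\n".join(parts)
```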
Files to Change
- `app/services/ai/ai_service.py` - or wherever the chat prompt is built
- `app/services/ai/rag.py` - search helper for conversation memory
- `tests/services/ai/test_chat_rag.py` - integration tests
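For the test file, one way to mock the vector store and verify both the filter and the prompt inclusion. The `chat_with_memory` function here is a self-contained stand-in for the real `AIService.chat()` path, only to show the mocking pattern:

```python
from unittest.mock import MagicMock


def chat_with_memory(store, user_message, conversation_id, user_id):
    """Stand-in for AIService.chat(): search memory, then build the prompt."""
    hits = store.search(
        query=user_message,
        filter={"conversation_id": conversation_id, "user_id": user_id},
        top_k=8,
    )
    memory = "\n".join(h["text"] for h in hits)
    return f"Retrieved memory:\n{memory}\nuser: {user_message}"


def test_search_filter_prevents_cross_convo_leakage():
    store = MagicMock()
    store.search.return_value = [{"text": "earlier fact", "message_id": 1}]

    prompt = chat_with_memory(store, "hi", "c1", "u1")

    # The store must be queried with both scoping filters
    _, kwargs = store.search.call_args
    assert kwargs["filter"] == {"conversation_id": "c1", "user_id": "u1"}
    # And the retrieved snippet must land in the framed prompt block
    assert "Retrieved memory" in prompt
    assert "earlier fact" in prompt
```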
Dependencies
- Requires #419 (Index conversation messages into conversation-memory collection) to be completed first
Estimate
1-2 days dev + testing