Hi team,
We’re integrating a Chat Agent with EverMemos / MemSys and want to confirm whether the approach below matches what you recommend, or whether the project prefers a different pattern (we noticed the demos and ChatSession take somewhat different approaches, so we’d like to follow maintainer intent).
1. Short-term context
Keep only the last n turns (or truncate to a token budget) in the LLM messages as working memory.
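Concretely, we do something like the sketch below (our own code, not from your docs; `count_tokens` is a crude placeholder for a real tokenizer):

```python
def count_tokens(text: str) -> int:
    # Crude stand-in: roughly 4 characters per token. Swap in a real tokenizer.
    return max(1, len(text) // 4)

def truncate_context(messages, max_turns=10, token_budget=4000):
    """Keep the most recent messages that fit both limits.

    `messages` is a list of {"role": ..., "content": ...} dicts, oldest first.
    Any system message is always kept.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    kept, used = [], sum(count_tokens(m["content"]) for m in system)
    # Walk backwards from the newest message; ~2 messages per turn.
    for m in reversed(rest[-max_turns * 2:]):
        cost = count_tokens(m["content"])
        if used + cost > token_budget:
            break
        kept.append(m)
        used += cost
    return system + list(reversed(kept))
```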
2. Long-term memory ingestion
After each turn (or at the end of a turn), send user / assistant messages (and tool / tool_calls where applicable) via POST /api/v1/memories/agent, and call /memories/agent/flush when we need to trigger extraction.
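Our current ingestion flow looks roughly like this (the endpoints are the ones named above; the payload shape — `messages`, `user_id`, `agent_id` — is our assumption and should be checked against the actual API schema):

```python
import requests

BASE_URL = "http://localhost:8000"  # assumed deployment URL

def ingest_turn(messages, user_id, agent_id):
    """Send one turn's messages (user/assistant, plus tool/tool_calls
    where applicable) to long-term memory. Payload shape is our guess."""
    resp = requests.post(
        f"{BASE_URL}/api/v1/memories/agent",
        json={"messages": messages, "user_id": user_id, "agent_id": agent_id},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

def flush_memories(agent_id):
    """Explicitly trigger extraction when needed."""
    resp = requests.post(
        f"{BASE_URL}/api/v1/memories/agent/flush",
        json={"agent_id": agent_id},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()
```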
3. Retrieval (including time range)
When recall is needed, have the agent call search (e.g. as a tool), using POST /api/v1/memories/search and choosing memory_types as appropriate (e.g. episodic_memory, profile, agent_memory, …).
Time constraints belong in retrieval: for natural-language time references (“what we discussed yesterday about X”), we resolve the spoken time window to concrete bounds and apply timestamp range filters (e.g. gte / lt) in filters on the search request, alongside the topical query, rather than handling “time” only at write/ingest time.
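As a sketch of what we mean, here is how we build such a request body (the field names `query`, `memory_types`, and `filters` with `gte`/`lt` on `timestamp` follow the shapes mentioned above, but the exact schema is an assumption we’d like you to confirm):

```python
from datetime import datetime, timedelta, timezone

def yesterday_bounds(now=None):
    """Resolve 'yesterday' to concrete [start, end) bounds in UTC."""
    now = now or datetime.now(timezone.utc)
    start_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return start_today - timedelta(days=1), start_today

def build_search_request(query, memory_types=("episodic_memory",), time_range=None):
    """Body for POST /api/v1/memories/search; schema is our assumption."""
    body = {"query": query, "memory_types": list(memory_types)}
    if time_range:
        gte, lt = time_range
        body["filters"] = {
            "timestamp": {"gte": gte.isoformat(), "lt": lt.isoformat()}
        }
    return body
```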
This differs from ChatSession, which retrieves memories on every turn before building the prompt; we prefer on-demand search via a tool. Is that considered a valid integration style, or do you recommend per-turn automatic RAG instead?
Also: is there any guidance on whether an episodic item’s timestamp always matches “when the conversation happened,” or are there known caveats?
4. Official tool / function definitions
If on-demand search aligns with your direction, do you (or will you) ship an official tool / function definition (e.g. an OpenAI-style tools JSON Schema, or an MCP tool list) that covers memory search only, including fields aligned with POST /api/v1/memories/search such as query, memory_types, and retrieval-side filters / time range? If it already exists, please point us to the doc or code path. If not, are you open to a community PR with a reference implementation?
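For illustration, the kind of reference implementation we have in mind looks like this (an OpenAI-style function tool covering search only; the enum values and field names mirror this post’s assumptions, not a confirmed schema):

```python
# Hypothetical tool definition for memory search; all field names and
# enum values are our assumptions, to be aligned with the real API.
MEMORY_SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "search_memories",
        "description": (
            "Search long-term memory. Use time_range for natural-language "
            "time references resolved to concrete ISO 8601 bounds."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Topical search query.",
                },
                "memory_types": {
                    "type": "array",
                    "items": {
                        "type": "string",
                        "enum": ["episodic_memory", "profile", "agent_memory"],
                    },
                },
                "time_range": {
                    "type": "object",
                    "properties": {
                        "gte": {"type": "string", "format": "date-time"},
                        "lt": {"type": "string", "format": "date-time"},
                    },
                },
            },
            "required": ["query"],
        },
    },
}
```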
If any of the above conflicts with the intended design, or there’s a doc section we should treat as canonical, please point us to it so we can adjust our integration.
Thanks.