Merged
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -7,6 +9,15 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
## [Unreleased]

### Added
- `MessageMetadata` struct in `zeph-llm` with `agent_visible`, `user_visible`, `compacted_at` fields; default is both-visible for backward compat (#M28)
- `Message.metadata` field with `#[serde(default)]` — existing serialized messages deserialize without change
- SQLite migration `013_message_metadata.sql` — adds `agent_visible`, `user_visible`, `compacted_at` columns to `messages` table
- `save_message_with_metadata()` in `SqliteStore` for saving messages with explicit visibility flags
- `load_history_filtered()` in `SqliteStore` — SQL-level filtering by `agent_visible` / `user_visible`
- `replace_conversation()` in `SqliteStore` — atomic compaction: marks originals `user_only`, inserts summary as `agent_only`
- `oldest_message_ids()` in `SqliteStore` — returns N oldest message IDs for a conversation
- `Agent.load_history()` now loads only `agent_visible=true` messages, excluding compacted originals
- `compact_context()` persists compaction atomically via `replace_conversation()`, falling back to legacy summary storage if DB IDs are unavailable
- Multi-session ACP support with configurable `max_sessions` (default 4) and LRU eviction of idle sessions (#781)
- `session_idle_timeout_secs` config for automatic session cleanup (default 30 min) with background reaper task (#781)
- `ZEPH_ACP_MAX_SESSIONS` and `ZEPH_ACP_SESSION_IDLE_TIMEOUT_SECS` env overrides (#781)
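The visibility model introduced by #M28 can be sketched in isolation. The field names follow the changelog entries above; the `agent_only()` and `user_only()` constructors and the timestamp representation are assumptions, not the crate's confirmed API:

```rust
// Sketch of `MessageMetadata` as described in the changelog. Constructor
// names and the `i64` timestamp are illustrative assumptions.
#[derive(Clone, Debug, PartialEq)]
pub struct MessageMetadata {
    pub agent_visible: bool,
    pub user_visible: bool,
    pub compacted_at: Option<i64>, // unix timestamp when compacted, if ever
}

impl Default for MessageMetadata {
    // Backward compat: by default a message is visible to both sides,
    // so existing serialized messages keep their old behavior.
    fn default() -> Self {
        Self { agent_visible: true, user_visible: true, compacted_at: None }
    }
}

impl MessageMetadata {
    // Summary rows: fed to the model, hidden from user transcript views.
    pub fn agent_only() -> Self {
        Self { agent_visible: true, user_visible: false, compacted_at: None }
    }
    // Compacted originals: kept for the user, excluded from the prompt.
    pub fn user_only(compacted_at: i64) -> Self {
        Self { agent_visible: false, user_visible: true, compacted_at: Some(compacted_at) }
    }
}
```

With `#[serde(default)]` on the `metadata` field, a message serialized before this change deserializes to the both-visible default shown here.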
2 changes: 1 addition & 1 deletion README.md
@@ -63,7 +63,7 @@ zeph --tui # run with TUI dashboard
|---|---|
| **Hybrid inference** | Ollama, Claude, OpenAI, Candle (GGUF), any OpenAI-compatible API. Multi-model orchestrator with fallback chains. Response cache with blake3 hashing and TTL |
| **Skills-first architecture** | YAML+Markdown skill files with semantic matching, self-learning evolution, 4-tier trust model, and compact prompt mode for small-context models |
| **Semantic memory** | SQLite + Qdrant (or embedded SQLite vector search) with MMR re-ranking, temporal decay scoring, adaptive chunked compaction, credential scrubbing, cross-session recall, vector retrieval, autosave assistant responses, and snapshot export/import |
| **Semantic memory** | SQLite + Qdrant (or embedded SQLite vector search) with MMR re-ranking, temporal decay scoring, adaptive chunked compaction, durable compaction with message visibility control, credential scrubbing, cross-session recall, vector retrieval, autosave assistant responses, and snapshot export/import |
| **Multi-channel I/O** | CLI, Telegram, Discord, Slack, TUI — all with streaming. Vision and speech-to-text input |
| **Protocols** | MCP client (stdio + HTTP), A2A agent-to-agent communication, ACP server for IDE integration (multi-session, persistence, idle reaper), sub-agent orchestration |
| **Defense-in-depth** | Shell sandbox, tool permissions, secret redaction, SSRF protection, skill trust quarantine, audit logging |
2 changes: 1 addition & 1 deletion crates/zeph-core/README.md
@@ -24,7 +24,7 @@ Core orchestration crate for the Zeph agent. Manages the main agent loop, bootst
| `bootstrap` | `AppBuilder` — fluent builder for application startup |
| `channel` | `Channel` trait defining I/O adapters; `LoopbackChannel` / `LoopbackHandle` for headless daemon I/O (`LoopbackHandle` exposes `cancel_signal: Arc<Notify>` for session cancellation); `Attachment` / `AttachmentKind` for multimodal inputs |
| `config` | TOML config with `ZEPH_*` env overrides; typed `ConfigError` (Io, Parse, Validation, Vault) |
| `context` | LLM context assembly from history, skills, memory; adaptive chunked compaction with parallel summarization; uses shared `Arc<TokenCounter>` for accurate tiktoken-based budget tracking |
| `context` | LLM context assembly from history, skills, memory; adaptive chunked compaction with parallel summarization; visibility-aware history loading (agent-only vs user-visible messages); durable compaction via `replace_conversation()`; uses shared `Arc<TokenCounter>` for accurate tiktoken-based budget tracking |
| `cost` | Token cost tracking and budgeting |
| `daemon` | Background daemon mode with PID file lifecycle (optional feature) |
| `metrics` | Runtime metrics collection |
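The visibility-aware history loading noted for the `context` module reduces to filtering rows on `agent_visible` before they enter the prompt. A minimal in-memory sketch of that filter (the names here are illustrative, not the crate's actual API):

```rust
// Illustrative stand-in for a persisted message row with visibility flags.
struct StoredMessage {
    content: String,
    agent_visible: bool, // row may enter the model's context
    user_visible: bool,  // row may appear in transcript views
}

// Agent history: only rows the model should see; compacted originals
// (agent_visible = false) are skipped but remain available to the user.
fn agent_history(rows: &[StoredMessage]) -> Vec<&StoredMessage> {
    rows.iter().filter(|m| m.agent_visible).collect()
}
```

In the actual implementation this filtering happens at the SQL level via `load_history_filtered()`, so excluded rows are never materialized.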
78 changes: 72 additions & 6 deletions crates/zeph-core/src/agent/context.rs
@@ -3,7 +3,7 @@ use std::fmt::Write;

use futures::StreamExt as _;

use zeph_llm::provider::MessagePart;
use zeph_llm::provider::{MessageMetadata, MessagePart};
use zeph_memory::TokenCounter;
use zeph_skills::ScoredMatch;
use zeph_skills::loader::SkillMeta;
@@ -145,6 +145,7 @@ impl<C: Channel> Agent<C> {
role: Role::User,
content: prompt,
parts: vec![],
metadata: MessageMetadata::default(),
}])
.await
.map_err(Into::into);
@@ -160,6 +161,7 @@ impl<C: Channel> Agent<C> {
role: Role::User,
content: prompt,
parts: vec![],
metadata: MessageMetadata::default(),
}])
.await
}
@@ -185,6 +187,7 @@ impl<C: Channel> Agent<C> {
role: Role::User,
content: prompt,
parts: vec![],
metadata: MessageMetadata::default(),
}])
.await
.map_err(Into::into);
@@ -210,6 +213,7 @@ impl<C: Channel> Agent<C> {
role: Role::User,
content: consolidation_prompt,
parts: vec![],
metadata: MessageMetadata::default(),
}])
.await
.map_err(Into::into)
@@ -231,15 +235,16 @@ impl<C: Channel> Agent<C> {
let summary = self.summarize_messages(to_compact).await?;

let compacted_count = to_compact.len();
let summary_content =
format!("[conversation summary — {compacted_count} messages compacted]\n{summary}");
self.messages.drain(1..compact_end);
self.messages.insert(
1,
Message {
role: Role::System,
content: format!(
"[conversation summary — {compacted_count} messages compacted]\n{summary}"
),
content: summary_content.clone(),
parts: vec![],
metadata: MessageMetadata::agent_only(),
},
);

@@ -256,9 +261,43 @@

if let (Some(memory), Some(cid)) =
(&self.memory_state.memory, self.memory_state.conversation_id)
&& let Err(e) = memory.store_session_summary(cid, &summary).await
{
tracing::warn!("failed to store session summary: {e:#}");
// Persist compaction: mark originals as user_only, insert summary as agent_only.
// Assumption: the system prompt is always the first (oldest) row for this conversation
// in SQLite — i.e., ids[0] corresponds to self.messages[0] (the system prompt).
// This holds for normal sessions but may not hold after cross-session restore if a
// non-system message was persisted first. MVP assumption; document if changed.
// oldest_message_ids returns ascending order; ids[1..=compacted_count] are the messages
// that were drained from self.messages[1..compact_end].
let sqlite = memory.sqlite();
let ids = sqlite
.oldest_message_ids(cid, u32::try_from(compacted_count + 1).unwrap_or(u32::MAX))
.await;
match ids {
Ok(ids) if ids.len() >= 2 => {
// ids[0] is the system prompt; compact ids[1..=compacted_count]
let start = ids[1];
let end = ids[compacted_count.min(ids.len() - 1)];
if let Err(e) = sqlite
.replace_conversation(cid, start..=end, "system", &summary_content)
.await
{
tracing::warn!("failed to persist compaction in sqlite: {e:#}");
}
}
Ok(_) => {
// Not enough messages in DB — fall back to legacy summary storage
if let Err(e) = memory.store_session_summary(cid, &summary).await {
tracing::warn!("failed to store session summary: {e:#}");
}
}
Err(e) => {
tracing::warn!("failed to get message ids for compaction: {e:#}");
if let Err(e) = memory.store_session_summary(cid, &summary).await {
tracing::warn!("failed to store session summary: {e:#}");
}
}
}
}

Ok(())
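The id bookkeeping in the persistence branch above can be isolated into a small pure function: given the ascending ids returned by `oldest_message_ids()` (where `ids[0]` is assumed to be the system prompt, per the comment in the diff), the compacted range is `ids[1..=compacted_count]`, clamped to what the database actually returned. A standalone sketch under that assumption (the function name is illustrative, not part of the crate):

```rust
// Given ascending message ids (ids[0] = system prompt), return the inclusive
// id range covering the `compacted_count` messages drained after the prompt.
// Returns None when fewer than two rows exist, matching the legacy-fallback
// branch in compact_context().
fn compaction_range(ids: &[i64], compacted_count: usize) -> Option<(i64, i64)> {
    if ids.len() < 2 {
        return None;
    }
    let start = ids[1];
    // Clamp so a short DB result (fewer rows than drained messages)
    // still yields a valid, if partial, range.
    let end = ids[compacted_count.min(ids.len() - 1)];
    Some((start, end))
}
```

The clamp mirrors `ids[compacted_count.min(ids.len() - 1)]` in the diff: if the store holds fewer rows than were drained in memory, only the persisted prefix is marked compacted rather than indexing out of bounds.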
@@ -1104,6 +1143,7 @@ mod tests {
role: Role::User,
content: oversized_content.clone(),
parts: vec![],
metadata: MessageMetadata::default(),
}];
let chunks = chunk_messages(&messages, 4096, 2048, &tc);
assert_eq!(chunks.len(), 1);
@@ -1121,16 +1161,19 @@
role: Role::User,
content: half.clone(),
parts: vec![],
metadata: MessageMetadata::default(),
},
Message {
role: Role::User,
content: half.clone(),
parts: vec![],
metadata: MessageMetadata::default(),
},
Message {
role: Role::User,
content: half.clone(),
parts: vec![],
metadata: MessageMetadata::default(),
},
];
// budget = 2000 tokens: first two fit, third overflows → 2 chunks
@@ -1217,6 +1260,7 @@ mod tests {
role: Role::User,
content: format!("message {i} with some content to add tokens"),
parts: vec![],
metadata: MessageMetadata::default(),
});
}
assert!(!agent.should_compact());
@@ -1249,6 +1293,7 @@ mod tests {
role: Role::User,
content: format!("message number {i} with enough content to push over budget"),
parts: vec![],
metadata: MessageMetadata::default(),
});
}
assert!(agent.should_compact());
@@ -1275,6 +1320,7 @@
},
content: format!("message {i}"),
parts: vec![],
metadata: MessageMetadata::default(),
});
}

@@ -1306,11 +1352,13 @@
role: Role::User,
content: "msg1".to_string(),
parts: vec![],
metadata: MessageMetadata::default(),
});
agent.messages.push(Message {
role: Role::Assistant,
content: "msg2".to_string(),
parts: vec![],
metadata: MessageMetadata::default(),
});

let len_before = agent.messages.len();
@@ -1366,6 +1414,7 @@
role: Role::User,
content: format!("message {i}"),
parts: vec![],
metadata: MessageMetadata::default(),
});
}

@@ -1402,6 +1451,7 @@
role: Role::System,
content: format!("{RECALL_PREFIX}old recall data"),
parts: vec![],
metadata: MessageMetadata::default(),
},
);
assert_eq!(agent.messages.len(), 2);
@@ -1439,6 +1489,7 @@
role: Role::User,
content: format!("message {i}"),
parts: vec![],
metadata: MessageMetadata::default(),
});
}
assert_eq!(agent.messages.len(), 11);
@@ -1463,6 +1514,7 @@
role: Role::User,
content: format!("msg {i}"),
parts: vec![],
metadata: MessageMetadata::default(),
});
}

@@ -1486,6 +1538,7 @@
role: Role::User,
content: format!("message {i}"),
parts: vec![],
metadata: MessageMetadata::default(),
});
}
let msg_count = agent.messages.len();
@@ -1594,6 +1647,7 @@
role: Role::User,
content: "hello".into(),
parts: vec![],
metadata: MessageMetadata::default(),
});

agent.inject_summaries(1000).await.unwrap();
@@ -1628,12 +1682,14 @@
role: Role::System,
content: format!("{SUMMARY_PREFIX}old summary data"),
parts: vec![],
metadata: MessageMetadata::default(),
},
);
agent.messages.push(Message {
role: Role::User,
content: "hello".into(),
parts: vec![],
metadata: MessageMetadata::default(),
});
assert_eq!(agent.messages.len(), 3);

@@ -1673,6 +1729,7 @@
role: Role::User,
content: "hello".into(),
parts: vec![],
metadata: MessageMetadata::default(),
});

// Use a very small budget: only the prefix + maybe one short entry
@@ -1706,6 +1763,7 @@
role: Role::System,
content: format!("{SUMMARY_PREFIX}old summary"),
parts: vec![],
metadata: MessageMetadata::default(),
},
);
agent.messages.insert(
@@ -1714,6 +1772,7 @@
role: Role::System,
content: format!("{RECALL_PREFIX}recall data"),
parts: vec![],
metadata: MessageMetadata::default(),
},
);
assert_eq!(agent.messages.len(), 3);
@@ -1807,6 +1866,7 @@
role: Role::User,
content: format!("message {i} with enough content to push over budget threshold"),
parts: vec![],
metadata: MessageMetadata::default(),
});
}

@@ -1924,6 +1984,7 @@
role: Role::User,
content: format!("message {i}"),
parts: vec![],
metadata: MessageMetadata::default(),
});
}

@@ -2086,6 +2147,7 @@
role: Role::User,
content: "recent".into(),
parts: vec![],
metadata: MessageMetadata::default(),
});
}

@@ -2115,6 +2177,7 @@
role: Role::User,
content: "my key is sk-abc123xyz and lives at /Users/dev/config.toml".into(),
parts: vec![],
metadata: MessageMetadata::default(),
});

agent.prepare_context("test").await.unwrap();
@@ -2158,6 +2221,7 @@
role: Role::User,
content: original.clone(),
parts: vec![],
metadata: MessageMetadata::default(),
});

agent.prepare_context("test").await.unwrap();
@@ -2189,6 +2253,7 @@
role: Role::User,
content: format!("message {i} with content"),
parts: vec![],
metadata: MessageMetadata::default(),
});
}

@@ -2216,6 +2281,7 @@
"very long message content {i} repeated many times to fill context"
),
parts: vec![],
metadata: MessageMetadata::default(),
});
}
assert!(
7 changes: 6 additions & 1 deletion crates/zeph-core/src/agent/learning.rs
@@ -1,6 +1,7 @@
use super::{Agent, Channel, LlmProvider};

use super::{LearningConfig, Message, Role, SemanticMemory};
use zeph_llm::provider::MessageMetadata;

use std::path::PathBuf;

@@ -81,6 +82,7 @@ impl<C: Channel> Agent<C> {
role: Role::User,
content: prompt,
parts: vec![],
metadata: MessageMetadata::default(),
});

let messages_before = self.messages.len();
@@ -110,7 +112,7 @@ impl<C: Channel> Agent<C> {
Ok(retry_succeeded)
}

#[allow(clippy::cast_precision_loss)]
#[allow(clippy::cast_precision_loss, clippy::too_many_lines)]
pub(super) async fn generate_improved_skill(
&self,
skill_name: &str,
@@ -168,6 +170,7 @@ impl<C: Channel> Agent<C> {
role: Role::User,
content: eval_prompt,
parts: vec![],
metadata: MessageMetadata::default(),
}];
match self
.provider
@@ -294,11 +297,13 @@ impl<C: Channel> Agent<C> {
"You are a skill improvement assistant. Output only the improved skill body."
.into(),
parts: vec![],
metadata: MessageMetadata::default(),
},
Message {
role: Role::User,
content: prompt,
parts: vec![],
metadata: MessageMetadata::default(),
},
];
