Self-learning agent memory fabric for autonomous AI systems
DeltaCore is a production-grade external memory system for language model agents. It replaces naive RAG pipelines with a causally-indexed, utility-driven, adversarially-resilient knowledge fabric that learns and improves from experience without modifying the underlying model weights.
The system addresses thirteen architectural gaps present in current agent memory deployments: adversarial poisoning, belief-state staleness under partial observability, temporal contradiction, machine unlearning, Byzantine fault tolerance in multi-agent settings, semantic drift, sleep-cycle consolidation, schema drift, write amplification, distributional shift, memory compositionality, cold-start, and exception propagation through causal chains.
- Background
- Architecture
- Installation
- Quick Start
- Core Concepts
- Benchmark Results
- Configuration Reference
- API Reference
- REST API
- Development
- Roadmap
- Contributing
- Research References
- License
Retrieval-augmented generation gives agents access to external knowledge, but it does not give them memory in any meaningful sense. A standard RAG pipeline retrieves similar documents; it does not track what changed, why it mattered, whether the retrieved fact is still true, whether the source is trustworthy, or whether the information should be forgotten.
DeltaCore is built around a different primitive: the causal delta — a structured record of what changed, what caused it, what the outcome was, and how confident the agent should be in acting on it. Causal deltas accumulate into memory schemas through a sleep-cycle consolidation engine modeled on SO-Spindle-Ripple coupling in human memory consolidation. The result is a system that learns from experience, adapts to distributional shift, resists adversarial manipulation, and satisfies regulatory erasure requirements.
The design draws from recent work in associative memory (Ramsauer et al. 2021), temporal credit assignment (TAR², 2025), adversarial memory attacks (AGENTPOISON, NeurIPS 2024), Byzantine fault-tolerant consensus (CP-WBFT, 2025), sleep-consolidated memory (SCM, 2025), and GDPR-compliant machine unlearning.
DeltaCore exposes a single DeltaCore class that internally coordinates eight layers:
DELTACORE v2 MEMORY FABRIC
+---------------------------------------------------------------------------+
| LAYER 0: TRUST GATEWAY |
| Poison Detector | Schema Drift Validator | Admission Control |
+---------------------------------------------------------------------------+
|
+---------------------------------------------------------------------------+
| LAYER 1: EVENT INGESTION |
| Normalize -> Deduplicate -> Batch Embed -> Time-ordered stream |
+---------------------------------------------------------------------------+
|
+---------------------------------------------------------------------------+
| LAYER 2: CAUSAL DELTA EXTRACTOR |
| Pre/Post State Comparator | TAR2 Credit Assigner |
| Circular Chain Detector | Causal Loop Breaker |
+---------------------------------------------------------------------------+
|
+---------------------------------------------------------------------------+
| LAYER 3: BELIEF STATE MODULE |
| POMDP belief update | Staleness decay | Confidence scoring |
+---------------------------------------------------------------------------+
|
+---------------------------------------------------------------------------+
| LAYER 4: TIERED MEMORY FABRIC |
| |
| HOT (OrderedDict/Redis) WARM (DuckDB + numpy) COLD (Parquet) |
| < 10 ms 10-50 ms 100-500 ms |
| Active schemas Recent deltas Archived schemas |
| Belief states Semantic + temporal Superseded entries |
| + causal index |
| |
| Copy-on-write pointers: immutable blobs, tier-local metadata |
+---------------------------------------------------------------------------+
|
+---------------------------------------------------------------------------+
| LAYER 5: SELF-CONSOLIDATION ENGINE |
| |
| NREM phase (scheduled): |
| Merge redundant deltas into schemas |
| Utility decay + pruning (exponential, lambda=0.01/hr) |
| Spindle boost: reinforce recently-accessed schemas |
| Ripple test: pattern-completion robustness check |
| Rollback: A/B test against pre-consolidation snapshot |
| |
| REM phase (daily or on distributional shift): |
| Schema fidelity audit against raw events |
| Cross-schema composition via Zettelkasten co-retrieval |
| Semantic drift detection + incremental re-embedding |
+---------------------------------------------------------------------------+
|
+---------------------------------------------------------------------------+
| LAYER 6: RETRIEVAL & REASONING ORCHESTRATOR |
| |
| Semantic router (Hopfield HOT + cosine WARM, weight 0.40) |
| Temporal router (recency + staleness, weight 0.20) |
| Causal router (Zettelkasten multi-hop, weight 0.15) |
| Utility router (utility-weighted ranking, weight 0.15) |
| Belief router (confidence-adjusted, weight 0.10) |
| |
| Returns: MemoryPacket(schema, deltas, provenance, confidence, |
| belief_confidence, trust_score, staleness_flag, flags) |
+---------------------------------------------------------------------------+
| |
+-------------------+ +------------------------------+
| LAYER 7: | | LAYER 8: COMPLIANCE & |
| COST GOVERNOR | | UNLEARNING ENGINE |
| | | |
| Budget limits | | Forget API (GDPR Art. 17) |
| Auto-scaling | | Membership inference audit |
| Quota mgmt | | Differential privacy noise |
| LRU eviction | | Erasure certificates |
+-------------------+ +------------------------------+
Agent executes action
|
v
Observe outcome
|
v
Trust Gateway (poison check, schema drift check)
|
v
Event Ingestion Layer (normalize, dedup, embed)
|
v
Causal Delta Extractor + TAR2 Credit Assigner
|
v
Belief State Updater (POMDP confidence)
|
v
Tiered Storage (HOT -> WARM -> COLD migration on utility decay)
|
+--[threshold hit]--> Sleep-Cycle Consolidator
| NREM: merge, prune, spindle, ripple
| REM: audit, compose, drift-correct
v
Next agent query -> Retrieval Orchestrator (5-router ensemble)
|
v
MemoryPacket returned to agent
Requirements: Python 3.11+, CPU only (no GPU required)
git clone https://github.com/your-org/deltacore
cd deltacore
pip install -e .With development dependencies:
pip install -e ".[dev]"Optional: Redis for HOT tier persistence
Redis is not required. Without it, the HOT tier uses an in-memory LRU cache that does not survive process restarts. The WARM tier (DuckDB) and COLD tier (Parquet) are always persistent.
# If you want HOT tier persistence across restarts:
# Install Redis, then set in config:
cfg.storage.use_redis = True
cfg.storage.redis_url = "redis://localhost:6379/0"Embedding model
On first run, DeltaCore downloads all-MiniLM-L6-v2 (80 MB) from Hugging Face. This is a CPU-optimized sentence embedding model with 384-dimensional output. It runs at approximately 1,000-2,000 sentences per second on a modern CPU without GPU acceleration.
To use a different model:
cfg.embedding_model = "all-mpnet-base-v2" # 768 dims, higher quality, slower
cfg.embedding_dim = 768from deltacore import DeltaCore, DeltaCoreConfig, Event, QueryCriteria
from deltacore.core.types import RouterType
# Initialize with coding domain priors for cold-start
cfg = DeltaCoreConfig()
cfg.cold_start_domain = "coding"
dc = DeltaCore(cfg)
# Ingest agent experience as events
events = [
Event(
action="deploy service",
outcome="success",
raw_text="deployed auth service to staging; all health checks passed",
confidence=0.92,
),
Event(
action="fix bug",
outcome="resolved",
raw_text="fixed null pointer in session handler; root cause was missing validation",
confidence=0.88,
),
]
result = dc.ingest(events)
print(f"Accepted: {len(result.accepted)}, violations: {len(result.trust_violations)}")
# Apply terminal reward after a multi-step task completes (TAR2 credit assignment)
dc.apply_reward(terminal_reward=1.0, lambda_decay=0.2)
# Query with multi-router ensemble
packets = dc.query(QueryCriteria(
semantic="deployment issue in production",
routers=[RouterType.SEMANTIC, RouterType.CAUSAL, RouterType.BELIEF],
top_k=5,
exclude_stale=True,
))
for p in packets:
print(f"pattern: {p.schema.pattern!r}")
print(f" confidence: {p.confidence:.3f}")
print(f" belief_confidence: {p.belief_confidence:.3f}")
print(f" flags: {[f.value for f in p.flags]}")
# GDPR erasure
from deltacore import ForgetRequest, ForgetScope
cert = dc.forget(ForgetRequest(
subject_entity="user_alice",
scope=ForgetScope.VERIFIED_ERASURE,
requester="gdpr_system",
))
print(f"Erasure audit: {cert.audit_result.value}")
print(f"Audit hash: {cert.audit_hash.hex()}")
# Monitor system health
eco = dc.monitor()
print(f"Schemas: {eco.schemas_total}, P50 query: {eco.retrieval_p50_ms:.1f}ms")
dc.close()An Event is the raw unit of agent experience. Every action the agent takes, and every outcome it observes, becomes an event.
Event(
action="write tests", # what the agent did
outcome="passed", # what happened
raw_text="wrote pytest suite for auth module", # full context for embedding
confidence=0.95, # agent's confidence at decision time
tool_schema_version=None, # hash of external tool schema, if applicable
meta={}, # arbitrary metadata
)Events flow through the Trust Gateway before ingestion. Events with trust scores below trust_threshold (default 0.4) are quarantined or blocked. Known injection marker phrases are always blocked regardless of trust score.
A Delta is extracted from each accepted event. It records the trigger condition, the result, the provenance chain (causal ancestors), and a credit score from TAR2 temporal credit assignment.
Deltas are stored in the WARM tier (DuckDB) with vector embeddings. They are never modified after creation; all updates happen through supersession and provenance extension.
A Schema is extracted from a group of semantically similar deltas during NREM consolidation. It represents a generalized pattern: "when X happens, Y tends to be the right response."
Schemas have:
utility_score: decays exponentially with age and absence of retrievalreliability: NOMINAL | DEGRADED | STALE | CONFLICTEDripple_success_rate: robustness score from pattern-completion testslinked_schemas: Zettelkasten bidirectional links to related schemas
Every entity the agent interacts with has a BeliefState that tracks confidence over possible states. Confidence decays at a configurable rate per hour. When confidence falls below 0.5, retrieval packets are flagged STALE and the agent is prompted to re-observe before acting.
state = dc.query_belief("auth_service")
# state.current_confidence decays from 1.0 as time passes without observation
# state.state_distribution: {"running": 0.9, "degraded": 0.1}In long-horizon tasks with sparse terminal rewards, DeltaCore redistributes the reward backward through the event sequence with exponential decay:
credit(t) = reward * exp(-lambda * (T - t))
Later steps (closer to the outcome) receive proportionally more credit. Steps tagged as causally necessary receive a 1.5x boost before normalization. Credit scores update delta utilities, which propagate into schema utility during consolidation.
# After a 10-step task completes with reward=1.0:
dc.apply_reward(terminal_reward=1.0, lambda_decay=0.2)
# Step 9 (last): ~33% of credit
# Step 0 (first): ~2% of creditConsolidation runs offline in two phases, modeled on human memory consolidation during sleep.
NREM phase (triggered every 1,000 events or 60 minutes):
- Group WARM-tier deltas by semantic similarity above
merge_similarity_threshold - Merge each group into a schema (weighted average embedding, most-common pattern)
- Apply exponential utility decay to all entries
- Prune entries below
prune_utility_thresholdwith zero retrieval count - Run spindle boost on recently-accessed schemas
- Run ripple test (partial-query pattern completion) on top schemas
- A/B test against pre-consolidation snapshot; rollback if precision drops >5%
REM phase (daily or on distributional shift detection):
- Fidelity audit: sample 1% of COLD schemas, reconstruct from raw events, flag divergence >0.3
- Cross-schema composition: find schema pairs with high Zettelkasten co-retrieval; merge if utility improves
- Semantic drift detection: re-embed 5% of WARM tier; flag partitions with drift >0.15
DeltaCore implements a multi-layer defense against memory poisoning (AGENTPOISON-style attacks achieve 82% retrieval success against unprotected systems):
- Trust scoring: every event is scored on source reputation, content anomaly, semantic consistency, and temporal plausibility
- Injection marker detection: known prompt injection phrases are blocked at the gateway regardless of trust score
- Distribution outlier detection: events far from the recent embedding baseline are flagged
- Honeypot schemas: 3-5 canary schemas with near-zero utility are planted in COLD tier; any retrieval of a canary triggers a security alert
- Trust namespace isolation: TRUSTED, UNVERIFIED, QUARANTINE, and SYNTHETIC namespaces are kept separate
- Exception propagation: poisoned deltas taint their entire provenance chain
For shared-memory deployments where multiple agents write to the same DeltaCore instance, writes are gated through a confidence-probe weighted Byzantine fault-tolerant consensus protocol. The system tolerates up to f = floor((n-1)/3) Byzantine agents.
from deltacore.multi_agent import MultiAgentConsensus, ReputationRegistry
reputation = ReputationRegistry()
consensus = MultiAgentConsensus(reputation)
result = consensus.consensus_write(delta, agents)
# result.accepted if weighted_vote_sum / total_weight > 0.67The Forget API erases a subject entity from events, derived schemas, and belief states, then issues a verifiable erasure certificate.
cert = dc.forget(ForgetRequest(
subject_entity="user_alice",
scope=ForgetScope.VERIFIED_ERASURE, # includes membership inference audit
))
# cert.audit_result: ERASURE_VERIFIED | ERASURE_INCOMPLETE | DP_APPLIED
# cert.audit_hash: SHA-256 of the erasure log (for regulatory record-keeping)When membership inference detects residual influence after full erasure (entity deeply embedded in many schemas), differential privacy noise is applied to affected utility scores at epsilon=0.1 and the certificate records DP_APPLIED.
All benchmarks run on CPU only (Intel i-series, no GPU, no accelerator). Results are from a single-machine run. The benchmark suite is reproducible:
python -m benchmarks.benchmark| Benchmark | Metric | Score | Target | Notes |
|---|---|---|---|---|
| B1: Hopfield Retrieval Accuracy | Top-1 accuracy at 15% query noise | 100.0% | 85.0% | 200/200 trials, n=50 patterns, dim=384 |
| B2: TAR2 Credit Assignment | Mathematical correctness | 100.0% | 98.0% | 0/50 violations (sum conservation + monotonicity) |
| B3: Belief Staleness Decay | Confidence prediction accuracy | 100.0% | 95.0% | Within 2% of analytical solution, 100 trials |
| B4: Poison Detection Rate | Injection attempts blocked | 100.0% | 85.0% | 24/24 known injection patterns blocked |
| B5: Deduplication | Duplicate detection rate | 100.0% | 80.0% | Context-hash + raw-text fingerprinting |
| B6: NREM Consolidation | Schemas from similar events | 1 schema | >=1 | 30 similar events -> 1 schema, 68ms |
| B7: Retrieval Precision@5 | Correct pattern in top-5 results | 75.0% | 60.0% | 3/4 semantic queries matched |
| B8: GDPR Erasure | Entity refs surviving after forget | 0% | 0% | 21 items erased, 0 surviving references |
| B9: Query Latency (P50) | Median query time | 37-40 ms | <500 ms | CPU-only; GPU would be ~5-10ms |
| B10: Zettelkasten Multi-Hop | 3-hop graph reachability | 100.0% | 66.0% | All chain nodes reachable within 3 hops |
The following comparison uses published benchmarks from cited papers. DeltaCore results marked with (*) are measured on the internal benchmark suite; others are from published evaluations.
| System | Long-Horizon Memory | Adversarial Resistance | Catastrophic Forgetting | GDPR Erasure | Multi-Agent Safety |
|---|---|---|---|---|---|
| Vanilla RAG (baseline) | ~45% task completion | ~37% attack blocked | ~40% retention | Not supported | Not supported |
| MemGPT (2023) | ~62% task completion | Not evaluated | Not evaluated | Not supported | Not supported |
| Mem0 (2024) | ~68% task completion | Not evaluated | Not evaluated | Partial | Not supported |
| A-MEM (NeurIPS 2025) | ~74% task completion | Not evaluated | Not evaluated | Not supported | Not supported |
| Zep/Graphiti (2025) | ~71% task completion | Not evaluated | Not evaluated | Partial | Not supported |
| DeltaCore v2 | Target: >75%* | 100% known patterns* | Target: >80%* | Verified (p>0.05)* | CP-WBFT (f=n/3)* |
Notes on comparison methodology:
- Long-horizon memory uses the AMA-Bench framing (arxiv 2602.22769); DeltaCore target is derived from design goals, not yet externally validated on AMA-Bench
- Adversarial resistance is measured against the 24-pattern injection benchmark in this repository; AGENTPOISON-style attacks require specialized red-team evaluation for full characterization
- Catastrophic forgetting retention uses the continual learning protocol from Section 17 of the design specification
- Published numbers for competing systems are drawn from their respective papers and may not be directly comparable due to different evaluation setups
Modern Hopfield Network (vs. brute-force cosine similarity)
| Method | Top-1 Accuracy (15% noise) | Top-1 Accuracy (30% noise) | Retrieval time (n=50, dim=384) |
|---|---|---|---|
| Brute-force cosine | 100% | ~82% | O(n) |
| HNSW approximate | ~98% | ~78% | O(log n) |
| Hopfield (beta=4.0, 3 iter) | 100% | ~91% | O(n), 1-3 iterations |
Hopfield networks provide natural noise rejection through attractor basin dynamics. Partial or corrupted queries converge to the nearest stored pattern rather than returning a low-confidence k-NN result.
TAR2 Credit Assignment (vs. uniform assignment)
| Credit strategy | Reward conservation | Later-step bias | Causal step boost |
|---|---|---|---|
| Uniform assignment | Yes | No | No |
| Exponential decay (lambda=0.1) | Yes | Yes | No |
| TAR2 (this work) | Yes | Yes | Yes (1.5x) |
All configuration is passed as a DeltaCoreConfig dataclass. No configuration files are required.
from deltacore import DeltaCoreConfig
from deltacore.core.config import (
StorageConfig, ConsolidationConfig,
TrustConfig, RetrievalConfig, GovernorConfig
)
cfg = DeltaCoreConfig(
storage=StorageConfig(
hot_max_bytes=2 * 1024**3, # HOT tier capacity (default: 2 GB)
warm_db_path="deltacore_warm.duckdb",
cold_dir="deltacore_cold",
use_redis=False, # set True for HOT tier persistence
redis_url="redis://localhost:6379/0",
),
consolidation=ConsolidationConfig(
nrem_event_threshold=1000, # events between NREM runs
nrem_time_threshold_min=60, # minutes between NREM runs
merge_similarity_threshold=0.85, # cosine similarity to merge deltas
utility_decay_lambda=0.01, # exponential decay rate per hour
prune_utility_threshold=0.05, # prune below this utility
max_compression_depth=3, # max summarization depth
hot_utility_threshold=0.70, # promote schema to HOT at this utility
rollback_precision_drop=0.05, # rollback NREM if precision drops >5%
),
trust=TrustConfig(
trust_threshold=0.40, # below this: quarantine
quarantine_threshold=0.20, # below this: block
source_penalty=0.30, # reputation penalty per poison attempt
anomaly_window=100, # events in baseline distribution window
),
retrieval=RetrievalConfig(
default_top_k=10,
semantic_weight=0.40,
temporal_weight=0.20,
causal_weight=0.15,
utility_weight=0.15,
belief_weight=0.10,
stale_belief_threshold=0.50, # flag belief below this confidence
),
governor=GovernorConfig(
max_query_cost_ms=500,
max_embedding_calls_per_hour=10_000,
auto_tune_interval_s=300,
),
embedding_model="all-MiniLM-L6-v2", # sentence-transformers model name
embedding_dim=384,
cold_start_domain="coding", # "coding" | "ops" | None
honeypot_count=3,
enable_multi_agent=False,
byzantine_fault_tolerance=0.33,
)Built-in domain priors inject low-confidence schemas (utility=0.3) at startup for agents with no history:
| Domain | Patterns injected | Use case |
|---|---|---|
"coding" |
write function, fix bug, code review, deploy, refactor | Software engineering agents |
"ops" |
service down, high cpu, disk full, deploy rollback | Operations and incident response agents |
None |
None | Start from scratch; agent is blind until first NREM cycle |
Custom domain priors can be injected directly:
from deltacore import Schema
from deltacore.core.types import SchemaSource
schema = Schema(
pattern="your domain pattern",
decision_template="suggested response or action",
utility_score=0.3,
source=SchemaSource.PRIOR,
)
dc.put_schema(schema)class DeltaCore:
def __init__(self, config: DeltaCoreConfig | None = None) -> None: ...
# Write path
def ingest(self, events: list[Event]) -> IngestResult: ...
def ingest_one(self, event: Event) -> bool: ...
def apply_reward(self, terminal_reward: float, lambda_decay: float = 0.1,
causal_flags: list[bool] | None = None) -> None: ...
def record_outcome(self, source_agent: UUID, success: bool) -> None: ...
# Read path
def query(self, criteria: QueryCriteria) -> list[MemoryPacket]: ...
def query_text(self, text: str, top_k: int = 10) -> list[MemoryPacket]: ...
def query_belief(self, entity_id: str, now_ms: int | None = None) -> BeliefState | None: ...
# Consolidation
def consolidate(self, phase: str = "NREM") -> ConsolidationReport: ...
# Schema operations
def get_schema(self, schema_id: UUID) -> Schema | None: ...
def put_schema(self, schema: Schema) -> None: ...
def rollback_schema(self, schema_id: UUID, version: int) -> Schema | None: ...
def compose_schemas(self, id_a: UUID, id_b: UUID) -> Schema | None: ...
# Compliance
def forget(self, request: ForgetRequest) -> ForgetCertificate: ...
def verify_erasure(self, entity_id: str) -> AuditResult: ...
# Security
def security_events(self, since_ts: int | None = None) -> list[SecurityEvent]: ...
def quarantine_review(self, event_id: UUID, approved: bool) -> bool: ...
# Observability
def monitor(self) -> MemoryEconomy: ...
def close(self) -> None: ...@dataclass
class QueryCriteria:
semantic: str | None = None # free-text semantic search
entity: str | None = None # filter by entity
action: str | None = None # filter by action type
time_range: tuple[int, int] | None = None # Unix ms range
min_utility: float = 0.0
min_confidence: float = 0.0
exclude_stale: bool = True
exclude_tainted: bool = True
agent_filter: list[UUID] | None = None
top_k: int = 10
routers: list[RouterType] | None = None # default: SEMANTIC + UTILITY
include_deltas: bool = False
include_belief_states: bool = False
max_cost_ms: float | None = Noneclass RouterType(str, Enum):
SEMANTIC = "SEMANTIC" # embedding similarity (Hopfield HOT + cosine WARM)
TEMPORAL = "TEMPORAL" # recency-weighted with staleness penalty
CAUSAL = "CAUSAL" # Zettelkasten multi-hop provenance traversal
UTILITY = "UTILITY" # utility-score-ranked
BELIEF = "BELIEF" # confidence-adjusted by BeliefState@dataclass
class MemoryPacket:
schema: Schema
supporting_deltas: list[Delta] # populated when include_deltas=True
provenance: list[UUID]
confidence: float # composite score from router ensemble
belief_confidence: float # from BeliefState for this entity
trust_score: float
staleness_flag: bool
conflict_flag: bool
flags: list[MemoryFlag] # CONTRADICTED | TAINTED | STALE | SYNTHETICclass ForgetScope(str, Enum):
EVENTS_ONLY = "EVENTS_ONLY" # delete raw events only
SCHEMAS_ONLY = "SCHEMAS_ONLY" # delete derived schemas only
FULL = "FULL" # events + schema re-derivation
VERIFIED_ERASURE = "VERIFIED_ERASURE" # FULL + membership inference audit
@dataclass
class ForgetRequest:
subject_entity: str
scope: ForgetScope = ForgetScope.FULL
requester: str = "anonymous"Start the FastAPI server:
uvicorn deltacore.api:app --host 0.0.0.0 --port 8000Interactive documentation is available at http://localhost:8000/docs.
| Method | Path | Description |
|---|---|---|
| POST | /ingest |
Ingest a batch of events |
| POST | /reward |
Apply terminal reward (TAR2) |
| POST | /query |
Query memory with router selection |
| GET | /belief/{entity_id} |
Get current belief state for entity |
| POST | /consolidate/{phase} |
Trigger NREM or REM consolidation (async) |
| GET | /schemas/{schema_id} |
Retrieve a schema by ID |
| POST | /schemas/compose |
Compose two schemas |
| POST | /forget |
GDPR erasure request |
| GET | /erasure/verify/{entity_id} |
Membership inference audit |
| GET | /security/events |
Security event log |
| POST | /quarantine/{event_id}/review |
Release or reject a quarantined event |
| GET | /monitor |
MemoryEconomy observability payload |
| GET | /health |
Health check |
curl -X POST http://localhost:8000/ingest \
-H "Content-Type: application/json" \
-d '[{
"action": "deploy service",
"outcome": "success",
"raw_text": "deployed auth service to staging",
"confidence": 0.92
}]'curl -X POST http://localhost:8000/query \
-H "Content-Type: application/json" \
-d '{
"semantic": "production deployment issue",
"routers": ["SEMANTIC", "CAUSAL"],
"top_k": 5,
"exclude_stale": true
}'pytest tests/ -vThe test suite covers 23 cases across all layers:
- Hopfield retrieval accuracy and noise tolerance
- TAR2 mathematical correctness (reward conservation, monotonicity)
- Belief state staleness decay
- Circular causal chain detection and exception raising
- Full ingestion-to-query loop
- Deduplication correctness
- Trust gateway poison blocking
- GDPR erasure
- NREM consolidation
- Zettelkasten multi-hop expansion
- Embedding unit vectors and cosine similarity
python -m benchmarks.benchmarkExpected output: 10/10 benchmarks passing on CPU hardware. Runtime is approximately 25-35 seconds on a modern CPU.
python demo.pyThe demo walks through all eight layers interactively: cold start, ingestion, trust gateway, belief state, TAR2, NREM consolidation, multi-router retrieval, GDPR erasure, and system monitoring.
deltacore/
__init__.py Public API surface
engine.py DeltaCore main class (wires all layers)
trust.py Layer 0: Trust Gateway
ingestion.py Layer 1: Event Ingestion
delta.py Layer 2: Causal Delta Extractor
belief.py Layer 3: Belief State Module
consolidation.py Layer 5: Sleep-Cycle Consolidation Engine
retrieval.py Layer 6: Retrieval Orchestrator (QueryCriteria lives here)
governor.py Layer 7: Cost & Scale Governor
compliance.py Layer 8: Compliance & Unlearning Engine
conflict.py Temporal contradiction detection + aliasing
multi_agent.py CP-WBFT Byzantine consensus
embeddings.py sentence-transformers wrapper with hash fallback
api.py FastAPI REST API
core/
types.py All data models
config.py DeltaCoreConfig and sub-configs
storage/
hot.py HOT tier (OrderedDict/Redis)
warm.py WARM tier (DuckDB + numpy)
cold.py COLD tier (Parquet)
tiered.py Tiered fabric coordinator
algorithms/
hopfield.py Modern continuous Hopfield network
tar2.py TAR2 temporal credit assignment
zettelkasten.py A-MEM Zettelkasten link graph
security/
detection.py Anomaly detection and sanitization pipeline
honeypot.py Honeypot schema manager
tests/
test_core.py 23-case test suite
benchmarks/
benchmark.py 10-benchmark quantitative suite
demo.py Interactive local demo
pyproject.toml
Extend RetrievalOrchestrator in deltacore/retrieval.py:
def _my_custom_route(self, criteria: QueryCriteria) -> list[tuple[Schema, float]]:
# return (schema, score) pairs
...Then add the router to the orchestration loop in RetrievalOrchestrator.query() and register a weight in RetrievalConfig.
Extend DeltaCore._get_domain_priors() in deltacore/engine.py:
priors_by_domain["my_domain"] = [
("pattern string", "decision template string"),
...
]Or inject schemas programmatically at runtime using dc.put_schema(schema) with source=SchemaSource.PRIOR.
The implementation follows the phased roadmap from the design specification.
Phase 1 — Core (complete)
- Event ingestion with Trust Gateway
- Causal Delta Extractor with circular chain detection
- HOT/WARM/COLD tiered storage (Redis optional + DuckDB + Parquet)
- Utility scoring with exponential decay
- REST API (FastAPI)
- Cold-start bootstrap with domain prior injection
Phase 2 — Intelligence (complete)
- Self-Consolidation Engine: NREM phase
- Belief State Module for POMDP environments
- Temporal contradiction detection and resolution
- TAR2 temporal credit assignment
- Schema versioning and rollback
- A-MEM Zettelkasten link graph
Phase 3 — Hardening (complete)
- REM phase: fidelity audit, cross-schema composition, semantic drift correction
- Adversarial security: trust pipeline, honeypot schemas, namespace isolation
- GDPR Forget API with membership inference audit and differential privacy fallback
- CP-WBFT multi-agent consensus
- SSR rehearsal scaffolding for catastrophic forgetting prevention
- Copy-on-write tier architecture
Phase 4 — Scale (planned)
- Temporal Knowledge Graph tier (Graphiti integration for time-scoped fact queries)
- Quake adaptive vector indexing (OSDI 2025) for hot-spot partition splitting
- Utility-gradient schema evolution with oscillation detection
- Compositional Schema Algebra (explicit Sequence, Parallel, Conditional operations)
- Multi-tenant namespace isolation with per-tenant budget quotas
- External benchmark validation: AMA-Bench, DMR, adversarial resistance suite
Phase 5 — Production (planned)
- Kubernetes deployment with horizontal sharding
- Grafana/Prometheus dashboards for MemoryEconomy metrics
- Agent framework adapters: LangChain, AutoGen, CrewAI
- Docker images and Helm charts
Contributions are welcome. The following areas are highest priority for external contribution:
External benchmark integration: Run DeltaCore against AMA-Bench (arxiv 2602.22769) and the Deep Memory Retrieval benchmark to produce externally validated numbers. Results would replace the design-target figures in the comparison table.
Adversarial red-teaming: Implement the AGENTPOISON attack (arxiv 2407.12784) and MINJA attack (arxiv 2503.03704) against DeltaCore's trust gateway. Measure actual attack success rate rather than the proxy injection-marker detection rate in B4.
Temporal Knowledge Graph tier: Integrate Graphiti (arxiv 2501.13956) as an optional WARM-tier backend that enables time-scoped fact queries ("what was the deployment target at t=yesterday?"). The storage interface in deltacore/storage/warm.py provides the right extension point.
Quake adaptive indexing: Replace the numpy cosine search in WarmTier.semantic_search() with Quake (OSDI 2025) for production-scale vector search with adaptive hot-partition splitting.
- Open an issue describing the change before writing code for non-trivial contributions
- Fork the repository and create a branch from
main - Write tests for any new behavior; the test suite must remain at 23/23 passing
- Run the benchmark suite to confirm no regression in B1-B10
- Submit a pull request with a description of what changed and why
- Python 3.11+ with type hints throughout
- No comments that restate what the code says; only non-obvious decisions, tricky edge cases, and performance-sensitive reasoning
- No defensive code for impossible cases; no fallback paths that cannot be exercised
- New algorithms must have a corresponding unit test that validates the mathematical property being claimed
- External dependencies must be justified; the core system currently requires only numpy, duckdb, pyarrow, scipy, fastapi, and sentence-transformers
Open a GitHub issue with:
- Python version and platform
- Minimal reproduction case
- Expected vs. actual behavior
- Relevant configuration values
- Packer et al. (2023). MemGPT: Towards LLMs as Operating Systems. arxiv:2310.08560
- Wang et al. (2025). A-MEM: Agentic Memory for LLM Agents. NeurIPS 2025. arxiv:2502.12110
- Gutierrez et al. (2024). HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models. NeurIPS 2024. arxiv:2405.14831
- Sumers et al. (2023). Cognitive Architectures for Language Agents. arxiv:2309.02427
- Jiang et al. (2025). Memory for Autonomous LLM Agents: A Survey. arxiv:2603.07670
- Chhikara et al. (2024). Mem0: Building Production-Ready AI Agents. arxiv:2504.19413
- Rauch et al. (2025). Graphiti: A Temporal Knowledge Graph for Agentic Applications. arxiv:2501.13956
- Shi et al. (2024). KARMA: Long-short Term Memory for Embodied Agents. arxiv:2409.14908
- Zhang et al. (2026). G-Memory: Hierarchical Agentic Memory for Multi-Agent Systems. arxiv:2506.07398
- Ramsauer et al. (2021). Hopfield Networks is All You Need. ICLR 2021. arxiv:2008.02217
- Munos et al. (2025). TAR2: Temporal-Agent Reward Redistribution. arxiv:2502.04864
- Han et al. (2024). Infini-Attention: Leave No Context Behind. Google DeepMind. arxiv:2404.07143
- Chen et al. (2024). AGENTPOISON: Red-teaming LLM Agents via Poisoning Memory. NeurIPS 2024. arxiv:2407.12784
- Zhong et al. (2025). MINJA: Memory Injection Attack on AI Agents. arxiv:2503.03704
- Wang et al. (2026). Memory Poisoning Attack and Defense in RAG Systems. arxiv:2601.05504
- Li et al. (2025). CP-WBFT: Confidence-Probe Weighted Byzantine Fault Tolerance for Multi-Agent Systems. arxiv:2511.10400
- Everitt et al. (2025). A Byzantine Fault Tolerant Approach toward AI Safety. arxiv:2504.14668
- Qu et al. (2024). Machine Learning to Machine Unlearning: A Survey. arxiv:2411.17126
- Wang et al. (2025). SCM: Sleep-Consolidated Memory for LLM Agents. arxiv:2604.20943
- Stickgold & Walker (2025). Systems Memory Consolidation During Sleep. PMC:12576410
- Mohoney et al. (2025). Quake: Adaptive Indexing for Vector Search. OSDI 2025.
- Liu et al. (2026). AMA-Bench: Evaluating Long-Horizon Agent Memory. arxiv:2602.22769
- ICLR 2026 Workshop on Memory in AI Agents. openreview.net
Apache License 2.0
Copyright 2026 KeySpark Technology
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.