A temporal knowledge graph framework with a fully local ML stack - no API keys required.
The core service is written in Go and provides an HTTP API for building and querying knowledge graphs. It can also be imported into other Go applications as a library. A Python client library is also provided for convenient integration with Python applications and data pipelines.
Most agentic memory libraries require external services (language models, vector databases, graph databases). Predicato provides embedded alternatives for every component, so it can run without any external dependencies.
Predicato is modular by design - every component can run locally OR connect to external services. Start with the internal stack for development, then swap in cloud services for production without changing your code.
Predicato implements a two-layer architecture that separates raw fact extraction from graph modeling:
┌─────────────────────────────────────────────────────────────────┐
│ Episodes │
│ (documents, conversations, events) │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Entity Extraction │
│ (GLiNER for NER, NLP models for relationships) │
│ │ │
│ ┌─────────────────┴──────────────────┐ │
│ ▼ ▼ │
│ Standard Extraction Extended Extraction │
│ • Entities (nodes) • Contextual triples │
│ • Relationships (triples) • Conditional rules │
│ • Embeddings • Temporal/spatial context │
└─────────────────────────────────────────────────────────────────┘
│
┌───────────────┴───────────────┐
▼ ▼
┌─────────────────────────┐ ┌─────────────────────────────────┐
│ Fact Store │ │ GraphModeler │
│ (PostgreSQL/DoltGres) │ │ (pluggable interface) │
│ │ │ │
│ • Extracted nodes │ │ ┌───────────────────────────┐ │
│ • Knowledge triples │ │ │ • ResolveEntities │ │
│ • Conditional rules │ │ │ • ResolveRelationships │ │
│ • Source documents │ │ │ • BuildCommunities │ │
│ • Vector embeddings │ │ └───────────────────────────┘ │
│ │ │ │ │
│ ExtractOnly=true │ │ ▼ │
│ stops here ───────────►│ │ ┌───────────────────────────┐ │
│ │ │ │ Knowledge Graph │ │
│ ┌───────────────────┐ │ │ │ (CozoDB/DuckDB/Ladybug) │ │
│ │ RAG Search │ │ │ │ • Resolved entities │ │
│ │ (VectorChord/JSONB)│ │ │ │ • Temporal relationships │ │
│ └───────────────────┘ │ │ │ • Communities │ │
│ │ │ │ │ │
└─────────────────────────┘ │ └───────────────────────────┘ │
└─────────────────────────────────┘
Entity extraction is the expensive step — it requires LLM calls and embedding generation for every chunk of every document. By persisting the raw extraction results in the Fact Store, this work is done once and never repeated. The graph can then be built, torn down, and rebuilt with different parameters without re-extracting.
Fact Store (Layer 1) - Stores raw extractions exactly as they were found:
- Extracted nodes (entities) with types, descriptions, and embeddings
- Knowledge triples (subject-predicate-object) with contextual fields: condition, temporal, location, certainty, scope, source attribution
- Conditional rules (IF-THEN-UNLESS patterns) from extended extraction
- Preserves source provenance (which document, which chunk, which model)
- Uses PostgreSQL/VectorChord for production-grade hybrid search
Knowledge Graph (Layer 2) - Stores resolved, interconnected knowledge:
- Entity resolution merges duplicates ("Bob Smith" = "Robert Smith")
- Temporal modeling tracks when facts were valid vs. when recorded
- Community detection groups related entities
- Graph traversal finds multi-hop relationships
This separation enables:
- Multiple graph views - Generate different graph representations from the same extracted facts (different resolution thresholds, entity type filters, or custom `GraphModeler` implementations) without re-running extraction
- Incremental updates - Re-process only changed documents
- Simpler RAG - Use `SearchFacts()` when you don't need graph features
- Audit trail - Track exactly what was extracted from each source
Every fact in Predicato has two time dimensions:
| Dimension | Field | Meaning |
|---|---|---|
| Transaction Time | `created_at` | When the fact was recorded in the system |
| Valid Time | `valid_from`, `valid_to` | When the fact was true in the real world |
This enables queries like:
- "What did we know about X as of last Tuesday?" (transaction time)
- "What was true about X during Q3 2024?" (valid time)
- "Show me facts that were recorded wrong and later corrected" (both)
Predicato supports two ingestion modes:
End-to-End Pipeline (default):
Episode -> Extract (nodes + triples) -> Resolve -> Graph
Decoupled Pipeline (with FactStore):
Episode -> Extract -> FactStore (ExtractOnly=true)
|
v (nodes, triples, rules persisted)
FactStore -> GraphModeler -> Graph
Extended Pipeline (with contextual enrichment):
Episode -> Extract -> Extended Extraction -> FactStore
(context + rules) |
v
FactStore -> GraphModeler -> Graph
The decoupled mode enables:
- Custom graph modeling: Implement the `GraphModeler` interface to customize entity resolution, relationship handling, and community detection
- Batch processing: Extract facts in bulk, then promote to graph on schedule
- Re-processing: Re-model the same facts with different parameters
- Validation: Test custom modelers before production use
When adding episodes, Predicato automatically:
- Extracts entities using GLiNER, GLiNER2 (API), or NLP model prompts
- Generates embeddings for each entity
- Compares against existing entities (cosine similarity)
- Merges duplicates above a threshold (default: 0.85)
- Creates temporal edges between resolved entities
Every relationship extracted by Predicato is stored as a knowledge triple — a unified (subject, predicate, object) record with typed endpoints and optional context fields:
┌──────────────────────────────────────────────────────────────────┐
│ Subject: "Lisinopril" SubjectType: "Drug" │
│ Predicate: "treats" │
│ Object: "Hypertension" ObjectType: "Disease" │
│ │
│ Context: │
│ Condition: "when first-line therapy is appropriate" │
│ Temporal: "ongoing" │
│ Certainty: "established" │
│ Scope: "adults" │
│ Source: "StatPearls Hypertension chapter" │
│ Confidence: 0.95 │
└──────────────────────────────────────────────────────────────────┘
Standard extraction produces triples with subject, predicate, object, types, and descriptions from every chunk. Extended extraction (ExtendedExtraction: true) makes a second LLM pass to enrich triples with contextual fields and extract conditional rules:
client.Add(ctx, episodes, &predicato.AddEpisodeOptions{
    ExtendedExtraction: true, // Enable contextual enrichment
})
Extended extraction adds:
- Contextual fields on triples: `condition`, `temporal`, `location`, `certainty`, `scope`, `source_attribution`
- Conditional rules: IF-THEN-UNLESS patterns (e.g., "IF patient has hypertension THEN prescribe ACE inhibitor UNLESS contraindicated")
All fields are stored as flat columns in the fact store — no nested JSON. This makes them directly queryable via SQL and searchable via the hybrid search pipeline.
Override the default resolution logic by implementing the `GraphModeler` interface:
type GraphModeler interface {
    ResolveEntities(ctx, input) (*EntityResolutionOutput, error)
    ResolveRelationships(ctx, input) (*RelationshipResolutionOutput, error)
    BuildCommunities(ctx, input) (*CommunityOutput, error)
}

// Use with AddEpisodeOptions
client.AddEpisode(ctx, episode, &predicato.AddEpisodeOptions{
    GraphModeler: myCustomModeler,
})

// Or set as default
client, _ := predicato.NewClient(db, llm, embedder, &predicato.Config{
    DefaultGraphModeler: myCustomModeler,
})
Validate custom modelers before use:
result, _ := client.ValidateModeler(ctx, myCustomModeler)
if !result.Valid {
log.Fatalf("Modeler validation failed: %v", result.EntityResolution.Error)
}
| Component | Internal (No API) | External Services |
|---|---|---|
| Graph Database | CozoDB (embedded), DuckDB+DuckPGQ (embedded), Ladybug (embedded) | Neo4j, Memgraph |
| Embeddings | go-candle | OpenAI-compatible APIs, AWS Bedrock, Gemini |
| Reranking | go-candle | Jina, Cohere |
| Text Generation | go-candle (SmolLM2) | OpenAI-compatible APIs |
| Entity Extraction | GLiNER (ONNX) | GLiNER2 (API) |
| Fact Storage | DoltGres (embedded) | PostgreSQL + VectorChord |
Why choose Predicato:
- Security - Don't expose your data to external services
- Run offline - Embedded database + local ML models = no network required
- Swap components freely - Same code works with local models or cloud APIs
- Bi-temporal knowledge - Track when facts were recorded AND when they were valid
- Hybrid search - Semantic + BM25 keyword + graph traversal in one query
- Production hardened - WAL recovery, circuit breakers, cost tracking, telemetry
No API keys. No external services. Just Go and CGO.
package main
import (
"context"
"log"
"time"
"github.com/soundprediction/predicato"
candleAdapter "github.com/soundprediction/predicato/pkg/candle"
"github.com/soundprediction/predicato/pkg/driver"
)
func main() {
ctx := context.Background()
// Embedded graph database — CozoDB (recommended), DuckDB, or Ladybug
db, _ := driver.NewCozoDriver("./knowledge.cozo", 1024)
defer db.Close()
// Local text generation (SmolLM2, no API)
candleClient, _ := candleAdapter.NewClient(&candleAdapter.CandleNLPConfig{
TextGenModelID: "HuggingFaceTB/SmolLM2-360M-Instruct",
})
llmClient := candleAdapter.NewLLMAdapter(candleClient, "text_generation")
defer candleClient.Close()
// Local embeddings (no API)
embedderClient, _ := candleAdapter.NewCandleEmbedderClient(&candleAdapter.CandleEmbedderConfig{
Model: "qwen/qwen3-embedding-0.6b",
Dimensions: 1024,
})
defer embedderClient.Close()
// Local reranking (no API) — uses embedder for cosine-similarity reranking
reranker := candleAdapter.NewCandleRerankerClient(embedderClient)
// Create client
client, _ := predicato.NewClient(db, llmClient, embedderClient, &predicato.Config{
GroupID: "my-app",
TimeZone: time.UTC,
}, nil)
defer client.Close(ctx)
// Add knowledge
client.Add(ctx, []predicato.Episode{{
ID: "meeting-1",
Name: "Team Standup",
Content: "Alice mentioned the API redesign is blocked on the auth team.",
Reference: time.Now(),
CreatedAt: time.Now(),
GroupID: "my-app",
}})
// Search with reranking
results, _ := client.Search(ctx, "API redesign status", nil)
// Rerank for better relevance
passages := make([]string, len(results.Nodes))
for i, node := range results.Nodes {
passages[i] = node.Summary
}
ranked, _ := reranker.Rank(ctx, "API redesign status", passages)
log.Printf("Top result: %s (score: %.2f)", ranked[0].Passage, ranked[0].Score)
}
First run downloads models (~1.7GB total). Subsequent runs use cached models.
The same interfaces work with cloud services or different embedded backends - just swap the implementations:
package main
import (
"context"
"log"
"os"
"time"
"github.com/soundprediction/predicato"
"github.com/soundprediction/predicato/pkg/driver"
"github.com/soundprediction/predicato/pkg/embedder"
"github.com/soundprediction/predicato/pkg/nlp"
)
func main() {
ctx := context.Background()
// Choose your graph database:
// Option A: CozoDB embedded (recommended)
// db, _ := driver.NewCozoDriver("./knowledge.cozo", 1024)
// Option B: DuckDB + DuckPGQ embedded
// db, _ := driver.NewDuckPGQDriver("./knowledge.duckdb", 1024)
// Option C: Ladybug embedded (legacy)
// db, _ := driver.NewLadybugDriver("./knowledge.db", 1024)
// Option D: Neo4j (external)
db, _ := driver.NewNeo4jDriver(
os.Getenv("NEO4J_URI"),
os.Getenv("NEO4J_USER"),
os.Getenv("NEO4J_PASSWORD"),
)
defer db.Close(ctx)
// OpenAI for LLM and embeddings
apiKey := os.Getenv("OPENAI_API_KEY")
llmClient, _ := nlp.NewOpenAIClient(apiKey, nlp.Config{Model: "gpt-4o-mini"})
embedderClient := embedder.NewOpenAIEmbedder(apiKey, embedder.Config{
Model: "text-embedding-3-small",
})
client, _ := predicato.NewClient(db, llmClient, embedderClient, &predicato.Config{
GroupID: "my-app",
TimeZone: time.UTC,
}, nil)
defer client.Close(ctx)
// Same API as internal stack
client.Add(ctx, []predicato.Episode{{
ID: "meeting-1",
Content: "Alice mentioned the API redesign is blocked.",
// ...
}})
}
go get github.com/soundprediction/predicato
Predicato supports three embedded graph databases. Each requires CGO and uses a build tag:
| Driver | Build Tag | Library |
|---|---|---|
| CozoDB (recommended) | `system_cozo` | cozo-lib-go |
| DuckDB + DuckPGQ | `system_duckpgq` | go-duckdb + duckpgq |
| Ladybug (legacy) | `system_ladybug` | Custom embedded graph DB |
# Clone the repository
git clone https://github.com/soundprediction/predicato
cd predicato
# Build with CozoDB (recommended)
go build -tags system_cozo ./...
# Build with DuckDB + DuckPGQ
go build -tags system_duckpgq ./...
# Build with Ladybug (requires native library download)
go generate ./cmd/main.go
make build
# Run tests
make test
# Build CLI binary
make build-cli
# Step 1: Download Ladybug library
go generate ./cmd/main.go
# Step 2: Build with CGO flags
export CGO_LDFLAGS="-L$(pwd)/cmd/lib-ladybug -Wl,-rpath,$(pwd)/cmd/lib-ladybug"
go build -tags system_ladybug ./...
Many packages work without CGO dependencies:
# Build core packages (no CGO required)
go build ./pkg/factstore/...
go build ./pkg/embedder/...
go build ./pkg/nlp/...
# Run pure Go tests
make test-nocgo
Embedded graph databases (CozoDB, DuckDB, or Ladybug):
- Go 1.21+
- GCC (for CGO compilation)
- Make (recommended for Ladybug)
- ~4GB RAM for local models
External APIs only (no CGO needed):
- Go 1.21+
- API keys for your chosen providers
| Example | Description |
|---|---|
| `examples/basic/` | Full internal stack - CozoDB + Candle + Reranking |
| `examples/chat/` | Interactive chat with local models |
| `examples/external_apis/` | Neo4j + OpenAI integration |
predicato/
├── pkg/driver/ # Graph databases (CozoDB, DuckDB+DuckPGQ, Ladybug, Neo4j, Memgraph)
├── pkg/candle/ # Local ML models via go-candle (embeddings, reranking, text gen, NER, translation)
├── pkg/embedder/ # Embedding providers (Candle, OpenAI, Gemini)
├── pkg/crossencoder/ # Reranking (Candle, Jina, LLM-based)
├── pkg/nlp/ # LLM clients (OpenAI-compatible APIs)
├── pkg/search/ # Hybrid search (semantic + BM25 + graph traversal)
├── pkg/factstore/ # Fact storage (PostgreSQL/DoltGres + VectorChord)
└── pkg/types/ # Core types (nodes, triples, edges, episodes)
We rely on Go bindings for ML models implemented in Rust. In particular we use go-candle (HuggingFace candle, pure Rust FFI) for embeddings, reranking, and text generation, and go-gline-rs for GLiNER NER. All models are pure Rust with no runtime dependencies (no libtorch, no ONNX Runtime). Predicato will automatically download models on first use and cache to ~/.cache/huggingface/.
Here is an example configuration and the model sizes involved:
| Component | Model | Download Size |
|---|---|---|
| Embeddings | qwen/qwen3-embedding-0.6b | ~600MB |
| Reranking | Embedding-based cosine similarity | (shares embedder) |
| Text Generation | SmolLM2-360M-Instruct | ~350MB |
Temporal Knowledge Graph
- Bi-temporal model: `created_at` (when recorded) vs `valid_from`/`valid_to` (when true)
- Automatic invalidation of contradicting facts
- Historical queries: "what did we know about X as of date Y?"
Hybrid Search
- Semantic similarity (cosine distance on embeddings)
- BM25 keyword matching
- Graph traversal (BFS expansion through relationships)
- 5 reranking strategies: RRF, MMR, cross-encoder, node distance, episode mentions
Production Ready
- Circuit breakers with provider fallback
- Token usage tracking and cost calculation
- Error telemetry with DB persistence
Predicato includes a fact storage system for extracted entities, knowledge triples, and conditional rules that can be used independently for RAG (Retrieval-Augmented Generation) without requiring graph queries.
The fact store persists:
- `extracted_nodes` — entities with names, types, descriptions, and embeddings
- `extracted_triples` — knowledge triples with subject/predicate/object, typed endpoints, contextual fields, and embeddings
- `extracted_rules` — conditional rules (IF-THEN-UNLESS patterns)
- `sources` — original source documents with metadata
The fact storage uses PostgreSQL-compatible databases with VectorChord for native vector similarity search:
| Mode | Database | Use Case |
|---|---|---|
| Embedded | DoltGres | Development, single-node deployment |
| External | PostgreSQL + VectorChord | Production, managed databases (RDS, Cloud SQL) |
If no external PostgreSQL is configured, Predicato automatically uses DoltGres (embedded PostgreSQL-compatible database with git-like versioning).
// Option 1: Automatic embedded DoltGres (no config needed)
client, _ := predicato.NewClient(db, llm, embedder, &predicato.Config{
GroupID: "my-app",
}, nil)
// Option 2: External PostgreSQL with VectorChord
client, _ := predicato.NewClient(db, llm, embedder, &predicato.Config{
GroupID: "my-app",
FactStoreConfig: &factstore.FactStoreConfig{
Type: "postgres",
ConnectionString: "postgres://user:pass@localhost:5432/facts?sslmode=disable",
},
}, nil)
For simpler RAG use cases that don't need relationship traversal:
// Search extracted facts directly (no graph queries)
results, _ := client.SearchFacts(ctx, "API design patterns", &types.SearchConfig{
Limit: 10,
MinScore: 0.7,
})
for _, node := range results.Nodes {
fmt.Printf("Found: %s (score: %.2f)\n", node.Name, node.Score)
}
This performs hybrid search (vector similarity + keyword matching) using VectorChord and PostgreSQL full-text search.
# Build CLI
make build-cli
# Start HTTP server
./bin/predicato server --port 8080
# API endpoints
POST /api/v1/ingest/messages # Add content
POST /api/v1/ingest/extract # Extract entities (two-stage)
POST /api/v1/ingest/promote # Promote to graph (two-stage)
POST /api/v1/search # Search knowledge graph
POST /api/v1/search/facts # Search fact store (cosine similarity)
GET  /api/v1/episodes/:id        # Get episodes
Note
GLiNER2 Support: Use the --gliner2 flag to enable the GLiNER2 entity extraction provider.
Predicato will automatically start the local GLiNER2 Python service (port 11435) if it is not already running.
This requires `uv` or `python3` to be available on your PATH. Configured models (default: fastino/gliner2-multi-v1) will be downloaded on first use.
Predicato includes an official Python client for interacting with the HTTP server.
pip install predicato
# or with uv
uv add predicato
from predicato import PredicatoClient
# Connect to the server
# Connect to the server
with PredicatoClient(base_url="http://localhost:8080") as client:
    # Add content
    result = client.add_episode(
        name="Team Meeting",
        content="Alice mentioned the API redesign is blocked on auth.",
        group_id="my-project",
        source="meeting-notes",
    )

    # Search the knowledge graph
    results = client.search(
        query="API redesign status",
        group_id="my-project",
        limit=10,
    )
    for node in results.nodes:
        print(f"- {node.name} ({node.entity_type})")
For more control over entity extraction and graph building:
from predicato import PredicatoClient
with PredicatoClient(base_url="http://localhost:8080") as client:
    # Stage 1: Extract entities and relationships
    extraction = client.extract_to_facts(
        name="Medical Article",
        content="Hypertension is treated with ACE inhibitors like lisinopril...",
        group_id="medical-kb",
        entity_types={
            "Disease": {"description": "A medical condition"},
            "Drug": {"description": "A medication"},
        },
    )
    print(f"Extracted {len(extraction.extracted_nodes)} entities")

    # Inspect/modify extracted entities before promoting...

    # Stage 2: Promote to graph with entity resolution
    result = client.promote_to_graph(source_id=extraction.source_id)
    print(f"Resolved {len(result.nodes)} entities in graph")
import asyncio
from predicato import AsyncPredicatoClient
async def main():
    async with AsyncPredicatoClient(base_url="http://localhost:8080") as client:
        result = await client.add_episode(
            name="Meeting Notes",
            content="...",
            group_id="my-project",
        )

asyncio.run(main())
Search the structured fact store directly using cosine similarity:
from predicato import PredicatoClient
with PredicatoClient(base_url="http://localhost:8080") as client:
    # Search extracted entities with vector similarity
    results = client.search_facts(
        query="hypertension treatment options",
        group_id="medical-kb",
        limit=20,
        min_score=0.5,  # Cosine similarity threshold
    )

    # Results include similarity scores
    for node, score in zip(results.nodes, results.node_scores):
        print(f"- {node.name} ({node.type}): {score:.2f}")
This is useful for RAG applications that need semantic search without full graph traversal.
See `python/` for full documentation and `python/examples/` for more examples, including StatPearls medical article ingestion.
Apache 2.0
Inspired by Graphiti by Zep.