A semantic knowledge management system that combines graph-based knowledge representation with vector embeddings for information storage, retrieval, and synthesis.
Memory Engine is an experimental knowledge management system that transforms unstructured text into a structured, searchable knowledge graph. It combines graph databases with vector embeddings to create a foundation for applications that can understand, connect, and reason about information.
This is a personal open-source project developed for learning and research purposes. No guarantees are made regarding reliability, security, or suitability for production use. Use at your own risk.
This project is currently in active development (v0.5.0 - Orchestrator Integration) and should be considered experimental.
Our goal is to create a truly open and accessible knowledge management system that works with:
- Any AI model: Commercial APIs (OpenAI, Anthropic, Google) and local models (Ollama, Hugging Face)
- Any deployment: From laptop development to distributed production systems
- Any data: Text, documents, structured data, and multimedia content
We aim to eliminate dependency on paid APIs by providing full support for local model execution, making advanced knowledge management accessible to everyone.
Input: Unstructured text, documents, or data
Output: Structured knowledge with automatic relationships and semantic search capabilities
- Knowledge Ingestion: Feed in text/documents → the engine extracts entities, facts, and relationships → stores them in the graph database
- Knowledge Retrieval: Query in natural language → the engine searches semantically → returns relevant information with context
- Automatic Processing: The engine handles complexity internally: relationship discovery, quality assessment, versioning, and optimization
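The ingestion flow above can be sketched as a minimal pipeline. This is purely illustrative: `extract_entities` and `ingest` are hypothetical stand-ins, not the engine's real API, and the regex stands in for LLM-based extraction.

```python
import re

def extract_entities(text):
    # Naive stand-in for LLM-based extraction: treat capitalized
    # phrases as entities.
    return re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", text)

def ingest(graph, text):
    # Add each entity as a node, then link every co-occurring pair,
    # mimicking automatic relationship discovery.
    entities = extract_entities(text)
    for entity in entities:
        graph.setdefault(entity, set())
    for a in entities:
        for b in entities:
            if a != b:
                graph[a].add(b)

graph = {}
ingest(graph, "Machine Learning is a subset of Artificial Intelligence")
print(sorted(graph))  # entities become nodes, co-occurrence becomes edges
```

The real engine replaces the regex with LLM-driven extraction and persists the result in the configured graph backend.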
- Multi-LLM Support: 5 different LLM providers (Gemini, OpenAI, Anthropic, Ollama, HuggingFace)
- LLM Independence: Fallback chains and circuit breaker pattern for resilience
- Local Operation: Complete offline capabilities with Ollama and HuggingFace Transformers
- Automatic Relationship Discovery: Detects and creates relationships between knowledge entities
- Advanced Caching: Multi-level caching with TTL, memory limits, and intelligent invalidation
- Connection Pooling: Health monitoring and configurable pool management
- Query Optimization: Prepared statements and batch processing for high throughput
- Memory Management: Garbage collection optimization and automatic resource cleanup
- Health Monitoring: Comprehensive system health checks and service monitoring
- CLI Tools: Complete command-line interface for all management operations
- Migration Tools: Backend migration utilities with multiple strategies
- Backup & Restore: Automated backup with compression and retention policies
- Plugin Architecture: Custom storage backends, LLM providers, and embedding providers
- Data Export/Import: Multiple formats (JSON, CSV, XML, GraphML, Cypher, Gremlin, RDF)
- Metrics Collection: Prometheus-compatible metrics with counters, gauges, histograms
- Semantic Search: Multi-provider vector embeddings with modular vector stores
- Modular Storage: Choose from JanusGraph, SQLite, or JSON backends
- Quality Enhancement: Automated quality assessment and contradiction resolution
- Version Control: Complete change tracking and rollback capabilities
- Basic Security Features: Authentication, RBAC, encryption, and audit logging (educational purposes)
- Privacy Controls: Fine-grained knowledge privacy levels and access control
- Flexible Integration: MCP (Module Communication Protocol) interface for external systems
- Agent Support: Google ADK integration for conversational knowledge interactions
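The fallback chain and circuit breaker mentioned under LLM Independence can be illustrated with a simplified sketch. The names here (`CircuitBreaker`, `call_with_fallback`) are hypothetical and much simpler than the engine's actual implementation:

```python
class CircuitBreaker:
    """Opens after `threshold` consecutive failures, so the provider is skipped."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self):
        return self.failures >= self.threshold

    def record(self, success):
        # Any success resets the count; a failure increments it.
        self.failures = 0 if success else self.failures + 1

def call_with_fallback(providers, prompt):
    # providers: list of (name, callable, CircuitBreaker) tried in order.
    for name, call, breaker in providers:
        if breaker.open:
            continue  # skip providers whose breaker has tripped
        try:
            result = call(prompt)
            breaker.record(success=True)
            return name, result
        except Exception:
            breaker.record(success=False)
    raise RuntimeError("All providers failed or unavailable")
```

In practice you would register your preferred providers (for example Gemini, then OpenAI, then a local Ollama model) in order, so a commercial API outage degrades gracefully to local inference.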
- Python 3.8+
- Docker & Docker Compose (optional, for JanusGraph/Milvus)
- At least one LLM provider API key:
- Google Gemini API key
- OpenAI API key
- Anthropic API key
- Or use local models with Ollama or HuggingFace (no API key needed)
```bash
# Clone the repository
git clone https://github.com/Celebr4tion/memory-engine.git
cd memory-engine

# Run automated setup
./scripts/setup.sh
```
The setup script will:
- Check Python version compatibility
- Create virtual environment
- Install dependencies
- Create configuration template
- Set up development tools
```bash
# Edit the .env file created by setup.
# Set API keys for your preferred LLM providers (at least one is required).
GOOGLE_API_KEY="your-gemini-api-key"        # For Gemini
OPENAI_API_KEY="your-openai-api-key"        # For OpenAI GPT
ANTHROPIC_API_KEY="your-anthropic-api-key"  # For Claude
HUGGINGFACE_API_KEY="your-hf-api-key"       # For the HuggingFace API (optional)

# Optional: set the environment (defaults to development)
ENVIRONMENT="development"
```
For production storage backends:
```bash
# Start JanusGraph and Milvus (optional, for production storage)
cd docker
docker-compose up -d

# Wait for services to initialize (2-3 minutes)
docker-compose logs -f
```
For development, you can use lightweight storage backends (SQLite/JSON) that don't require external services.
```python
from memory_core.core.knowledge_engine import KnowledgeEngine
from memory_core.model.knowledge_node import KnowledgeNode

# Initialize the system
engine = KnowledgeEngine()
engine.connect()

# Create knowledge from text
node = KnowledgeNode(
    content="Machine learning is a subset of artificial intelligence",
    source="AI Textbook",
    rating_truthfulness=0.9,
)

# Save it to the knowledge graph
node_id = engine.save_node(node)
print(f"Created knowledge node: {node_id}")

# Retrieve and explore
retrieved = engine.get_node(node_id)
print(f"Content: {retrieved.content}")
```
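Under the hood, semantic retrieval compares embedding vectors, and cosine similarity is the usual measure. A plain-Python sketch (the real engine delegates this to its vector store and uses embeddings with hundreds of dimensions, not the toy 3-dimensional vectors below):

```python
import math

def cosine_similarity(a, b):
    # Ranges from -1 (opposite direction) to 1 (identical direction).
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "embeddings": related concepts point in similar directions.
ml = [0.9, 0.1, 0.3]
ai = [0.8, 0.2, 0.4]
cooking = [0.1, 0.9, 0.0]
print(cosine_similarity(ml, ai) > cosine_similarity(ml, cooking))  # True
```

This is why a natural-language query can match knowledge that shares no exact keywords with it: closeness in embedding space stands in for closeness in meaning.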
Memory Engine includes a comprehensive CLI for production management:
```bash
# Initialize a new Memory Engine instance
memory-engine init --backend=sqlite --embedding=sentence_transformers

# Check system health
memory-engine health-check --detailed

# Migrate between storage backends
memory-engine migrate --from=sqlite --to=janusgraph --verify

# Export knowledge graph data
memory-engine export --format=json --output=backup.json --include-metadata

# Import data from various formats
memory-engine import --file=data.json --merge-duplicates

# Create system backups
memory-engine backup --strategy=full --compression=gzip

# Restore from backup
memory-engine restore --backup=backup_12345 --clear-existing

# Manage plugins
memory-engine plugins list --type=storage
memory-engine plugins install custom-backend

# Configuration management
memory-engine config show --section=storage
memory-engine config set storage.backend janusgraph
memory-engine config validate

# System status
memory-engine status
memory-engine version

# Orchestrator integration (v0.5.0+)
# Start streaming MCP operations
memory-engine mcp stream-query --query="knowledge about AI" --batch-size=50

# Manage the event system
memory-engine events list --status=pending
memory-engine events replay --from-timestamp=1234567890

# Module registry management
memory-engine modules list --capabilities
memory-engine modules register my-custom-module

# Advanced GraphQL-like queries
memory-engine query build --type=nodes --filter="content contains 'AI'" --limit=10
memory-engine query execute --query-file=complex_query.json
```
| Document | Description |
|---|---|
| Setup Guide | Complete installation and configuration instructions |
| Configuration | Basic configuration and environment setup |
| Advanced Configuration | Advanced configuration system |
| Architecture | System architecture and component interactions |
| Project Structure | Detailed project organization and structure |
| API Reference | Complete API documentation, including the MCP interface |
| Security Framework | Authentication, RBAC, encryption, and privacy controls |
| Troubleshooting | Common issues and solutions |
Explore practical examples in the `examples/` directory:
- Basic Usage: Core operations and workflows
- Knowledge Extraction: Text processing and knowledge extraction
- MCP Integration: Using the Module Communication Protocol
- Security Framework: Authentication, RBAC, encryption, and privacy controls
- Advanced Queries: Complex querying and analytics
- Knowledge Synthesis: Question answering and insight discovery
```bash
# Ensure infrastructure is running
cd docker && docker-compose up -d

# Run the basic usage example
python examples/basic_usage.py

# Run the knowledge extraction demo
python examples/knowledge_extraction.py

# Test the MCP interface
python examples/mcp_client_example.py

# Try the configuration system
python examples/config_example.py
```
Memory Engine includes a comprehensive test suite organized by type:
```bash
# Run all tests
./scripts/test.sh all

# Run only unit tests (fast, no external dependencies)
./scripts/test.sh unit

# Run integration tests (requires JanusGraph and Milvus)
./scripts/test.sh integration

# Run tests with a coverage report
./scripts/test.sh coverage

# Run a specific test file
./scripts/test.sh --file config_manager
```
Test organization:
- Unit Tests (`tests/unit/`): Fast, isolated tests
- Integration Tests (`tests/integration/`): Tests requiring external services
- Component Tests (`tests/`): End-to-end component testing
Memory Engine uses a layered architecture:
```
┌─────────────────────────────────────────────────────────────────┐
│                        Application Layer                        │
├─────────────────┬─────────────────┬─────────────────┬───────────┤
│   Python API    │  MCP Interface  │ Knowledge Agent │ REST API  │
├─────────────────┴─────────────────┴─────────────────┴───────────┤
│                      Knowledge Engine Core                      │
├─────────────────┬─────────────────┬─────────────────┬───────────┤
│    Knowledge    │  Relationship   │   Versioning    │  Rating   │
│   Processing    │   Extraction    │     Manager     │  System   │
├─────────────────┼─────────────────┼─────────────────┼───────────┤
│   Graph Store   │  Vector Store   │    Embedding    │  LLM API  │
│  (JanusGraph)   │    (Milvus)     │     Manager     │ (Gemini)  │
└─────────────────┴─────────────────┴─────────────────┴───────────┘
```
- Modular Graph Storage: Multiple backend options (JanusGraph, SQLite, JSON file)
- Vector Database (Milvus): Enables semantic similarity search
- Embedding System: Generates and manages vector representations
- Processing Pipeline: Extracts and structures knowledge from text
- Versioning System: Tracks changes and enables rollbacks
- MCP Interface: Standardized API for external integration
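The versioning component's change tracking can be sketched in a few lines. This is a deliberately simplified, hypothetical model; the real versioning system persists revisions in the graph store rather than in memory:

```python
class VersionedNode:
    """Keeps every revision of a node's content so changes can be rolled back."""

    def __init__(self, content):
        self.history = [content]

    @property
    def content(self):
        # The current content is always the latest revision.
        return self.history[-1]

    def update(self, new_content):
        self.history.append(new_content)

    def rollback(self):
        # Drop the latest revision, but never roll back past the first one.
        if len(self.history) > 1:
            self.history.pop()
        return self.content

node = VersionedNode("ML is part of AI")
node.update("Machine learning is a subset of AI")
node.rollback()
print(node.content)  # back to the original revision
```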
Choose the storage backend that fits your deployment needs:
- JanusGraph: Production-grade distributed graph database
- SQLite: Single-user deployments with SQL capabilities
- JSON File: Development and testing with human-readable storage
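To give a sense of what a pluggable backend involves, here is a toy file-backed store in the spirit of the JSON option. It is illustrative only: the class and method names are assumptions, not the actual storage backend interface.

```python
import json
import os
import tempfile

class JsonFileStore:
    """Minimal file-backed node store: human-readable, handy for development."""

    def __init__(self, path):
        self.path = path
        self.nodes = {}
        if os.path.exists(path):
            with open(path) as f:
                self.nodes = json.load(f)

    def save_node(self, node_id, data):
        self.nodes[node_id] = data
        with open(self.path, "w") as f:
            json.dump(self.nodes, f, indent=2)  # human-readable on disk

    def get_node(self, node_id):
        return self.nodes.get(node_id)

path = os.path.join(tempfile.gettempdir(), "memory_engine_demo.json")
store = JsonFileStore(path)
store.save_node("n1", {"content": "hello"})
print(JsonFileStore(path).get_node("n1"))  # survives a reload from disk
```

The same interface idea (save/get by node id) is what lets backends as different as JanusGraph and a flat JSON file sit behind one engine.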
| Component | Technology | Purpose |
|---|---|---|
| Graph Storage | JanusGraph / SQLite / JSON | Knowledge relationships |
| Vector Database | Milvus / ChromaDB / NumPy | Similarity search |
| LLM Providers | Gemini / OpenAI / Anthropic / Ollama / HuggingFace | Knowledge extraction |
| Embedding Providers | Gemini / OpenAI / Sentence Transformers / Ollama | Vector generation |
| Agent Framework | Google ADK | Conversational interfaces |
| Web Framework | FastAPI | REST API endpoints |
| Language | Python 3.8+ | Core implementation |
```bash
# Unit tests only
pytest tests/ -k "not integration" -v

# All tests (requires infrastructure)
pytest tests/ -v

# With coverage
pytest tests/ --cov=memory_core --cov-report=html
```
```bash
# Install development dependencies
pip install pytest pytest-cov black isort mypy

# Format code
black memory_core/ tests/
isort memory_core/ tests/

# Type checking
mypy memory_core/

# Pre-commit hooks
pip install pre-commit
pre-commit install
```
Performance characteristics will vary depending on your hardware, data complexity, and configuration. We recommend testing with your specific use case and data to establish realistic benchmarks.
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes with tests
- Ensure all tests pass (`pytest`)
- Format the code (`black . && isort .`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Code Quality: All code must pass linting and type checking
- Testing: Maintain >90% test coverage
- Documentation: Update docs for any API changes
- Performance: Benchmark performance-critical changes
This project is licensed under the Hippocratic License 3.0, an ethical source license that promotes responsible use of software while protecting human rights and environmental sustainability.
- Documentation: Check the `docs/` directory
- Issues: Report bugs or request features via GitHub Issues
- Discussions: Join conversations in GitHub Discussions
- Troubleshooting: See the troubleshooting guide
- Contributing: See CONTRIBUTING.md for guidelines
- Code of Conduct: Please read our CODE_OF_CONDUCT.md
- Security: Report security issues via SECURITY.md
- Development Status: Alpha version; breaking changes expected
- Documentation: Basic setup and usage guides available
- Testing: Core functionality tested, expanding coverage
- Stability: Experimental; not recommended for production use yet
Memory Engine: transforming information into intelligence