A fully containerized Graph RAG application for cybersecurity threat intelligence, powered by local LLMs via Ollama.
┌─────────────────────────────────────────────────────────────────────────────┐
│                               Docker Network                                │
│                                                                             │
│  ┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌───────────┐  │
│  │  Frontend   │     │   Backend   │     │    Neo4j    │     │  Ollama   │  │
│  │   (Nginx)   │────▶│  (FastAPI)  │────▶│ (Graph DB)  │     │   (LLM)   │  │
│  │  Port 8501  │     │  Port 8000  │     │  Port 7474  │     │ Port 11434│  │
│  └─────────────┘     └──────┬──────┘     └─────────────┘     └───────────┘  │
│                             │                                      ▲        │
│                             └──────────────────────────────────────┘        │
│                                                                             │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                         Data Ingestion (Job)                          │  │
│  │             Loads MITRE ATT&CK data into Neo4j on startup             │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
- Image: `neo4j:5.15-community`
- Purpose: Stores threat intelligence as a knowledge graph
- Ports:
  - `7474` - Browser UI
  - `7687` - Bolt protocol
- Data: Persisted via Docker volume
- Image: `ollama/ollama:latest`
- Model: `mistral:7b`
- Purpose:
  - LLM for natural language understanding and generation
  - Embedding generation via `nomic-embed-text`
- Port: `11434`
- Note: Runs on CPU (no GPU available), 8GB memory limit
- Image: Custom Python image
- Purpose:
  - REST API for frontend
  - RAG pipeline orchestration
  - Cypher query generation from natural language
  - Graph traversal and context retrieval
- Port: `8000`
- Endpoints:
  - `POST /query` - Natural language query
  - `GET /graph/stats` - Graph statistics
  - `GET /graph/actors` - List threat actors
  - `GET /graph/techniques` - List techniques
  - `GET /graph/actors/{name}/techniques` - Get actor's techniques
  - `GET /graph/actors/{name}/attack-path` - Get actor's kill chain
  - `GET /graph/techniques/{id}/mitigations` - Get technique mitigations
  - `GET /graph/search?q=` - Search across all entities
  - `GET /graph/visualize` - Get graph data for visualization
  - `GET /health` - Health check
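As a rough illustration of calling the query endpoint from a script, the sketch below builds a `POST /query` request with the standard library only. The JSON field name `question` is an assumption made for this example; the actual request schema is whatever the backend publishes at `/docs`. `build_query_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend port listed in this README


def build_query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /query request.

    ASSUMPTION: the body field "question" is hypothetical; check the
    real schema in the backend's /docs page before relying on it.
    """
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the stack running, `send(build_query_request("What techniques does APT29 use?"))` would return the decoded response. The generous timeout is a guess that reflects CPU-only inference being slow.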
- Image: Nginx Alpine
- Purpose: Modern web UI for querying threat intelligence
- Port: `8501` (mapped from internal port 80)
- Tech Stack:
  - HTML5/CSS3/JavaScript
  - jQuery for AJAX requests
  - Chart.js for statistics visualization
  - vis-network for interactive graph visualization
  - marked.js for markdown rendering
- Features:
  - Query Page: Natural language queries with example suggestions
  - Explore Page: Browse threat actors, techniques, and search
  - Graph Map: Interactive network visualization with filtering
  - Statistics: Charts showing node/relationship distribution
- Image: Custom Python image
- Purpose: One-time job to load MITRE ATT&CK data
- Data Sources:
- MITRE ATT&CK Enterprise (STIX format)
- Relationships: Actors → Techniques → Tactics → Mitigations
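To make the ingestion step more concrete, here is a minimal sketch of how objects in an ATT&CK STIX bundle could be mapped to the node labels used in this project. It is not the project's parser. The function names are hypothetical, and using the ATT&CK external ID (for example `T1566`) as the node `id` is an assumption.

```python
# STIX object types used by MITRE ATT&CK, mapped to this project's node labels.
STIX_TYPE_TO_LABEL = {
    "intrusion-set": "ThreatActor",
    "attack-pattern": "Technique",
    "x-mitre-tactic": "Tactic",
    "malware": "Malware",
    "tool": "Tool",
    "course-of-action": "Mitigation",
}


def attack_id(obj: dict) -> str:
    """Return the ATT&CK ID (e.g. T1566) if present, else the STIX id."""
    for ref in obj.get("external_references", []):
        if ref.get("source_name") == "mitre-attack" and "external_id" in ref:
            return ref["external_id"]
    return obj["id"]


def bundle_to_nodes(bundle: dict) -> list[dict]:
    """Turn a parsed STIX bundle into a list of {label, id, name} dicts."""
    nodes = []
    for obj in bundle.get("objects", []):
        label = STIX_TYPE_TO_LABEL.get(obj.get("type"))
        if label is None:
            continue  # relationships, identities, etc. are handled elsewhere
        if obj.get("revoked") or obj.get("x_mitre_deprecated"):
            continue  # skip entries ATT&CK has retired
        nodes.append({"label": label, "id": attack_id(obj), "name": obj.get("name", "")})
    return nodes
```

Relationship objects (`type: "relationship"`) are skipped here on purpose; a real loader would turn them into edges in a second pass once all nodes exist.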
| Label | Properties | Description |
|---|---|---|
| `ThreatActor` | id, name, description, aliases, country | APT groups, criminal orgs |
| `Technique` | id, name, description, platforms, detection | ATT&CK techniques |
| `Tactic` | id, name, description, shortname | ATT&CK tactics (kill chain phases) |
| `Malware` | id, name, description, platforms | Malware families |
| `Tool` | id, name, description | Legitimate tools used maliciously |
| `Mitigation` | id, name, description | Defensive measures |
(:ThreatActor)-[:USES]->(:Technique)
(:ThreatActor)-[:USES]->(:Malware)
(:ThreatActor)-[:USES]->(:Tool)
(:Technique)-[:BELONGS_TO]->(:Tactic)
(:Technique)-[:MITIGATED_BY]->(:Mitigation)
(:Malware)-[:EMPLOYS]->(:Technique)
(:Tool)-[:EMPLOYS]->(:Technique)
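Cypher does not allow labels or relationship types to be passed as query parameters, so code that builds edge-creation statements has to interpolate them into the query text. One way to keep that safe is to check each edge against the relationship patterns above first. The sketch below illustrates the idea; `ALLOWED_EDGES`, `is_allowed_edge` and `merge_edge_cypher` are hypothetical names, and matching nodes on an `id` property is an assumption.

```python
# The seven relationship patterns from the schema, as (source, type, target).
ALLOWED_EDGES = {
    ("ThreatActor", "USES", "Technique"),
    ("ThreatActor", "USES", "Malware"),
    ("ThreatActor", "USES", "Tool"),
    ("Technique", "BELONGS_TO", "Tactic"),
    ("Technique", "MITIGATED_BY", "Mitigation"),
    ("Malware", "EMPLOYS", "Technique"),
    ("Tool", "EMPLOYS", "Technique"),
}


def is_allowed_edge(source_label: str, rel_type: str, target_label: str) -> bool:
    """True if the edge matches one of the schema's relationship patterns."""
    return (source_label, rel_type, target_label) in ALLOWED_EDGES


def merge_edge_cypher(source_label: str, rel_type: str, target_label: str) -> str:
    """Return a MERGE statement for a whitelisted edge.

    Node ids stay as query parameters ($source_id, $target_id); only the
    whitelisted label and relationship names are interpolated.
    """
    if not is_allowed_edge(source_label, rel_type, target_label):
        raise ValueError(f"edge not in schema: {source_label}-{rel_type}->{target_label}")
    return (
        f"MATCH (a:{source_label} {{id: $source_id}}), "
        f"(b:{target_label} {{id: $target_id}}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )
```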
      User Query
           │
           ▼
┌─────────────────────┐
│ 1. Query Analysis   │ ← Ollama extracts intent & entities
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ 2. Graph Retrieval  │ ← Cypher query against Neo4j
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ 3. Context Building │ ← Combine graph results + embeddings
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│ 4. Response Gen     │ ← Ollama generates final answer
└─────────────────────┘
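The four stages above can be sketched as a single function with the LLM and graph calls injected as plain callables, which keeps the flow testable without Ollama or Neo4j running. This is an illustrative outline only, not the contents of `rag_pipeline.py`: every name is hypothetical, and the embedding part of step 3 is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    analysis: dict
    context: str
    answer: str


def build_context(rows: list[dict]) -> str:
    """Step 3: flatten graph rows into a text block for the prompt."""
    if not rows:
        return "(no matching graph data)"
    return "\n".join(
        "- " + ", ".join(f"{key}: {value}" for key, value in row.items())
        for row in rows
    )


def run_pipeline(
    question: str,
    analyze: Callable[[str], dict],          # step 1: LLM extracts intent and entities
    retrieve: Callable[[dict], list[dict]],  # step 2: Cypher query against the graph
    generate: Callable[[str], str],          # step 4: LLM writes the final answer
) -> PipelineResult:
    analysis = analyze(question)
    rows = retrieve(analysis)
    context = build_context(rows)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return PipelineResult(analysis=analysis, context=context, answer=generate(prompt))
```

Passing the stages in as arguments is a design choice made for this sketch; it lets each one be swapped for a stub in tests.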
| Natural Language Query | Graph Retrieval |
|---|---|
| "What techniques does APT29 use?" | Match path from actor to techniques |
| "How do I defend against phishing?" | Find mitigations for T1566 |
| "Which actors target healthcare?" | Filter actors by target industry |
| "Show the kill chain for Lazarus" | Traverse actor → techniques → tactics |
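One plausible way to implement the mapping in the table is a set of parameterized Cypher templates keyed by intent. The sketch below covers three of the four rows; the industry filter is left out because the node table lists no industry property. Intent names and `build_query` are hypothetical, and the templates assume that `ThreatActor.name` matches the name in the question and that `Technique.id` holds the ATT&CK ID.

```python
# Hypothetical intent names mapped to Cypher templates over the schema above.
CYPHER_TEMPLATES = {
    "actor_techniques": (
        "MATCH (a:ThreatActor {name: $name})-[:USES]->(t:Technique) "
        "RETURN t.id AS id, t.name AS name"
    ),
    "technique_mitigations": (
        "MATCH (t:Technique {id: $id})-[:MITIGATED_BY]->(m:Mitigation) "
        "RETURN m.name AS name, m.description AS description"
    ),
    "actor_kill_chain": (
        "MATCH (a:ThreatActor {name: $name})-[:USES]->(t:Technique)"
        "-[:BELONGS_TO]->(k:Tactic) "
        "RETURN k.name AS tactic, collect(t.name) AS techniques"
    ),
}


def build_query(intent: str, **params) -> tuple[str, dict]:
    """Return (cypher, parameters) for a known intent.

    User-supplied values travel as parameters and are never spliced
    into the query text.
    """
    if intent not in CYPHER_TEMPLATES:
        raise KeyError(f"no template for intent {intent!r}")
    return CYPHER_TEMPLATES[intent], params
```

For example, `build_query("technique_mitigations", id="T1566")` yields the second template together with `{"id": "T1566"}`; with the official `neo4j` Python driver, that pair could be passed to `session.run(query, params)`.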
graph-rag/
├── docker-compose.yml
├── .env.example
├── Makefile  # Useful commands
├── README.md
│
├── backend/
│   ├── Dockerfile
│   ├── requirements.txt
│   └── app/
│       ├── main.py  # FastAPI app
│       ├── config.py  # Settings
│       ├── routers/
│       │   ├── query.py  # Query endpoints
│       │   └── graph.py  # Graph endpoints
│       ├── services/
│       │   ├── neo4j_service.py  # Graph operations
│       │   ├── ollama_service.py  # LLM operations
│       │   └── rag_pipeline.py  # RAG orchestration
│       └── models/
│           └── schemas.py  # Pydantic models
│
├── frontend/
│   ├── Dockerfile
│   ├── nginx.conf  # Nginx configuration
│   ├── index.html  # Main HTML page
│   ├── css/
│   │   └── style.css  # Styles
│   └── js/
│       └── app.js  # JavaScript application
│
├── ingestion/
│   ├── Dockerfile
│   ├── requirements.txt
│   ├── ingest.py  # Main ingestion script
│   └── parsers/
│       └── mitre_attack.py  # MITRE ATT&CK parser
│
└── data/
    └── .gitkeep  # Downloaded data stored here
# Neo4j
NEO4J_URI=bolt://neo4j:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=threatintel123
# Ollama
OLLAMA_HOST=http://ollama:11434
OLLAMA_MODEL=mistral:7b
OLLAMA_EMBED_MODEL=nomic-embed-text
# Backend
LOG_LEVEL=INFO

- Docker & Docker Compose installed on target machine
- At least 16GB RAM (for Ollama + Neo4j)
- ~10GB disk space
# Clone repository
git clone https://github.com/encryptedtouhid/graph-rag.git
cd graph-rag
# Copy environment file
cp .env.example .env
# Start all services
docker-compose up -d
# Watch logs
docker-compose logs -f
# Access services
# - Frontend: http://localhost:8501
# - Backend API: http://localhost:8000/docs
# - Neo4j Browser: http://localhost:7474

- Ollama init container will auto-pull `mistral:7b` and `nomic-embed-text` models
- Ingestion job loads MITRE ATT&CK data into Neo4j
- System ready when all health checks pass
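Because the model pull and ingestion can take a while on first start, a script that depends on the stack may want to wait for readiness instead of assuming it. The sketch below polls the backend's `GET /health` endpoint until it answers with HTTP 200. The retry count and delay are arbitrary, the helper names are hypothetical, and whether `/health` also reflects Neo4j and Ollama readiness is not stated in this README, so treat a success as "backend is up" only.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or non-2xx status


def wait_until_ready(
    url: str = "http://localhost:8000/health",
    check: Callable[[str], bool] = http_ok,
    attempts: int = 60,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the check passes or the attempts run out."""
    for attempt in range(attempts):
        if check(url):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # no sleep after the final failed attempt
    return False
```

The `check` and `sleep` parameters exist so the loop can be exercised without a running stack or real delays.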
make help # Show all available commands
make build # Build all Docker images
make up # Start all services in background
make up-logs # Start all services with logs
make down # Stop all services
make logs # View logs from all services
make logs-backend # View backend logs only
make status # Show status of all services
make restart # Restart all services
make clean # Stop and remove containers, volumes, images
make rebuild # Clean rebuild and start
make shell-backend # Open shell in backend container
make shell-neo4j # Open cypher-shell in Neo4j
make reset-db # Clear database and re-run ingestion

- Add IOC ingestion (AlienVault OTX)
- Add CVE/NVD data
- Implement semantic search with vector index
- Add query caching
- Add authentication
- Kubernetes deployment manifests
- GPU support for Ollama
