Build and visualize knowledge graphs from PDF documents using NLP. No external APIs required.
./run.sh # macOS/Linux
# or
run.bat # Windows

The system will:
- Extract text from PDFs in src/docs/
- Build a knowledge graph using NLP (spaCy)
- Start server at http://localhost:8000

Features:
- Automatic Graph Building - Extracts entities and relationships from PDFs
- Interactive Visualization - D3.js powered graph with zoom, pan, and drag
- Dual Storage - JSON (default) or Neo4j for large-scale graphs
- Auto-Rebuild - Fresh graph on every run
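
To make the graph-building feature above concrete, here is a minimal sketch of one common approach: linking entities that occur in the same sentence and emitting a node-link structure of the kind D3.js force layouts typically consume. It is an illustration under assumptions, not the project's code. The project uses spaCy and NetworkX, whereas this sketch uses plain Python so it runs without them; the `build_graph` function, the `nodes`/`links` field names, and the sample entities are all hypothetical.

```python
import json
from itertools import combinations


def build_graph(sentences):
    """Build a node-link graph from per-sentence entity lists.

    Entities that appear in the same sentence are linked; the edge
    weight counts how many sentences the pair shares.
    """
    counts = {}   # entity -> number of sentences mentioning it
    weights = {}  # (source, target) -> co-occurrence count
    for entities in sentences:
        unique = sorted(set(entities))
        for name in unique:
            counts[name] = counts.get(name, 0) + 1
        for a, b in combinations(unique, 2):
            weights[(a, b)] = weights.get((a, b), 0) + 1
    return {
        "nodes": [{"id": n, "count": c} for n, c in sorted(counts.items())],
        "links": [
            {"source": a, "target": b, "weight": w}
            for (a, b), w in sorted(weights.items())
        ],
    }


# Hypothetical entity lists, standing in for NLP output on two sentences.
sample = [
    ["Ada Lovelace", "Charles Babbage"],
    ["Ada Lovelace", "Charles Babbage", "London"],
]
graph = build_graph(sample)
print(json.dumps(graph, indent=2))
```

In a real pipeline the per-sentence entity lists would come from the NLP step rather than being written by hand, and the actual JSON schema may differ from the one assumed here.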
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Download NLP model
python -m spacy download en_core_web_sm
# Run
python src/main.py

Place PDFs in src/docs/ and run the system.
Open http://localhost:8000/visualize to view the graph. Other endpoints:
- /api/graph - Graph data
- /api/stats - Statistics
- /docs - API documentation
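
The endpoints above can also be queried from a script. The following is a rough sketch using only the Python standard library; it assumes the endpoints return JSON, which is not stated here, and the helper names `endpoint_url` and `fetch_json` are hypothetical.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # server address given in this README


def endpoint_url(path, base=BASE_URL):
    """Join the base URL and an endpoint path such as '/api/graph'."""
    return urljoin(base.rstrip("/") + "/", path.lstrip("/"))


def fetch_json(path, base=BASE_URL, timeout=5):
    """GET an endpoint and decode the body as JSON (assumed format)."""
    with urlopen(endpoint_url(path, base), timeout=timeout) as response:
        return json.load(response)


print(endpoint_url("/api/graph"))
print(endpoint_url("/api/stats"))
```

With the server running, `fetch_json("/api/stats")` would return the decoded statistics. The response schema is not documented here, so inspect the result (or the /docs page) before relying on specific fields.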
# Optional Neo4j storage: create .env file
cp .env.example .env
# Edit .env
USE_NEO4J=true
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-password

- NLP: spaCy for entity extraction
- Graph: NetworkX
- Storage: JSON / Neo4j
- API: FastAPI
- Frontend: D3.js
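
The storage choice listed above (JSON by default, Neo4j when USE_NEO4J=true) could be resolved from the environment roughly as follows. This is a sketch under assumptions: the `StorageConfig` class, the `load_storage_config` function, the accepted truthy values, and the fallback defaults are hypothetical, and the project's own settings loader may work differently. It reads an environment mapping, so a .env file would first need to be loaded into the environment by whatever mechanism the project uses.

```python
import os
from dataclasses import dataclass


@dataclass
class StorageConfig:
    use_neo4j: bool
    uri: str
    username: str
    password: str


def load_storage_config(env=None):
    """Read the Neo4j settings from an environment mapping."""
    env = os.environ if env is None else env
    # Assumption: only these spellings switch Neo4j on; anything else means JSON.
    flag = env.get("USE_NEO4J", "false").strip().lower()
    return StorageConfig(
        use_neo4j=flag in {"1", "true", "yes"},
        uri=env.get("NEO4J_URI", "bolt://localhost:7687"),
        username=env.get("NEO4J_USERNAME", "neo4j"),
        password=env.get("NEO4J_PASSWORD", ""),
    )


config = load_storage_config(
    {"USE_NEO4J": "true", "NEO4J_PASSWORD": "your-password"}
)
backend = "neo4j" if config.use_neo4j else "json"
print(backend)  # prints "neo4j"
```

Passing the mapping as a parameter keeps the function easy to test; calling `load_storage_config()` with no argument reads the real process environment.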
Follows SOLID, KISS, and YAGNI principles with clean separation of concerns.