The Personal RAG Server is a specialized Retrieval-Augmented Generation (RAG) system designed for private, efficient knowledge management, with a particular focus on philosophical texts in German and English. It provides secure REST API access to your document collection, enhanced with advanced semantic search and integrated philosophical assistants.
- Philosophical Assistant Platform: Integrated Pinecone Assistants for conversational philosophical reasoning across four worldviews (Idealismus, Materialismus, Realismus, Spiritualismus)
- Advanced Model Selection: DeepSeek Reasoner as default for philosophical reasoning, with support for multiple LLM providers
- Hybrid Search Technology: Combines dense and sparse vector representations for superior retrieval performance
- German Language Optimization: Specialized embedding models for German philosophical texts
- Philosophical Question Detection: Automatically routes philosophical queries to specialized models
- Category-Based Organization: Organizes documents by philosophical worldviews and categories
- Document Management: Comprehensive tools for uploading, updating, and managing documents
- Advanced Metadata: Enhanced document metadata with philosophical concept tagging
- Command-Line Interface: Unified CLI for all RAG and assistant management operations
- Diagnostic Tools: Comprehensive system diagnostics and query performance analysis
- Template System: Structured philosophical response templates with worldview-specific adaptations
- Vector Database: Pinecone for efficient semantic search and assistant integration
- Embedding Models: Specialized models for German philosophical texts
- LLM Integration: DeepSeek Reasoner for advanced philosophical reasoning, with fallback support for other models
- Assistant Platform: Pinecone Assistant API for conversational philosophical experiences
- API Framework: FastAPI for modern, async Python REST API
- Storage: MongoDB for document metadata and system data
The Personal RAG Server follows a modular architecture with clear separation of concerns to ensure maintainability and extensibility.
- API Layer: FastAPI endpoints handle HTTP requests and provide OpenAPI documentation
- Service Layer: Core business logic components include:
- RAG Service: Orchestrates the RAG pipeline
- Assistant Manager: Manages Pinecone Assistants for philosophical reasoning
- Embedding Service: Generates vector embeddings using specialized models
- LLM Service: Interfaces with DeepSeek for response generation
- Vector Store Manager: Handles Pinecone vector database operations
- File Processor: Processes and chunks documents
- Knowledge Base Scanner: Scans and catalogs documents
- Database Layer:
- MongoDB: Stores metadata, conversations, and system data
- Pinecone: Vector database for semantic search with hybrid vector support and assistant integration
- CLI Layer:
- RAG CLI: Command-line interface for system management
- Assistant CLI: Comprehensive assistant management commands
- Diagnostic Tools: Query testing and system analysis tools
- Document Processing:
- Documents are uploaded via CLI or API
- Text is processed, chunked, and enhanced with metadata
- Both dense and sparse vectors are generated
- Vectors and metadata are stored in Pinecone
- Query Processing:
- User query is analyzed for philosophical content
- Query is converted to dense and sparse vectors
- Hybrid search retrieves relevant document chunks
- Retrieved context is passed to LLM with the query
- LLM generates response based on context and query
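The final steps of this pipeline can be sketched as a simple prompt-assembly helper. `build_rag_prompt` is a hypothetical name for illustration; the real RAG Service additionally applies templates, metadata, and worldview-specific instructions:

```python
def build_rag_prompt(query: str, chunks: list[str]) -> str:
    """Assemble retrieved chunks and the user query into one LLM prompt.

    Minimal sketch of the 'retrieved context is passed to LLM' step;
    chunk numbering lets the model cite its sources.
    """
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
    )
```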
- Assistant Interaction:
- Philosophical assistants maintain conversation context
- Each assistant applies worldview-specific reasoning
- DeepSeek Reasoner provides advanced philosophical analysis
- Template system ensures structured, consistent responses
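As a rough illustration of how worldview-specific context might be threaded through a conversation (all names here are illustrative assumptions; the real system delegates conversation state to the Pinecone Assistant API):

```python
from dataclasses import dataclass, field

# Illustrative worldview instructions; the actual system uses the
# template files under assistants/templates/.
WORLDVIEW_PROMPTS = {
    "Idealismus": "Argue from the primacy of ideas and spiritual forces.",
    "Materialismus": "Explain phenomena through material conditions only.",
}

@dataclass
class AssistantSession:
    """Minimal sketch of per-assistant conversation state."""
    worldview: str
    messages: list = field(default_factory=list)

    def build_llm_input(self, question: str) -> list:
        # Keep the full history so the assistant maintains context
        self.messages.append({"role": "user", "content": question})
        system = {"role": "system", "content": WORLDVIEW_PROMPTS[self.worldview]}
        return [system] + self.messages
```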
```
app/
├── api/          # REST API endpoints
├── core/         # Core configuration and utilities
├── db/           # Database connections and operations
├── models/       # Pydantic data models
├── services/     # Business logic services
└── utils/        # Utility functions

assistants/
├── config/       # Assistant configurations by worldview
├── templates/    # Philosophical response templates
└── pinecone_assistant_manager.py  # Assistant management system

scripts/
├── data_import/  # Data import utilities
├── phase2/       # Hybrid search implementation
├── rag-cli/      # CLI tools for RAG and assistant management
└── testing/      # Testing and diagnostic tools
```
- Python: 3.9+ (3.11 recommended)
- MongoDB: Running instance (local or remote)
- Pinecone: Account with API key and Assistant API access
- DeepSeek: API key for LLM access (Reasoner model recommended)
- Hardware: Apple Silicon Mac recommended for optimized embedding generation
1. Clone the repository:

```bash
git clone https://github.com/yourusername/personal-rag-server.git
cd personal-rag-server
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Create a `.env` file with your configuration:

```bash
python fix_env.py
```

4. Edit the `.env` file with your API keys and settings.
Create a .env file in the project root with the following settings:
```
# MongoDB Settings
MONGODB_URL=mongodb://localhost:27017
MONGODB_DB_NAME=rag_server

# Pinecone Settings
PINECONE_API_KEY=your_pinecone_api_key
PINECONE_ENVIRONMENT=your_pinecone_region
PINECONE_INDEX_NAME=rag-server-hybrid
PINECONE_HOST=your_pinecone_host

# DeepSeek Settings
DEEPSEEK_API_KEY=your_deepseek_api_key
DEEPSEEK_API_URL=https://api.deepseek.com/v1
DEEPSEEK_MODEL=deepseek-chat
DEEPSEEK_PHILOSOPHY_MODEL=deepseek-reasoner

# Embedding Service Settings
LOCAL_EMBEDDING_SERVICE_URL=http://localhost:8001
EMBEDDINGS_DIMENSION=1024
EMBEDDINGS_MODEL=multilingual-e5-large
```
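One way these settings might be read at startup is sketched below. This is illustrative only (the app has its own config module); the defaults mirror the values listed above:

```python
import os

def load_settings() -> dict:
    """Read a few of the environment variables above, with the
    documented defaults when they are unset."""
    return {
        "mongodb_url": os.getenv("MONGODB_URL", "mongodb://localhost:27017"),
        "pinecone_index": os.getenv("PINECONE_INDEX_NAME", "rag-server-hybrid"),
        "embeddings_dimension": int(os.getenv("EMBEDDINGS_DIMENSION", "1024")),
        "philosophy_model": os.getenv("DEEPSEEK_PHILOSOPHY_MODEL", "deepseek-reasoner"),
    }
```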
1. Start the local embedding service (if you are using one):

```bash
cd personal-embeddings-service
docker-compose up -d
```

2. Start the FastAPI server:

```bash
uvicorn app.main:app --reload
```

3. Run the integration tests to verify the setup:

```bash
python -m scripts.testing.integration_tests
```
The RAG CLI provides a unified interface for managing the RAG system and philosophical assistants. All commands are run via the `scripts/rag-cli.sh` script.
```bash
# Get general help
./scripts/rag-cli.sh --help

# Get help for a specific command group
./scripts/rag-cli.sh kb --help
./scripts/rag-cli.sh assistants --help
```

Upload documents from a directory to the vector database:
```bash
./scripts/rag-cli.sh kb upload --source /path/to/knowledge_base
```

You can restrict the upload to specific categories:

```bash
./scripts/rag-cli.sh kb upload --source /path/to/knowledge_base --categories Materialismus,Idealismus
```

Update existing documents in the database:

```bash
./scripts/rag-cli.sh kb update --source /path/to/knowledge_base --categories Materialismus --delete-existing
```

Delete documents by category:

```bash
./scripts/rag-cli.sh kb delete --categories Materialismus,Idealismus
```

Get statistics about the indexed content:

```bash
./scripts/rag-cli.sh kb stats
```

View statistics for specific categories:

```bash
./scripts/rag-cli.sh kb stats --categories Materialismus,Idealismus
```

Show all supported LLM models for philosophical assistants:

```bash
./scripts/rag-cli.sh assistants models
```

This displays the available models, with DeepSeek Reasoner as the recommended default for philosophical reasoning.
```bash
# List all assistants
./scripts/rag-cli.sh assistants list

# Create a new philosophical assistant (uses deepseek-reasoner by default)
./scripts/rag-cli.sh assistants create my-philosophy-ai Idealismus

# Create with a specific model
./scripts/rag-cli.sh assistants create my-claude-ai Materialismus --model claude-3-5-sonnet

# Delete an assistant
./scripts/rag-cli.sh assistants delete my-test-assistant
```

```bash
# Single question
./scripts/rag-cli.sh assistants chat my-philosophy-ai "Erkläre bitte den Idealismus"

# Interactive session with history
./scripts/rag-cli.sh assistants chat my-philosophy-ai "Hallo" \
    --interactive \
    --history-file chat_history.json
```

```bash
# Upload documents to an assistant
./scripts/rag-cli.sh assistants add-files my-philosophy-ai \
    books/plato.pdf books/schelling.pdf \
    --worldview Idealismus

# List files uploaded to an assistant
./scripts/rag-cli.sh assistants list-files my-philosophy-ai

# Get context snippets without full responses
./scripts/rag-cli.sh assistants context my-philosophy-ai \
    "Was bedeutet Idealismus?" \
    --top-k 3 \
    --worldview-filter Idealismus
```

You can query the system through the API or test queries via the CLI:
```bash
./scripts/rag-cli.sh search query "What is the concept of consciousness?" --category Materialismus
```

Search without a category filter:

```bash
./scripts/rag-cli.sh search query "Compare consciousness in materialism and idealism"
```

Run integration tests to verify system functionality:

```bash
python -m scripts.testing.integration_tests --verbose
```

Test a specific query:

```bash
python -m scripts.testing.integration_tests --query "What is the meaning of life?"
```

Test whether a question is classified as philosophical:

```bash
python -m scripts.testing.integration_tests --test-philosophical "Does free will exist?"
```

The system implements hybrid search, combining dense and sparse vector representations:
Dense vectors:

- Capture meaning and context regardless of exact word usage
- Use specialized German embedding models for better understanding of philosophical texts
- Handle nuanced concepts and relationships between ideas

Sparse vectors:

- Capture exact word matches and lexical information
- Implement a BM25 algorithm optimized for German philosophical texts
- Address variations in number formats (e.g., "12" vs. "zwölf")
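To make the sparse side concrete, here is a toy stand-in for the BM25 weighting, using plain term frequency. Real BM25 additionally normalizes by document length and inverse document frequency; `vocab` (a token-to-id mapping) is an assumption for this sketch:

```python
from collections import Counter

def sparse_vector(text: str, vocab: dict) -> dict:
    """Map a text to {term_id: weight} using raw term frequency.

    Tokens not in `vocab` are dropped; the result matches the
    {index: value} shape that sparse vector stores expect.
    """
    counts = Counter(t for t in text.lower().split() if t in vocab)
    return {vocab[t]: float(c) for t, c in counts.items()}
```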
```bash
# Use hybrid search with a custom alpha value (dense vs. sparse weighting)
./scripts/rag-cli.sh search query "Welches sind die 12 Weltanschauungen?" --alpha 0.6
```

The alpha parameter controls the weight balance between dense and sparse vectors:

- `alpha=1.0`: 100% dense vectors (pure semantic search)
- `alpha=0.0`: 100% sparse vectors (pure lexical search)
- `alpha=0.5`: equal weight to both (default)
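The weighting can be sketched as scaling the two query vectors before search, following the convention used in Pinecone's published hybrid-search examples (dense values scaled by alpha, sparse values by 1 − alpha); the helper name is illustrative:

```python
def weight_by_alpha(dense: list, sparse: dict, alpha: float):
    """Scale dense and sparse query vectors by alpha before search.

    alpha=1.0 keeps only the dense signal; alpha=0.0 only the sparse one.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (
        [v * alpha for v in dense],
        {i: v * (1 - alpha) for i, v in sparse.items()},
    )
```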
The system includes four specialized philosophical assistants, each representing a distinct worldview:
- Idealismus (Aurelian I. Schelling):
  - Focuses on the primacy of ideas and spiritual forces
  - Emphasizes creative processes originating in the spiritual world
  - Uses elevated, enthusiastic language reflecting Schelling's philosophical style
- Materialismus (Aloys I. Freud):
  - Analyzes behavior through material and biological conditions
  - Avoids spiritual terminology, focusing on measurable phenomena
  - Adopts a Freudian analytical approach to understanding human behavior
- Realismus (Arvid I. Steiner):
  - Balances spiritual and material perspectives
  - Emphasizes the unity of perception and conceptual understanding
  - Focuses on karma, social development, and anthroposophical insights
- Spiritualismus (Amara I. Steiner):
  - Emphasizes spiritual hierarchies and angelic beings
  - Focuses on inner development and soul exploration
  - Discusses karma, reincarnation, and spiritual development
The system uses DeepSeek Reasoner as the default model for philosophical assistants due to its:
- Strong reasoning capabilities well suited to philosophical work
- Advanced chain-of-thought processing for complex arguments
- Reliable handling of abstract concepts and logical reasoning
- Solid German/English multilingual support
- Cost-effective performance at high output quality
Alternative models are available for specific use cases:
```bash
# Compare responses across different models
./scripts/rag-cli.sh assistants create materialismus-reasoner Materialismus  # Default: deepseek-reasoner
./scripts/rag-cli.sh assistants create materialismus-claude Materialismus --model claude-3-5-sonnet
./scripts/rag-cli.sh assistants create materialismus-gpt4o Materialismus --model gpt-4o
```

The system automatically detects philosophical questions and routes them to specialized LLM models:
```bash
# Test if a question is philosophical
./scripts/rag-cli.sh diagnostics test-philosophical "What is the meaning of life?"
```

Philosophical detection is based on:
- Content analysis of the question
- Presence of philosophical terms and concepts
- Question structure and complexity
When a philosophical question is detected, the system uses the DeepSeek Reasoner model, which is optimized for philosophical reasoning.
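The routing idea can be illustrated with a simple keyword heuristic. The real detector is more sophisticated (it also weighs question structure and complexity), and the term list here is a made-up sample:

```python
# Sample terms only; the actual detector uses a richer analysis.
PHILOSOPHICAL_TERMS = {
    "consciousness", "free will", "meaning of life",
    "idealismus", "materialismus", "weltanschauung",
}

def is_philosophical(question: str) -> bool:
    q = question.lower()
    return any(term in q for term in PHILOSOPHICAL_TERMS)

def pick_model(question: str) -> str:
    # Route philosophical questions to the reasoning model
    return "deepseek-reasoner" if is_philosophical(question) else "deepseek-chat"
```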
The assistant platform includes sophisticated template handling for structured philosophical responses:
- Gedankenfehler-Formulieren: For correcting philosophical misconceptions
- Gedankenfehler-Aspekte: For considering specific aspects in corrections
- Gedankenfehler-Glossar: For creating philosophical glossaries
- Gedankenfehler-Wiederholen: For creating variations of philosophical thoughts
Each template is adapted by the assistant to their specific worldview while maintaining the required structure.
Documents are automatically enhanced with rich metadata to improve search and filtering:
Concept tagging:

- Automatic detection of philosophical concepts mentioned in documents
- Tagging with standardized terminology
- Cross-referencing between related concepts

Author metadata:

- Automatic extraction of author information from filenames
- Linking to author metadata and background information
- Grouping works by author

Number normalization:

- Detection and normalization of number formats (digits and spelled-out)
- Creation of search-optimized fields for both formats
- Support for German number variations
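A minimal sketch of the number normalization, assuming a small lookup table (the real service covers a wider range and handles inflected forms):

```python
# Hypothetical mapping for illustration only.
GERMAN_NUMBERS = {"eins": "1", "zwei": "2", "drei": "3", "zwölf": "12"}

def normalize_numbers(text: str) -> str:
    """Rewrite spelled-out German numbers as digits so that
    '12' and 'zwölf' match the same indexed field."""
    return " ".join(GERMAN_NUMBERS.get(w.lower(), w) for w in text.split())
```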
Organize your document collection by philosophical worldviews and other categories:
```bash
# List all categories
./scripts/rag-cli.sh kb list-categories

# View documents in a category
./scripts/rag-cli.sh kb list-documents --category Idealismus
```

Categories can be nested, and documents can belong to multiple categories, enabling flexible organization of complex philosophical texts.
If a document isn't appearing in search results:
```bash
# Check if the document exists in the database
./scripts/rag-cli.sh diagnostics check-document --document-id "Rudolf_Steiner#Der_menschliche_und_der_kosmische_Gedanke_Zyklus_33_[GA_151]" --expected-category "Realismus"
```

Solutions:
- Verify the document was uploaded with correct metadata
- Try searching with alternative query formulations
- Adjust the alpha parameter for hybrid search
- Use the improved document check command:

```bash
./scripts/rag-cli.sh diagnostics check-document-improved --document-id "Rudolf_Steiner#Der_menschliche_und_der_kosmische_Gedanke_Zyklus_33_[GA_151]" --expected-category "Realismus" --query-match
```

If queries with different number formats (e.g., "12" vs. "zwölf") yield different results:
Solutions:
- Ensure hybrid search is enabled (alpha between 0.3-0.7)
- Run the optimization tool to find the optimal alpha:

```bash
python -m scripts.phase2.phase2_optimization_tuning --vectorizer bm25_vectorizer.pkl --output-dir optimization_results
```

Assistant Creation Fails:
```bash
# Use dry-run to validate configuration
./scripts/rag-cli.sh assistants create test Idealismus --dry-run
```

Assistant Chat Timeouts:
- Break complex philosophical questions into simpler parts
- DeepSeek Reasoner handles complexity well but may need time for deep reasoning
- Check assistant status with the `list` command
Model Selection Issues:
```bash
# Check available models
./scripts/rag-cli.sh assistants models

# Verify model compatibility
./scripts/rag-cli.sh assistants create test Idealismus --model deepseek-reasoner --dry-run
```

If you encounter configuration issues:
```bash
# Run the integration tests to verify configuration
python -m scripts.testing.integration_tests
```

Solutions:
- Verify environment variables in .env file
- Check API keys for Pinecone and DeepSeek
- Ensure the Pinecone index has the correct dimension (1024)
- Verify Pinecone Assistant API access
The system includes comprehensive diagnostic tools:
Run a full diagnostic on the RAG system:
```bash
./scripts/rag-cli.sh diagnostics diagnose-retrieval
```

This checks:
- Document existence and retrieval
- Query performance with variations
- Category filter effectiveness
- LLM response quality
Analyze search performance with different configurations:
```bash
python -m scripts.phase2.TODO_phase2_hybrid_test --vectorizer bm25_vectorizer.pkl --category Realismus_Test --output evaluation_report.md
```

- Adjust Hybrid Search Parameters:
- For philosophical terminology: alpha = 0.4
- For general questions: alpha = 0.7
- For queries with numbers: alpha = 0.5
- Optimize Assistant Interactions:
- Use DeepSeek Reasoner for complex philosophical discussions
- Switch to faster models (deepseek-chat) for simple queries
- Leverage context retrieval for fact-checking without full generation
- Optimize Chunk Size:
- Smaller chunks (500-800 chars) for precise retrieval
- Larger chunks (1000-1500 chars) for more context
- Adjust overlap (200-300 chars) to maintain context
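A minimal sketch of fixed-size chunking with overlap, matching the ranges above (the real File Processor also respects sentence boundaries; the function name is illustrative):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list:
    """Split text into overlapping fixed-size chunks.

    The overlap repeats the tail of each chunk at the head of the next,
    so context spanning a boundary is never lost.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```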
- Batch Processing:
- Use larger batch sizes for document uploads
- Process large collections with the parallel option:
```bash
./scripts/rag-cli.sh kb upload --source /path/to/knowledge_base --parallel --workers 4
```

- Pinecone Optimization:
  - Increase the `top_k` parameter for recall-focused applications
  - Use namespace-based organization for cleaner separation
  - Implement metadata filtering to narrow results
| Command | Description | Example |
|---|---|---|
| `kb upload` | Upload documents to vector store | `./scripts/rag-cli.sh kb upload --source /path/to/docs` |
| `kb update` | Update existing documents | `./scripts/rag-cli.sh kb update --source /path/to/docs --categories Philosophy` |
| `kb delete` | Delete documents by category | `./scripts/rag-cli.sh kb delete --categories Materialismus` |
| `kb stats` | View statistics about indexed content | `./scripts/rag-cli.sh kb stats` |
| `kb list-categories` | List available categories | `./scripts/rag-cli.sh kb list-categories` |
| `kb list-documents` | List documents in a category | `./scripts/rag-cli.sh kb list-documents --category Idealismus` |
| Command | Description | Example |
|---|---|---|
| `assistants models` | List available LLM models | `./scripts/rag-cli.sh assistants models` |
| `assistants list` | List all assistants | `./scripts/rag-cli.sh assistants list --output-format json` |
| `assistants create` | Create a new philosophical assistant | `./scripts/rag-cli.sh assistants create my-ai Idealismus --model deepseek-reasoner` |
| `assistants delete` | Delete an assistant | `./scripts/rag-cli.sh assistants delete my-test-assistant` |
| `assistants chat` | Interactive chat with an assistant | `./scripts/rag-cli.sh assistants chat my-ai "Hello" --interactive` |
| `assistants list-files` | List files uploaded to an assistant | `./scripts/rag-cli.sh assistants list-files my-ai` |
| `assistants add-files` | Upload files to an assistant | `./scripts/rag-cli.sh assistants add-files my-ai file.pdf --worldview Idealismus` |
| `assistants remove-files` | Remove files from an assistant | `./scripts/rag-cli.sh assistants remove-files my-ai file-id-123` |
| `assistants context` | Get context snippets without responses | `./scripts/rag-cli.sh assistants context my-ai "query" --top-k 5` |
| Command | Description | Example |
|---|---|---|
| `search query` | Search for documents | `./scripts/rag-cli.sh search query "What is consciousness?"` |
| `search optimize` | Optimize search parameters | `./scripts/rag-cli.sh search optimize --query-type philosophical` |
| Command | Description | Example |
|---|---|---|
| `diagnostics diagnose-retrieval` | Run full system diagnosis | `./scripts/rag-cli.sh diagnostics diagnose-retrieval` |
| `diagnostics check-document` | Check document existence | `./scripts/rag-cli.sh diagnostics check-document --document-id "doc_id"` |
| `diagnostics test-philosophical` | Test philosophical detection | `./scripts/rag-cli.sh diagnostics test-philosophical "What is truth?"` |
| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/rag/query` | POST | Generate RAG response for a query |
| `/api/v1/rag/search` | POST | Search for documents |
| `/api/v1/rag/documents` | POST | Add documents to the system |
| `/api/v1/rag/documents/{document_id}` | DELETE | Delete a document |
| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/assistants` | GET/POST | List/create assistants |
| `/api/v1/assistants/{assistant_id}` | GET/PUT/DELETE | Get/update/delete an assistant |
| `/api/v1/threads` | GET/POST | List/create threads |
| `/api/v1/threads/{thread_id}/messages` | GET/POST | List/create messages |
| `/api/v1/threads/{thread_id}/runs` | POST | Run an assistant on a thread |
| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/health` | GET | System health check |
| `/api/v1/auth/token` | POST | Get authentication token |
| `/docs` | GET | OpenAPI documentation |
| Variable | Description | Default |
|---|---|---|
| `MONGODB_URL` | MongoDB connection string | `mongodb://localhost:27017` |
| `MONGODB_DB_NAME` | MongoDB database name | `rag_server` |
| `PINECONE_API_KEY` | Pinecone API key | - |
| `PINECONE_ENVIRONMENT` | Pinecone environment | - |
| `PINECONE_INDEX_NAME` | Pinecone index name | `rag-server-hybrid` |
| `PINECONE_HOST` | Pinecone host URL | - |
| `DEEPSEEK_API_KEY` | DeepSeek API key | - |
| `DEEPSEEK_API_URL` | DeepSeek API URL | `https://api.deepseek.com/v1` |
| `DEEPSEEK_MODEL` | Default DeepSeek model | `deepseek-chat` |
| `DEEPSEEK_PHILOSOPHY_MODEL` | Model for philosophical queries | `deepseek-reasoner` |
| `LOCAL_EMBEDDING_SERVICE_URL` | Local embedding service URL | `http://localhost:8001` |
| `EMBEDDINGS_DIMENSION` | Embedding vector dimension | `1024` |
| `EMBEDDINGS_MODEL` | Embedding model name | `multilingual-e5-large` |
| Model | Description | Best For |
|---|---|---|
| `deepseek-reasoner` | Advanced reasoning model (default) | Complex philosophical discussions |
| `deepseek-chat` | General-purpose model | Quick Q&A, mixed content |
| `claude-3-5-sonnet` | Anthropic's reasoning model | Alternative reasoning, analysis |
| `gpt-4o` | OpenAI's general model | General use, fallback |
| `gpt-4o-mini` | Lightweight OpenAI model | Simple queries, testing |
| Parameter | Description | Recommended Values |
|---|---|---|
| `alpha` | Dense vs. sparse weight | 0.3-0.7 (default: 0.5) |
| `top_k` | Number of results to retrieve | 10-30 (default: 15) |
| `chunk_size` | Document chunk size | 500-1500 (default: 1000) |
| `chunk_overlap` | Overlap between chunks | 100-300 (default: 200) |
```bash
# 1. Upload philosophical documents
./scripts/rag-cli.sh kb upload --source /path/to/philosophical_texts

# 2. Check available models (deepseek-reasoner will be the default)
./scripts/rag-cli.sh assistants models

# 3. Create philosophical assistants for each worldview
./scripts/rag-cli.sh assistants create idealism-reasoner Idealismus
./scripts/rag-cli.sh assistants create materialism-reasoner Materialismus
./scripts/rag-cli.sh assistants create realism-reasoner Realismus
./scripts/rag-cli.sh assistants create spiritualism-reasoner Spiritualismus

# 4. Upload relevant texts to each assistant
./scripts/rag-cli.sh assistants add-files idealism-reasoner \
    texts/plato_republic.pdf texts/schelling_nature.pdf \
    --worldview Idealismus

# 5. Start a philosophical discussion
./scripts/rag-cli.sh assistants chat idealism-reasoner \
    "Erkläre den Unterschied zwischen Platons Ideenlehre und Schellings Naturphilosophie" \
    --interactive
```

```bash
# Create assistants with different models for comparison
./scripts/rag-cli.sh assistants create ethics-reasoner Materialismus  # Uses deepseek-reasoner
./scripts/rag-cli.sh assistants create ethics-claude Materialismus --model claude-3-5-sonnet
./scripts/rag-cli.sh assistants create ethics-gpt4o Materialismus --model gpt-4o

# Test a complex philosophical question across models
question="How does materialism explain consciousness and free will in light of quantum mechanics?"
./scripts/rag-cli.sh assistants chat ethics-reasoner "$question"
./scripts/rag-cli.sh assistants chat ethics-claude "$question"
./scripts/rag-cli.sh assistants chat ethics-gpt4o "$question"

# Compare context retrieval capabilities
./scripts/rag-cli.sh assistants context ethics-reasoner \
    "consciousness materialism quantum" --top-k 5
```

The Personal RAG Server combines advanced retrieval with conversational AI, providing a complete platform for philosophical research and discussion.