🚀 An intelligent medical literature Q&A system based on advanced RAG technology
An intelligent document Q&A platform that integrates hybrid retrieval, RRF result fusion, and multi-model support, specifically designed for medical literature processing and knowledge retrieval.
- 🚀 Advanced RAG Architecture: Uses Retrieval-Augmented Generation (RAG) so that answers are grounded in retrieved passages, improving their accuracy, relevance, and traceability.
- 🧠 Intelligent Document Processing:
  - Multi-modal Parsing: Not only extracts text but also understands complex structures like tables and images in PDFs.
  - Dual-mode Smart Chunking: Prioritizes semantic chunking using `HybridTextSplitter` and maintains stable "recursive" chunking as a fallback to ensure context integrity.
- 🎯 Small-to-Big Retrieval & 4-Path Hybrid Retrieval:
  - Small-to-Big Architecture: Matches queries against small chunks for precision, then supplies their parent chunks as complete context, balancing cost and answer quality.
  - Multi-dimensional Recall: Simultaneously performs four retrieval paths to maximize recall and precision: `VECTOR` (semantic), `CONTENT` (full-text keyword), `SUMMARY` (summary keyword), and `KEYWORDS` (keyword list).
  - RRF Fusion: Uses the Reciprocal Rank Fusion (RRF) algorithm to merge the ranked results from multiple paths.
  - Smart Switching: Automatically switches from small chunks (for retrieval) to parent chunks (for generation) so the model receives complete context.
- 🔍 AI Enhancement & Optimization:
  - Query Transformation: Utilizes LLMs to rewrite and expand user queries for better matching with the knowledge base.
  - AI Reranking: After retrieval, a more powerful AI model performs a second-pass "close reading" and reranking to ensure the final answer is based on the most relevant, high-quality content.
- 💬 Enterprise-grade Q&A Experience:
  - Streaming Output: Real-time streaming of answers to enhance user interaction.
  - Precise Source Tracing: All answers provide clear literature sources for easy verification.
  - High Scalability: Asynchronous and modular design makes it easy to integrate new models and features.
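The RRF fusion and small-to-big switching described in the feature list can be illustrated with a minimal sketch. This is not the project's implementation: the function names `rrf_fuse` and `to_parents`, the chunk and parent IDs, and the constant `k = 60` (a commonly used default for RRF) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence


def rrf_fuse(rankings: Iterable[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) to a chunk's score, with rank
    starting at 1. k = 60 is an assumed default, not the project's setting.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def to_parents(child_ids: Sequence[str], parent_of: Dict[str, str]) -> List[str]:
    """Map small chunks to their parent chunks, keeping first-seen order."""
    seen = set()
    parents: List[str] = []
    for cid in child_ids:
        pid = parent_of[cid]
        if pid not in seen:
            seen.add(pid)
            parents.append(pid)
    return parents


# Hypothetical results from the four retrieval paths (illustrative IDs only).
paths = {
    "VECTOR": ["c1", "c2", "c3"],
    "CONTENT": ["c2", "c1", "c4"],
    "SUMMARY": ["c2", "c5"],
    "KEYWORDS": ["c3", "c2"],
}
# Hypothetical mapping from each small chunk to its parent chunk.
parent_of = {"c1": "p1", "c2": "p1", "c3": "p2", "c4": "p3", "c5": "p3"}

fused = rrf_fuse(paths.values())       # small chunks, best first
context = to_parents(fused, parent_of)  # parent chunks passed to generation

print(fused)
print(context)
```

Because RRF only uses rank positions, it can combine paths whose raw scores are not comparable (for example, vector similarity and keyword match scores). A chunk that ranks well on several paths, like `c2` here, rises to the top.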
- Python 3.9+
- Node.js 16+
- 16GB+ RAM recommended
- 100GB+ available storage space
# Install base dependencies
pip install -r requirements.txt
Copy and edit the environment variables file:
cp .env.example .env
# Edit the .env file to configure necessary parameters
python run.py
python start_celery_worker.py  # For the queue
The backend API will start at http://localhost:8001
cd frontend
npm install
npm run dev
The frontend interface will start at http://localhost:3001
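After starting both services, a quick way to confirm that they are accepting connections is a plain TCP check against the two ports mentioned above. The sketch below is an optional helper, not part of the project; the function name `is_listening` is hypothetical, and it only tests that the port is open, not that the application behind it is healthy.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Ports taken from the startup messages in this README.
    for name, port in (("backend API", 8001), ("frontend", 3001)):
        status = "up" if is_listening("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {status}")
```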
To help you understand the internal workings of our system, we provide a series of detailed component analysis documents:
- RAG System Architecture Overview - Recommended to read first for a high-level overview of the system.
- Usage Guide - Detailed instructions on how to use the web interface and API.
- Installation and Configuration Guide - Detailed environment setup and installation steps.
- API Reference - Complete API documentation and usage examples.
- System Architecture - The original technical architecture, component design, and development guide.
smart-rag/
├── app/ # Application backend code (FastAPI)
│ ├── api/ # API routes
│ ├── core/ # Core functionalities (Config, Session Mgmt, etc.)
│ ├── embeddings/ # Embeddings and text chunking module
│ ├── metadata/ # Metadata generation module (summaries, keywords)
│ ├── models/ # Pydantic data models
│ ├── processors/ # Document parsing and processing
│ ├── retrieval/ # Core retrieval module (4-path recall, fusion, reranking)
│ ├── services/ # Business logic services
│ ├── storage/ # Database and vector store
│ ├── tests/ # Test code
│ ├── utils/ # Utility functions
│ └── workflow/ # RAG workflow and LLM clients
├── data/ # Data directory (vector DB, uploads, etc.)
├── docs/ # Project documentation (Chinese)
├── docs_en/ # Project documentation (English)
│ ├── rag_components/ # RAG core component deep dives
│ ├── RAG_architecture_overview.md
│ ├── ...
│ └── ...
├── frontend/ # Frontend code (React)
├── logs/ # Log files
├── scripts/ # Utility scripts
├── .env.example # Environment variable example
├── docker-compose.yml # Docker configuration
├── requirements.txt # Python dependencies
└── run.py # Backend startup script
This project is licensed under the Apache License 2.0 with additional commercial use restrictions. See the LICENSE file for details.
- ✅ Backend Server Commercial Use: Permitted for direct commercial purposes
- ❌ SaaS Service Restriction: Not permitted without a separate commercial license
- ⚠️ Copyright Attribution: Required for all commercial services (unless separately licensed)
For commercial licensing inquiries, please contact: hqzhon@gmail.com
For questions, suggestions, or commercial licensing:
- Email: hqzhon@gmail.com
- Telegram: @hqzhon