# MediBot

Deployed: https://med-bot-ephi.onrender.com
MediBot is an AI-powered medical chatbot that leverages state-of-the-art language models and vector search to answer user queries based on a curated set of medical PDF documents. It uses Streamlit for the user interface, LangChain for LLM orchestration, and FAISS for efficient vector search.
## Table of Contents

- [Features](#features)
- [Project Structure](#project-structure)
- [Setup Instructions](#setup-instructions)
- [How It Works](#how-it-works)
- [Customization](#customization)
- [Troubleshooting](#troubleshooting)
- [Disclaimer](#disclaimer)
- [License](#license)
## Features

- 📄 Medical PDF Knowledge Base: Answers are grounded in the provided medical PDFs.
- 🤖 Modern LLM Integration: Uses Groq LLaMA-3.1-8B-Instant for high-quality responses.
- 🔎 Semantic Search: FAISS-powered vector search for relevant context retrieval.
- 🖥️ Streamlit UI: Simple, interactive chat interface.
- 🧩 Custom Prompting: Easily modify the prompt template for different behaviors.
## Project Structure

```
.
├── .env
├── MediBot.py
├── connect_memory_with_llm.py
├── create_memory_for_llm.py
├── Pipfile
├── Pipfile.lock
├── data/
│   ├── clinical_medicine_ashok_chandra.pdf
│   └── ...
└── vectorstore/
    └── db_faiss/
```
- `MediBot.py`: Main Streamlit app for the chatbot.
- `create_memory_for_llm.py`: Script to process the PDFs and build the FAISS vector store.
- `connect_memory_with_llm.py`: Script to test the QA chain in the terminal.
- `data/`: Folder containing all source PDFs.
- `vectorstore/`: Stores the FAISS vector database.
## Setup Instructions

### 1. Clone the Repository

```bash
git clone <your-repo-url>
cd medical_chatBot
```

### 2. Install Dependencies

It is recommended to use pipenv (you can also use pip):

```bash
pipenv install
pipenv shell
```

Or, using pip:
```bash
pip install -r requirements.txt
```

### 3. Configure Your Groq API Key

Create a `.env` file in the project root with your Groq API key:
```
GROQ_API_KEY=your_groq_api_key_here
```
Alternatively, set the environment variable in your shell:
```bash
export GROQ_API_KEY=your_groq_api_key_here
```
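Both entry points need this key at runtime. A minimal sketch of the loading step, assuming `python-dotenv` is installed (the actual scripts may load it differently):

```python
import os

from dotenv import load_dotenv

# Read variables from .env into the process environment (no-op if the file is absent).
load_dotenv()

GROQ_API_KEY = os.environ.get("GROQ_API_KEY")
if not GROQ_API_KEY:
    raise RuntimeError("GROQ_API_KEY is not set; add it to .env or export it in your shell.")
```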
### 4. Add Your PDFs

Place all your medical PDF files inside the `data/` directory.
### 5. Build the Vector Store

Run the following script to process the PDFs and build the FAISS vector store:

```bash
python create_memory_for_llm.py
```

This will:
- Load all PDFs from `data/`
- Split them into text chunks
- Generate embeddings using the `sentence-transformers/all-MiniLM-L6-v2` model
- Store the embeddings in `vectorstore/db_faiss`
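For reference, a minimal sketch of this pipeline, assuming the LangChain community loaders and HuggingFace embeddings (module paths vary slightly across LangChain versions, and the real script may differ):

```python
from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

DATA_PATH = "data/"
DB_FAISS_PATH = "vectorstore/db_faiss"

# Load every PDF in data/ into LangChain Document objects.
documents = DirectoryLoader(DATA_PATH, glob="*.pdf", loader_cls=PyPDFLoader).load()

# Split documents into overlapping chunks sized for the embedding model.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)

# Embed each chunk and persist the FAISS index to disk.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
FAISS.from_documents(chunks, embeddings).save_local(DB_FAISS_PATH)
```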
### 6. Run the Chatbot

Start the Streamlit app:

```bash
streamlit run MediBot.py
```

Open the provided local URL in your browser to interact with MediBot.
## How It Works

- Document Loading: All PDFs in `data/` are loaded and split into manageable text chunks.
- Embedding Generation: Each chunk is embedded using the `sentence-transformers/all-MiniLM-L6-v2` model.
- Vector Store: Embeddings are stored in a FAISS vector database for efficient similarity search.
- Query Handling: When a user asks a question, the most relevant chunks are retrieved from FAISS.
- LLM Response: The retrieved context and the question are sent to the Groq LLaMA-3.1-8B-Instant model via the Groq API.
- Answer Display: The answer is shown in the Streamlit chat interface, along with the source documents if needed.
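A condensed sketch of the query path (retrieval plus LLM response), assuming the `langchain_groq` integration; the variable names and `k` value here are illustrative:

```python
import os

from langchain.chains import RetrievalQA
from langchain_community.vectorstores import FAISS
from langchain_groq import ChatGroq
from langchain_huggingface import HuggingFaceEmbeddings

# Load the FAISS index built by create_memory_for_llm.py.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
db = FAISS.load_local("vectorstore/db_faiss", embeddings, allow_dangerous_deserialization=True)

llm = ChatGroq(model="llama-3.1-8b-instant", api_key=os.environ["GROQ_API_KEY"])

# Retrieve the top-k most similar chunks and stuff them into the prompt.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(search_kwargs={"k": 3}),
    return_source_documents=True,
)

result = qa_chain.invoke({"query": "What are the common symptoms of anemia?"})
print(result["result"])
```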
## Customization

- Prompt Template: Modify the prompt in `MediBot.py` or `connect_memory_with_llm.py` to change the chatbot's behavior (see the sketch after this list).
- Model Selection: Change the `MODEL_NAME` variable to use a different Groq LLaMA model.
- Chunk Size: Adjust `chunk_size` and `chunk_overlap` in `create_memory_for_llm.py` for different granularity.
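For example, a custom prompt can constrain answers to the retrieved context. This is an illustrative template, assuming the chain is a LangChain stuff chain that expects `context` and `question` variables (the wording in the actual scripts may differ):

```python
from langchain_core.prompts import PromptTemplate

# Illustrative template; adjust the instructions to change the bot's behavior.
CUSTOM_PROMPT_TEMPLATE = """
Use only the information in the context to answer the user's question.
If you don't know the answer, say so; do not make one up.

Context: {context}
Question: {question}

Answer directly, without small talk.
"""

prompt = PromptTemplate(
    template=CUSTOM_PROMPT_TEMPLATE,
    input_variables=["context", "question"],
)

# Pass it to the QA chain, e.g.:
# qa_chain = RetrievalQA.from_chain_type(..., chain_type_kwargs={"prompt": prompt})
```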
## Troubleshooting

- FAISS Not Found: Ensure you have run `create_memory_for_llm.py` before starting the chatbot.
- API Key Issues: Double-check your Groq API key in `.env` or your environment.
- Dependency Errors: Make sure all dependencies are installed with `pipenv` or `pip` as described above.
- CUDA/CPU Issues: If running on a machine without a GPU, ensure the embedding model is set to run on the CPU.
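If the embedding step fails on a GPU-less machine, pinning the device usually helps; a sketch, assuming `HuggingFaceEmbeddings` is used as above:

```python
from langchain_huggingface import HuggingFaceEmbeddings

# Force the sentence-transformers model onto the CPU.
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={"device": "cpu"},
)
```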
## Disclaimer

This chatbot provides AI-generated information and is not a substitute for professional medical advice. Always consult a qualified doctor for medical concerns.
## License

This project is for educational and research purposes only. Please check the licenses of the models and datasets you use.