This Flask application provides an AI assistant that can extract and query information from either Wikipedia articles or PDF documents using a Retrieval-Augmented Generation (RAG) pipeline.
- Upload PDF documents or provide Wikipedia URLs for knowledge extraction
- Ask questions about the content using natural language
- Dark-themed user interface with responsive design
- Session management to maintain context between questions
- Python 3.8 or higher
- Ollama running locally with the llama3.2 model installed
  - Install from ollama.ai
  - Run `ollama pull llama3.2` to download the model
- Required Python packages (see requirements.txt)
- Clone this repository:
  - `git clone https://github.com/MONARCH1108/AI_Rag_Assist`
  - `cd AI_Rag_Assist`
- Create a virtual environment and activate it:
  - `python -m venv venv`
  - `source venv/bin/activate` (on Windows: `venv\Scripts\activate`)
- Install the required packages: `pip install -r requirements.txt`
- Make sure Ollama is running in the background with the llama3.2 model installed (a quick check is sketched after these steps).
- Start the Flask application: `python Flask_app.py`
- Open your web browser and navigate to `http://127.0.0.1:5000/`
- To run the app without the Flask interface, use `python app.py` instead
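
Before starting the app, you can optionally verify that Ollama is reachable and that the model has been pulled. The snippet below is a sketch, not part of the repository; it assumes Ollama's default local port (11434) and queries its `/api/tags` endpoint, which lists locally available models.

```python
# Optional sketch: confirm the local Ollama server is reachable and that the
# llama3.2 model has been pulled. Assumes Ollama's default port 11434.
import requests

try:
    resp = requests.get("http://localhost:11434/api/tags", timeout=5)
    resp.raise_for_status()
    models = [m["name"] for m in resp.json().get("models", [])]
    if any(name.startswith("llama3.2") for name in models):
        print("Ollama is running and llama3.2 is available.")
    else:
        print("Ollama is running, but llama3.2 was not found. Run: ollama pull llama3.2")
except requests.RequestException as exc:
    print(f"Could not reach Ollama on port 11434: {exc}")
```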
- Wikipedia Mode:
  - Enter a Wikipedia URL (e.g., https://en.wikipedia.org/wiki/Artificial_intelligence)
  - Click "Load Article" and wait for processing to complete
- PDF Mode:
  - Click the "PDF" tab
  - Select a PDF file from your computer
  - Click "Upload PDF" and wait for processing to complete
- Asking Questions:
  - Once content is loaded, the chat input will be enabled
  - Type your question and press Enter or click the send button
  - The AI will respond based on the content of the loaded document (for a programmatic equivalent, see the sketch after this list)
- Starting Over:
  - Click "Clear Session" to reset the application and load new content
This application uses:
- LangChain for document processing and the RAG pipeline
- HuggingFace embeddings (sentence-transformers/paraphrase-MiniLM-L6-v2)
- Ollama's local LLM (llama3.2) for answering questions
- ChromaDB for vector storage
- Flask for the web server
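
Put together, the question-answering flow looks roughly like the sketch below. This is a simplified illustration rather than the repository's actual code: the file name `example.pdf`, the chunk sizes, and the `langchain_community` import paths are assumptions and may differ from what Flask_app.py does or from your installed LangChain version.

```python
# Simplified RAG pipeline sketch (illustrative only; see Flask_app.py for the real code).
from langchain.chains import RetrievalQA
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma

# 1. Load a document and split it into overlapping chunks.
docs = PyPDFLoader("example.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# 2. Embed the chunks and index them in a Chroma vector store.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/paraphrase-MiniLM-L6-v2")
vectorstore = Chroma.from_documents(chunks, embeddings)

# 3. Retrieve relevant chunks and let the local llama3.2 model answer the question.
qa = RetrievalQA.from_chain_type(llm=Ollama(model="llama3.2"), retriever=vectorstore.as_retriever())
print(qa.invoke({"query": "What is this document about?"})["result"])
```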
- Processing large documents may take time, especially on systems with limited resources
- The quality of answers depends on both the content provided and the capabilities of the llama3.2 model
- For best results, ask specific questions related to the content of the loaded document