
An Improved Langchain RAG Tutorial (v2) with local LLMs, database updates, and testing.


rag-tutorial-v2 - Updated

This repository expands on the original rag-tutorial-v2 to add more features and improve the performance of the chatbot built with Langchain and a vector database. Below is a guide to set up the environment and run the application.


Description

Over a year ago, I attempted to create a chatbot using Langchain and a vector database. Although I could import files and parse them, I faced challenges getting useful information from the vector DB. Fast forward to today, and many repositories offer working implementations for chatting with documents.

I initially forked and tested a repo, Web-LLM-Assistant-Llama-cpp-working, but ran into issues importing larger files (greater than 200 KB). After troubleshooting, I found rag-tutorial-v2, which worked well for smaller documents and served as a base for my updates.

I expanded the project with additional features that I wanted to see. Here's a video demo of the app in action.


Environment Setup

This project is set up to run on a Windows 10 machine. Follow the instructions below to recreate the environment on your machine.

1. Clone the Repository

First, create a directory to store your repositories and clone the project.

# Navigate to the root directory where your repos will be stored
cd \
mkdir gitrepos
cd gitrepos

# Clone the repository
git clone https://github.com/MartinTorres2099/rag-tutorial-v2-updated.git

The repository will be cloned to:
C:\gitrepos\rag-tutorial-v2-updated

2. Set Up the Python Virtual Environment

Navigate to the project directory and create a Python virtual environment:

cd C:\gitrepos\rag-tutorial-v2-updated
python -m venv venv  # Run only once to create your virtual environment

3. Activate the Virtual Environment

To activate the virtual environment on Windows:

venv\Scripts\activate.bat

4. Install the Required Packages

Install the required dependencies by running:

pip install -r requirements.txt

Additionally, install Flask and Langchain:

pip install Flask
pip install langchain-community

5. Deactivate the Virtual Environment

Once you are done working, deactivate the virtual environment with:

venv\Scripts\deactivate.bat

Modifying the Embedding Function

Update get_embedding_function.py to run locally with Ollama embeddings; the Bedrock line stays commented out unless you want cloud-based embeddings:

from langchain_community.embeddings.ollama import OllamaEmbeddings
from langchain_community.embeddings.bedrock import BedrockEmbeddings

def get_embedding_function():
    # Uncomment and configure to use cloud-based embeddings
    # embeddings = BedrockEmbeddings(credentials_profile_name="default", region_name="us-east-1")
    
    embeddings = OllamaEmbeddings(model="nomic-embed-text")
    return embeddings
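The embedding function maps text to vectors that the vector store compares by similarity. As a stdlib-only illustration of that comparison (the actual math happens inside the vector database, not in this project's code), cosine similarity between two vectors looks like:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: ~1.0 for same direction, 0.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Two toy "embeddings" pointing the same way score ~1.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))
```

Real embeddings from nomic-embed-text have hundreds of dimensions, but the comparison the vector DB performs at query time is the same idea.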

Loading Documents

You can load different document types using Langchain's document loaders; see Langchain's document loader documentation for details.

To install Ollama, follow the instructions on their official website.
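Each document type has its own loader class. As a sketch of one way to route files to a loader by extension (the loader class names are real classes in langchain_community.document_loaders, but this routing helper is illustrative, not part of this repo):

```python
from pathlib import Path

# Extension -> Langchain loader class name
# (these classes exist in langchain_community.document_loaders).
LOADER_BY_EXTENSION = {
    ".pdf": "PyPDFLoader",
    ".txt": "TextLoader",
    ".md": "UnstructuredMarkdownLoader",
    ".csv": "CSVLoader",
}

def pick_loader_name(path: str) -> str:
    """Return the loader class name for a file, or raise for unsupported types."""
    ext = Path(path).suffix.lower()
    if ext not in LOADER_BY_EXTENSION:
        raise ValueError(f"No loader configured for {ext!r} files")
    return LOADER_BY_EXTENSION[ext]

print(pick_loader_name("data/report.pdf"))  # PyPDFLoader
```

In practice you would instantiate the chosen loader and call its `.load()` method to get Langchain Document objects.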


Running the Application

Prerequisites

Ensure that the Ollama application is running on your machine before starting the app. You can create a batch file to automate the process.

1. Create a Batch File to Launch the Application

Create a batch file (start_app.bat) with the following content:

@echo off
echo This will launch the RAG application
timeout /t 2

echo Changing to rag directory
cd C:\gitrepos\rag-tutorial-v2-updated
timeout /t 2

echo Activating Python virtual environment
call venv\Scripts\activate.bat
timeout /t 2

python app.py
echo Waiting for the app to close...
timeout /t 2

echo Deactivating Python virtual environment
call venv\Scripts\deactivate.bat
timeout /t 2

echo Thank you for using the RAG application, goodbye for now!
timeout /t 2

exit

2. Run the Web App

Execute the batch file to start the application. The app will run a local development server, accessible at:
http://127.0.0.1:5000/


Running the Program Without the Web Interface

You can also run the program manually without the web interface:

  1. Pull the Nomic embed model:
ollama pull nomic-embed-text
  2. Download the Mistral model:
ollama pull mistral-nemo
  3. Serve the model:
ollama serve  # Only needed if the Ollama application is not already running
  4. Run the model:
ollama run mistral-nemo
  5. Exit the model:
/bye
  6. Add or update documents in the database:
python populate_database.py
  7. Test the RAG system with known data:

Run the following to check how well the LLM answers questions based on the vector DB:

pytest -s  # Modify test_rag.py with known data
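populate_database.py only adds chunks that are not already in the database, which requires a stable ID per chunk. A minimal sketch of one way to derive such IDs (assuming each chunk carries source and page metadata, as Langchain's PDF loaders attach; the helper is illustrative, not the repo's actual code):

```python
def assign_chunk_ids(chunks: list[dict]) -> list[str]:
    """Build IDs like 'source:page:index', where index restarts on each new page."""
    ids = []
    last_page_key = None
    index = 0
    for chunk in chunks:
        page_key = f"{chunk['source']}:{chunk['page']}"
        # Same page as the previous chunk -> bump the index; new page -> reset.
        index = index + 1 if page_key == last_page_key else 0
        ids.append(f"{page_key}:{index}")
        last_page_key = page_key
    return ids

chunks = [
    {"source": "data/manual.pdf", "page": 0},
    {"source": "data/manual.pdf", "page": 0},
    {"source": "data/manual.pdf", "page": 1},
]
print(assign_chunk_ids(chunks))
# ['data/manual.pdf:0:0', 'data/manual.pdf:0:1', 'data/manual.pdf:1:0']
```

Because the IDs are deterministic, re-running the populate step can skip any chunk whose ID already exists in the vector store.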

Uninstalling the Python Virtual Environment

To uninstall the virtual environment, first deactivate it:

venv\Scripts\deactivate.bat

Then, delete the virtual environment:

rmdir /s /q venv

Run the application on your local environment

While working with Flask, the app always ran in development mode regardless of how I launched it or which variables I changed. I installed Waitress and updated app.py to use that instead:

In the virtual environment, install Waitress:

pip install waitress

Update the app.py code:

from waitress import serve

if __name__ == "__main__":
    print("Starting Flask app with Waitress...")
    serve(app, host='0.0.0.0', port=5000)

Run the Flask app with Waitress:

python app.py

Updates

The code has been updated so that after a question is answered, you can either ask another question with the loaded model or return to the main index.html and choose a different model. The previous code asked which model to use before every question.

The app can now be reached from other machines on the network by entering the IP address of the host machine. I changed the port to 8080 and am using Waitress to serve the site. The next step is to decide whether the app should search online when an answer cannot be found in the documents.

Thank you for using this project! Feel free to contribute or make improvements.
