A Retrieval-Augmented Generation (RAG) system using LLaMA and FAISS, with a modern React frontend.
- `src/`: Backend components
  - `retriever.py`: FAISS-based document retriever
  - `generator.py`: LLaMA-based text generator using llama-cpp-python
  - `rag_pipeline.py`: RAG pipeline implementation
  - `api.py`: FastAPI backend server
- `project/`: Frontend React application
  - `src/`: React components and logic
  - `public/`: Static assets
- `data/`: Document storage
- `models/`: Model storage
  - `llama-2-7b/`: LLaMA model files
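These pieces fit together in the usual RAG loop: the retriever pulls the chunks most similar to the query out of the FAISS index, and the pipeline stitches them into a prompt for the generator. A minimal, self-contained sketch of that flow — bag-of-words cosine similarity stands in for FAISS and a real embedding model here, and the prompt is returned instead of being fed to LLaMA:

```python
import re

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

class ToyRetriever:
    """Stand-in for the FAISS-backed retriever in src/retriever.py."""

    def __init__(self, docs):
        self.docs = docs
        # Bag-of-words "embeddings"; the real retriever uses a sentence encoder.
        self.vectors = [self._embed(d) for d in docs]

    def _embed(self, text):
        counts = {}
        for tok in tokenize(text):
            counts[tok] = counts.get(tok, 0) + 1
        norm = sum(c * c for c in counts.values()) ** 0.5
        return {t: c / norm for t, c in counts.items()} if norm else counts

    def search(self, query, k=2):
        q = self._embed(query)
        # Cosine similarity plays the role of FAISS's inner-product search.
        scores = [sum(q.get(t, 0.0) * w for t, w in v.items()) for v in self.vectors]
        ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
        return [self.docs[i] for i in ranked[:k]]

def build_prompt(query, retriever):
    """Mirrors the retrieve-then-generate step in src/rag_pipeline.py."""
    context = "\n".join(retriever.search(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

In the real pipeline the returned prompt is passed to the llama-cpp-python generator; the class and function names above are illustrative, not the actual API of `src/`.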
- Create a virtual environment:
  ```bash
  python3 -m venv venv
  source venv/bin/activate
  ```
- Install dependencies:
  ```bash
  pip install -r requirements.txt
  ```
- Prepare the LLaMA model:
  - Convert your LLaMA model to GGML/GGUF format using llama.cpp
  - Place the converted model file (e.g., `ggml-model.bin`) in `models/llama-2-7b/`
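Once the model file is in place, a quick smoke test with llama-cpp-python confirms it loads. This is a sketch: the path below matches the example filename above, and the context size is an arbitrary choice, not a project setting:

```python
from pathlib import Path

MODEL_PATH = Path("models/llama-2-7b/ggml-model.bin")

if MODEL_PATH.exists():
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(model_path=str(MODEL_PATH), n_ctx=2048)
    out = llm("Q: What does FAISS do?\nA:", max_tokens=32)
    print(out["choices"][0]["text"])
else:
    print(f"Model not found at {MODEL_PATH} -- finish the conversion step first.")
```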
- Run the setup script:
  ```bash
  python setup.py
  ```
- Navigate to the project directory:
  ```bash
  cd project
  ```
- Install dependencies:
  ```bash
  npm install
  ```
- Ensure you have Docker and Docker Compose installed on your system.
- Prepare the LLaMA model:
  - Convert your LLaMA model to GGML/GGUF format using llama.cpp
  - Place the converted model file (e.g., `ggml-model.bin`) in `models/llama-2-7b/`
- Create necessary directories:
  ```bash
  mkdir -p models/llama-2-7b uploads
  touch models/llama-2-7b/.gitkeep uploads/.gitkeep
  ```
- Build and start the containers:
  ```bash
  docker-compose up --build
  ```
- Start the FastAPI server:
  ```bash
  python api_main.py
  ```
  The API will be available at http://localhost:8000
- In a new terminal, start the Vite development server:
  ```bash
  cd project
  npm run dev
  ```
  The UI will be available at http://localhost:5173
- Start the application:
  ```bash
  docker-compose up
  ```
- Access the application:
  - Frontend: http://localhost:5173
  - Backend API: http://localhost:8000
- To stop the application:
  ```bash
  docker-compose down
  ```
- Open your browser and navigate to http://localhost:5173
- Upload a PDF document using the file upload interface
- Once the document has been processed, you can start asking questions about its content
- The AI responds with answers based on the document's content
- `POST /upload`: Upload a PDF document for processing
- `POST /query`: Query the processed document
- `GET /health`: Check API health status
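With the backend running, these endpoints can be exercised from Python using only the standard library. The JSON body shape for `/query` below is an assumption — check the request models in `src/api.py` for the actual field names:

```python
import json
from urllib import request

API = "http://localhost:8000"

def build_query(question):
    # Assumed request body; verify the field name against src/api.py.
    return {"question": question}

def post_json(path, payload):
    req = request.Request(
        API + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # urlopen sends POST when data is set
        return json.loads(resp.read())

def health():
    with request.urlopen(API + "/health") as resp:  # GET /health
        return json.loads(resp.read())

# With the server up:
#   health()
#   post_json("/query", build_query("What is this document about?"))
```

Uploading the PDF is easiest through the UI; from the command line the multipart field name must match whatever `src/api.py` declares for its `UploadFile` parameter.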
- Modify the retriever's model in `src/retriever.py`
- Adjust generation parameters in `src/generator.py`
- Customize the prompt template in `src/rag_pipeline.py`
- Use a different model by updating the `model_path` in `src/generator.py`
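For the generation-parameter tweaks, llama-cpp-python exposes the usual sampling knobs through `Llama.__call__` / `create_completion`. The values below are illustrative defaults, not what `src/generator.py` currently ships with:

```python
# Sampling settings that can be passed through to the Llama call in
# src/generator.py, e.g. llm(prompt, **GENERATION_PARAMS).
# Parameter names follow llama-cpp-python; the values are illustrative.
GENERATION_PARAMS = {
    "max_tokens": 256,      # cap on tokens generated per answer
    "temperature": 0.7,     # lower -> more deterministic answers
    "top_p": 0.95,          # nucleus sampling cutoff
    "repeat_penalty": 1.1,  # discourage verbatim repetition
    "stop": ["Question:"],  # stop before the model invents a new question
}
```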
- Customize the UI components in `project/src/components/`
- Modify the API integration in `project/src/App.tsx`
- Update styles in `project/src/index.css`
To convert your LLaMA model to GGML/GGUF format:

- Clone llama.cpp:
  ```bash
  git clone https://github.com/ggerganov/llama.cpp.git
  cd llama.cpp
  ```
- Convert your model:
  ```bash
  python convert.py --outfile models/llama-2-7b/ggml-model.bin --outtype f16 /path/to/your/llama/model
  ```

For more details, refer to the llama.cpp documentation.