A local AI chatbot that answers questions about a pizza restaurant using customer reviews. Built with LangChain, Ollama, and vector embeddings - runs completely offline with no API costs.
- 100% Local AI - No internet required, no API costs
- RAG (Retrieval-Augmented Generation) - Uses actual review data to answer questions
- Interactive Chat Interface - Ask questions in a conversational loop
- Smart Review Retrieval - Finds the most relevant reviews for each question
- Vector Database - Fast similarity search using Chroma
- Local LLM - Powered by Llama 3.2 1B model
- Python 3.8 or higher
- Ollama installed and running
- Required Ollama models (see Setup section)
1. Clone or download this repository
   ```bash
   git clone <your-repo-url>
   cd local-ai-agent
   ```
2. Create a virtual environment
   ```bash
   python -m venv venv
   ```
3. Activate the virtual environment
   - Windows:
     ```bash
     venv\Scripts\activate
     ```
   - macOS/Linux:
     ```bash
     source venv/bin/activate
     ```
4. Install dependencies
   ```bash
   pip install -r requirements.txt
   ```
5. Install and start Ollama
   - Download from [ollama.ai](https://ollama.ai/)
   - Start the Ollama service
6. Pull the required models
   ```bash
   ollama pull llama3.2:1b
   ollama pull mxbai-embed-large
   ```
7. Ensure you have the data file
   - Make sure `realistic_restaurant_reviews.csv` is in the project directory
   - The vector database will be created automatically on first run
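Before the first run, a quick sanity check can save a confusing error later. This is an optional, illustrative snippet (not part of the project); it assumes the `ollama` CLI is on your PATH:

```python
# Optional pre-flight check: data file present, Ollama reachable,
# required models pulled. Illustrative only; not part of the project.
import subprocess
from pathlib import Path

assert Path("realistic_restaurant_reviews.csv").exists(), "data file missing"

# `ollama list` prints the locally available models
models = subprocess.run(
    ["ollama", "list"], capture_output=True, text=True, check=True
).stdout
for name in ("llama3.2:1b", "mxbai-embed-large"):
    assert name in models, f"missing model, run: ollama pull {name}"
print("Setup looks good.")
```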
Run the application:

```bash
python main.py
```

An example session:

```
-------------------------------
Ask your question (q to quit): What do people say about the pizza?

The reviews show that people generally enjoy the pizza! Customers mention:
- "The pizza was delicious and fresh"
- "Great variety of toppings"
- "Crust was perfectly crispy"
- "Best pizza in town"
-------------------------------
Ask your question (q to quit): How is the service?

Based on the reviews, the service gets mixed feedback:
- Some customers praise the "friendly and fast service"
- Others mention "slow during busy hours"
- "Staff was helpful and attentive"
-------------------------------
Ask your question (q to quit): q
```

Other questions to try:
- "What do people say about the pizza?"
- "How is the service?"
- "What are the most common complaints?"
- "What do customers like most?"
- "How is the atmosphere?"
- "What about the prices?"
The project's key files:

- `main.py` - Main application with the chat interface
- `vector.py` - Vector database setup and retrieval
- `config.py` - Configuration settings
- `realistic_restaurant_reviews.csv` - Sample restaurant review data
- `requirements.txt` - Python dependencies
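To show how these files fit together, here is a minimal sketch of the kind of read-eval loop `main.py` runs. The `from vector import retriever` line is an assumption about what `vector.py` exposes, and the prompt wording is illustrative, not a copy of the source:

```python
# Sketch of the interactive loop (illustrative, not the exact main.py).
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import OllamaLLM

from vector import retriever  # assumption: vector.py exposes a retriever

prompt = ChatPromptTemplate.from_template(
    "Answer using these restaurant reviews:\n{reviews}\n\nQuestion: {question}"
)
chain = prompt | OllamaLLM(model="llama3.2:1b")

while True:
    print("\n-------------------------------")
    question = input("Ask your question (q to quit): ")
    if question.strip().lower() == "q":
        break
    reviews = retriever.invoke(question)  # the 5 most relevant reviews
    print(chain.invoke({"reviews": reviews, "question": question}))
```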
- LangChain - AI framework for building applications
- Ollama - Local AI model hosting
- Chroma - Vector database for embeddings
- Pandas - Data processing
- Llama 3.2 1B - Local language model
- mxbai-embed-large - Embedding model
- Data Loading - Loads restaurant reviews from CSV
- Embedding Creation - Converts reviews to vector embeddings
- Vector Storage - Stores embeddings in Chroma database
- Query Processing - Converts user questions to embeddings
- Similarity Search - Finds most relevant reviews (top 5)
- Context Generation - Passes relevant reviews to LLM
- Answer Generation - LLM generates answer based on context
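All seven steps fit in a short script. The sketch below is a self-contained version of the pipeline using the `langchain-ollama` and `langchain-chroma` packages; the CSV column names (`Title`, `Review`) and the collection name are assumptions, and `main.py`/`vector.py` may differ in detail:

```python
# End-to-end RAG sketch (assumptions noted inline; not the exact source).
import pandas as pd
from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import OllamaEmbeddings, OllamaLLM

# 1. Data Loading (assumes "Title" and "Review" columns in the CSV)
df = pd.read_csv("realistic_restaurant_reviews.csv")

# 2-3. Embedding Creation + Vector Storage (persisted to disk by Chroma)
store = Chroma(
    collection_name="restaurant_reviews",  # illustrative name
    persist_directory="./chrome_langchain_db",
    embedding_function=OllamaEmbeddings(model="mxbai-embed-large"),
)
docs = [
    Document(page_content=f"{row['Title']} {row['Review']}")
    for _, row in df.iterrows()
]
store.add_documents(documents=docs)

# 4-5. Query Processing + Similarity Search: top 5 relevant reviews
retriever = store.as_retriever(search_kwargs={"k": 5})
question = "What do people say about the pizza?"
reviews = retriever.invoke(question)

# 6-7. Context Generation + Answer Generation
prompt = ChatPromptTemplate.from_template(
    "You answer questions about a pizza restaurant.\n"
    "Relevant reviews: {reviews}\nQuestion: {question}"
)
chain = prompt | OllamaLLM(model="llama3.2:1b")
print(chain.invoke({"reviews": reviews, "question": question}))
```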
local-ai-agent/
├── main.py # Main application
├── vector.py # Vector database setup
├── config.py # Configuration settings
├── requirements.txt # Python dependencies
├── realistic_restaurant_reviews.csv # Sample data
├── chrome_langchain_db/ # Vector database (auto-created)
└── README.md # This file
- LLM Model: `llama3.2:1b` (lightweight, fast)
- Embedding Model: `mxbai-embed-large` (high-quality embeddings)
- Retrieval Count: 5 most relevant reviews per question
You can modify all settings in `config.py`:

```python
# Model Configuration
LLM_MODEL = "llama3.2:1b"              # Change LLM model
EMBEDDING_MODEL = "mxbai-embed-large"  # Change embedding model

# Vector Database Configuration
RETRIEVAL_COUNT = 5  # Number of reviews to retrieve

# Data Configuration
CSV_FILE = "realistic_restaurant_reviews.csv"  # Change data source
```

Available models:

- LLM models: `llama3.2:1b`, `llama3.2:3b`, `llama3.1:8b`, `llama3.1:70b`
- Embedding models: `mxbai-embed-large`, `nomic-embed-text`, `all-minilm`
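With everything routed through `config.py`, switching models is a one-line edit there. As an illustration of how the settings might be consumed (the import style is an assumption about the project's code):

```python
# Illustrative use of the config values (assumed import style).
from config import CSV_FILE, EMBEDDING_MODEL, LLM_MODEL, RETRIEVAL_COUNT
from langchain_ollama import OllamaEmbeddings, OllamaLLM

embeddings = OllamaEmbeddings(model=EMBEDDING_MODEL)
model = OllamaLLM(model=LLM_MODEL)
# CSV_FILE feeds the loader; RETRIEVAL_COUNT becomes the retriever's k:
# store.as_retriever(search_kwargs={"k": RETRIEVAL_COUNT})
```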
- "Model not found" error
  - Ensure Ollama is running: `ollama serve`
  - Pull the required models: `ollama pull llama3.2:1b`
- "Chroma database" errors
  - Delete the `chrome_langchain_db` folder and restart
  - The database will be recreated automatically
- Slow performance
  - The first run is slower because it builds the vector database
  - Subsequent runs are much faster
- Memory issues
  - Try a smaller model: `ollama pull llama3.2:1b`
  - Reduce `RETRIEVAL_COUNT` in `config.py`
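Deleting the database can also be done from Python, which is handy after changing the embedding model or the data file (a small illustrative helper, not part of the project):

```python
# Reset the vector database; the next run of main.py rebuilds it.
import shutil

shutil.rmtree("chrome_langchain_db", ignore_errors=True)
```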
- First run: Takes longer to build vector database
- Subsequent runs: Much faster (database is cached)
- Model size: Llama 3.2 1B is optimized for speed
- Retrieval: Limited to 5 reviews for fast responses
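That caching behavior comes from embedding only when the persisted database is missing. A minimal sketch of the guard, assuming `vector.py` follows this common pattern (the `Review` column name is again an assumption):

```python
# Embed on the first run only; later runs reuse the persisted database.
import os

import pandas as pd
from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_ollama import OllamaEmbeddings

persist_dir = "./chrome_langchain_db"
first_run = not os.path.exists(persist_dir)  # database not built yet?

store = Chroma(
    collection_name="restaurant_reviews",
    persist_directory=persist_dir,
    embedding_function=OllamaEmbeddings(model="mxbai-embed-large"),
)
if first_run:
    df = pd.read_csv("realistic_restaurant_reviews.csv")
    docs = [Document(page_content=row["Review"]) for _, row in df.iterrows()]
    store.add_documents(documents=docs)
```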
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
This project is open source and available under the MIT License.
- Ollama: https://ollama.ai/
- LangChain: https://python.langchain.com/
- Chroma: https://www.trychroma.com/
Built with ❤️ for local AI experimentation