TalkingRagBot is an AI-powered assistant designed to offer interactive conversations with features like text-to-speech, database memory, and retrieval-augmented generation. This application combines real-time messaging with deep contextual understanding, making it a versatile AI companion.
- Overview
- Features
- Diagrams
- Quick Start
- Setup and Installation
- Commands
- Usage Examples
- Project Motivation
- Technologies Used
- Contribution
- License
This structure will make it easy for users to navigate through the README and find the information they need.
- Text-to-Speech (TTS) with Read Aloud: Integrated with the Read Aloud widget for text-to-speech functionality. Users can select a Piper AI voice from Read Aloud's options or add a custom Piper voice for more personalization.
- Retrieval-Augmented Generation (RAG): Uses a Chroma vector database for recalling relevant context from past conversations.
- Custom Memory Commands: Stores personalized interactions for future reference.
Fig 1. Talking Rag Bot Overview
- Clone the repository.
- Install dependencies for frontend and backend.
- Set up the database.
- Start the application.
- Node.js and npm
- Python (version 3.12)
- PostgreSQL
- Ollama
- Rust (required for some dependencies)
- Read Aloud Extension (optional): Install the Read Aloud widget (available as a Chrome extension) for TTS functionality
- Piper TTS (optional): Piper integration in Read Aloud allows users to choose or add custom voices
- Clone the repository:
git clone https://github.com/xrgpublic/TalkingRagBot.git
cd TalkingRagBot
- Install frontend dependencies:
cd client
npm install
- Install backend dependencies:
cd ..
cd server
npm install
- Install AI dependencies:
cd ..
cd LocalAI
pip install -r requirements.txt
When running pip install -r requirements.txt
, you might encounter a wheel installation failure. If this occurs, please run the following commands to resolve the issue:
pip install --upgrade setuptools wheel
pip install playsound
- Start PostgreSQL Environment: To start the PostgreSQL environment, run the following command in your command line:
psql -U postgres
- Run the following commands in your PostgreSQL environment to set up the required database and tables:
CREATE USER mruser WITH PASSWORD 'isSuperCool' SUPERUSER;
CREATE DATABASE memory_agent;
GRANT ALL PRIVILEGES ON SCHEMA public TO mruser;
GRANT ALL PRIVILEGES ON DATABASE memory_agent TO mruser;
\c memory_agent
CREATE TABLE conversations(
id SERIAL PRIMARY KEY,
timestamp TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
prompt TEXT NOT NULL,
response TEXT NOT NULL
);
INSERT INTO conversations (timestamp, prompt, response) VALUES (CURRENT_TIMESTAMP, 'What is my name?', 'Your name is Mr User. Known online as [Redacted].');
- In the main directory
# Run bat file
webui.bat
- Install Read Aloud: Add the Read Aloud Chrome extension to your browser for TTS functionality.
- Enable Piper Voices: Within Read Aloud’s settings, choose from existing Piper AI voices or add a custom voice by following the extension's guidelines for Piper integration.
- Select Your Voice: In Read Aloud, configure your TTS settings to select your preferred Piper voice. Once configured, the chatbot will use the chosen voice for speech output when reading text aloud.
Using Read Aloud with Piper integration allows flexibility in voice options, enabling a more personalized assistant experience.
The bot supports several commands for managing memory and performing specific tasks: Note: The commands output only shows up in the CLI.
/read
: Reads the content of a specified file directly from the command line./recall <text>
: Searches memory for responses related to the provided user prompt./launch <website>
: Launches a website related to the provided prompt./forget
: Removes the last conversation stored in memory./switchdb
: Switches the database context topublic_database
./memorize <text>
: Stores the provided text as a memory./creatingFile <filename>
: Creates a new file with the specified name.
User: "/recall What's my name?"
Bot: "Your name is Mr. User, known online as [Redacted]."
User: "/launch 'Can you open Dominos pizza in my browser?'"
Bot: Opens Dominos website.
This project was a learning exercise, with a focus on functional exploration over production ready code. This was my first AI project and first React project. My goals were to get a strong understanding of how to build a fullstack React application, learn how LLMs work, and learn what tools an AI will need to properly interact with the average person(when compared to an "AI power user"). This is why you will see sprinkles of a bunch of different tools, but nothing completely implemented. I just wanted proofs of concepts that I can use as a foundation for when I make real consumer ready products.
- Frontend: React, CSS for styling
- Backend: Node.js, Python
- Database: PostgreSQL for conversation storage and memory management
- Vector Database: ChromaDB for managing embeddings
- Real-Time Communication: Socket.io for real-time message exchange
- Text-to-Speech: Read Aloud widget with Piper integration for custom voices
- AI Model: Ollama with LLaMA and heremes 3 8B as base models for conversation and nomic-embed-text for RAG
Contributions are welcome! Please fork this repository and submit a pull request if you have suggestions or improvements.
This project is licensed under the MIT License.