Q&A-and-RAG-with-SQL-and-TabularData

Q&A-and-RAG-with-SQL-and-TabularData is a chatbot project that uses GPT-3.5, LangChain, SQLite, and ChromaDB to let users interact with SQL databases, CSV files, and XLSX files (perform Q&A and RAG) in natural language.

Key NOTE: Do NOT connect the chatbot to a SQL database with WRITE privileges. Grant READ access only and limit its scope. Otherwise a user could manipulate the data (e.g., ask the chain to delete data).
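
For the SQLite databases used in this project, one way to enforce the note above is to open the file in read-only mode. The following is a minimal sketch, not code from this repository: the helper name `open_read_only` is hypothetical, and it relies only on the standard-library `sqlite3` module and SQLite's `mode=ro` URI parameter.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open an existing SQLite file so that every write attempt is rejected.

    Hypothetical helper for illustration; it is not part of the project.
    """
    # Path.as_uri() builds a file: URI; "mode=ro" asks SQLite for a
    # read-only connection, so INSERT/UPDATE/DELETE/DROP raise
    # sqlite3.OperationalError instead of changing the data.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

A connection created this way can be handed to whatever component executes the generated SQL, for example `open_read_only("data/sqldb.db")`. A read-only connection still lets users read every table in the file, so limiting the scope of the data remains a separate step.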

Features:

  • Chat with SQL data.
  • Chat with preprocessed CSV and XLSX data.
  • Chat with CSV and XLSX files uploaded through the user interface during a session.
  • RAG with tabular datasets.

YouTube video: TBD

Main underlying techniques used in this chatbot:

  • LLM chains and agents
  • GPT function calling
  • Retrieval-augmented generation (RAG)
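
To make the function-calling item concrete, here is a small sketch of the application side of that pattern: a tool description in the JSON-schema shape that OpenAI's chat API accepts for tools, plus a dispatcher that runs the call the model asks for. Everything named here (`RUN_SQL_TOOL`, `run_sql_query`, `dispatch_tool_call`) is an assumption made for illustration, not the project's actual code, and no API request is made.

```python
import json
import sqlite3

# Hypothetical tool description; the model would receive this and may answer
# with a tool call carrying the function name and JSON-encoded arguments.
RUN_SQL_TOOL = {
    "type": "function",
    "function": {
        "name": "run_sql_query",
        "description": "Run a single SELECT statement and return the rows.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "A single SQLite SELECT statement.",
                }
            },
            "required": ["query"],
        },
    },
}


def run_sql_query(conn: sqlite3.Connection, query: str) -> list:
    """Execute one SELECT statement and return all rows."""
    # Coarse guard only; a read-only connection is the real protection.
    if not query.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(query).fetchall()


def dispatch_tool_call(conn: sqlite3.Connection, name: str, arguments_json: str) -> str:
    """Run the requested tool and return its result as a JSON string."""
    if name != RUN_SQL_TOOL["function"]["name"]:
        raise ValueError(f"unknown tool: {name}")
    arguments = json.loads(arguments_json)
    rows = run_sql_query(conn, arguments["query"])
    # The JSON string would be sent back to the model as the tool result.
    return json.dumps(rows)
```

In a full loop, the string returned by `dispatch_tool_call` is appended to the conversation so the model can phrase the final answer in natural language.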

Models used in this chatbot:

Requirements:

  • Operating System: Linux or Windows. (I am running the project on Linux via WSL on Windows.)
  • OpenAI or Azure OpenAI Credentials: Required for GPT functionality.

Installation:

  • Ensure Python is installed, then create a virtual environment and install the required dependencies:
sudo apt update && sudo apt upgrade
python3 -m venv sql-raggpt-env
git clone <the repository>
cd SQL-RAG-GPT
source ...Path to the environment/sql-raggpt-env/bin/activate
pip install -r requirements.txt

Execution:

  1. To prepare the SQL DB from a .sql file, copy the file into the data/sql directory and, in a terminal opened in the project folder, execute:
sudo apt install sqlite3

Now create a database called sqldb:

sqlite3 data/sqldb.db
.read data/sql/<name of your sql database>.sql

Ex:

.read data/sql/Chinook_Sqlite.sql

These commands create a SQL database named sqldb.db in the data directory. Verify that the database was created:

SELECT * FROM <any Table name in your sql database> LIMIT 10;

Ex:

SELECT * FROM Artist LIMIT 10;
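
The same verification can be scripted instead of typed into the sqlite3 shell. This is a minimal sketch using only the standard-library `sqlite3` module; the helper names `list_tables` and `preview` are hypothetical and not part of the repository.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list:
    """Return the names of all user tables in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Skip SQLite's internal bookkeeping tables such as sqlite_sequence.
    return [name for (name,) in rows if not name.startswith("sqlite_")]


def preview(conn: sqlite3.Connection, table: str, limit: int = 10) -> list:
    """Return up to `limit` rows from `table`, like SELECT * ... LIMIT 10."""
    if table not in list_tables(conn):
        raise ValueError(f"no such table: {table}")
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    return conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (limit,)).fetchall()
```

For example, after `conn = sqlite3.connect("data/sqldb.db")`, calling `list_tables(conn)` should list the tables loaded from the .sql file, and `preview(conn, "Artist")` mirrors the query above for the Chinook example.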
  2. To prepare a SQL DB from your CSV and XLSX files, copy your files into data/csv_xlsx and, in a terminal opened in the project folder, execute:
python src/prepare_csv_xlsx_db.py

This command will create a SQL database named csv_xlsx_sqldb.db in the data directory.
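
To illustrate the general idea of turning tabular files into SQL tables, here is a standard-library sketch that loads every CSV file in a folder into one SQLite database, one table per file. It is not the repository's `prepare_csv_xlsx_db.py`, whose implementation is not shown here; the function names are hypothetical, all values are stored as text, and XLSX files are left out because reading them requires a third-party package.

```python
import csv
import sqlite3
from pathlib import Path


def quote_ident(name: str) -> str:
    """Quote a table or column name so spaces and symbols are safe in SQL."""
    return '"' + name.replace('"', '""') + '"'


def load_csv(conn: sqlite3.Connection, csv_path) -> str:
    """Create a table named after the file and insert all of its rows."""
    path = Path(csv_path)
    table = quote_ident(path.stem)
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # the first row supplies the column names
        columns = ", ".join(quote_ident(col) for col in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(f"DROP TABLE IF EXISTS {table}")
        conn.execute(f"CREATE TABLE {table} ({columns})")
        # Blank lines come back as empty lists, so skip them.
        conn.executemany(
            f"INSERT INTO {table} VALUES ({placeholders})",
            (row for row in reader if row),
        )
    conn.commit()
    return path.stem


def build_database(source_dir, db_path) -> list:
    """Load every *.csv file in source_dir; return the table names created."""
    conn = sqlite3.connect(db_path)
    try:
        return [load_csv(conn, p) for p in sorted(Path(source_dir).glob("*.csv"))]
    finally:
        conn.close()
```

Under these assumptions, `build_database("data/csv_xlsx", "data/csv_xlsx_sqldb.db")` would produce a database with the same name as the one described above, though its tables and column types may differ from what the project's script creates.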

  3. To upload your datasets and chat with them during the interaction with the user interface:
  • Change the chat functionality to Process files.
  • Upload your files and wait for the message indicating that the database is ready.
  • Switch the chat functionality back to Chat.
  • Change the RAG with dropdown to Uploaded files.
  • Start chatting.

Project Schema

[Image: Schema]

Chatbot User Interface

[Image: ChatBot UI]

Databases:

  • Diabetes dataset: Link
  • Cancer dataset: Link
  • Chinook database: Link

Key frameworks/libraries used in this chatbot: