The Business QA Chatbot project provides a conversational interface for business-related queries using a Streamlit application. It answers questions from data extracted from PDF documents, which is split into chunks, embedded, and stored in a ChromaDB vector database.
streamlit_app/
: Contains the main Python files and Docker configuration for running the Streamlit app.
Dockerfile
: Docker configuration for building and running the application.
main.py
: The main Streamlit application script.
requirements.txt
: Lists all the required Python packages for the application.
Business_QA_Chatbot.pynb
: Jupyter Notebook for interactive development and testing, recommended to use in Google Colab.
streamlit_run/data/
: Directory containing PDF files used to train the model.
vectorize_documentation.py
: Script to convert PDF data into a vector database using ChromaDB.
-
Clone the Repository
git clone https://github.com/your-username/business-qa-chatbot.git
cd business-qa-chatbot
-
Create a Virtual Environment (Optional but Recommended)
python -m venv venv
source venv/bin/activate # On Windows use venv\Scripts\activate
-
Install Dependencies
pip install -r requirements.txt
-
Rename Configuration Files
Rename sample_config.json to config.json and replace the placeholder "YOUR_API_KEYS" with your actual API key.
-
Run the Streamlit App
Navigate to the streamlit_app/ directory and start the Streamlit application:
`streamlit run main.py`
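A forgotten placeholder in config.json is an easy mistake to make in the configuration step. The following is a minimal sketch of a check that could run before the app starts. It assumes config.json holds a flat JSON object; the function name `load_config` and the key names are hypothetical, since the layout of sample_config.json is not described here.

```python
import json
from pathlib import Path

# Placeholder string named in the configuration step.
PLACEHOLDER = "YOUR_API_KEYS"


def load_config(path="config.json"):
    """Load the JSON config and fail early if the placeholder was not replaced.

    Assumption: the file contains a flat JSON object (key -> value).
    """
    config_path = Path(path)
    if not config_path.is_file():
        raise FileNotFoundError(
            f"{config_path} not found; rename sample_config.json to config.json first"
        )

    config = json.loads(config_path.read_text(encoding="utf-8"))

    # The key names are not documented, so every top-level value is inspected.
    stale_keys = [key for key, value in config.items() if value == PLACEHOLDER]
    if stale_keys:
        raise ValueError(f"Replace the placeholder value for: {', '.join(stale_keys)}")
    return config
```

Calling `load_config()` from the directory that contains config.json returns the parsed settings, or raises with a message that names the keys still set to the placeholder.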
-
Build the Docker Image
Navigate to the streamlit_app/ folder where the Dockerfile is located. Build the Docker image using:
docker build -t business-qa-chatbot .
-
Run the Docker Container
After building the image, run the Docker container with:
docker run -p 8080:8080 business-qa-chatbot
-
Access the Application
Open your web browser and navigate to http://localhost:8080 to access the Streamlit app.
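If the page does not load, it can help to confirm from a script that the container is answering on the published port. The sketch below sends a plain HTTP GET to the URL given above using only the standard library. The function name `is_app_up` and the timeout value are illustrative choices, not part of the project.

```python
import urllib.request


def is_app_up(url="http://localhost:8080", timeout=3.0):
    """Return True if an HTTP GET to `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Refused connections, timeouts, and HTTP error statuses
        # are all raised as OSError subclasses by urllib.
        return False
```

A result of `False` right after `docker run` may only mean the app is still starting, so retrying after a few seconds is reasonable.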
To get the latest Docker image from Docker Hub, use:
docker pull gauravwankhede/business-qa-chatbot:latest
- Navigate to the Directory Containing the Dockerfile
cd path/to/streamlit_app
- Build the Docker Image
docker build -t business-qa-chatbot .
- Run the Docker Container
docker run -p 8080:8080 business-qa-chatbot
All PDF files used for training the model are stored in the streamlit_run/data/ folder. Ensure these files are correctly placed for the application to access.
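One way to confirm the files are in place is a small check that lists the PDFs in the data folder. This is a sketch only: the helper name `find_pdfs` is hypothetical, and the default path assumes the check is run from the directory that contains streamlit_run/.

```python
from pathlib import Path


def find_pdfs(data_dir="streamlit_run/data"):
    """Return the PDF paths found directly inside data_dir, sorted by path.

    Raises FileNotFoundError if the folder is missing or holds no PDFs.
    """
    folder = Path(data_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"Data folder not found: {folder}")

    # Match the extension case-insensitively so files named *.PDF are included.
    pdfs = sorted(
        p for p in folder.iterdir() if p.is_file() and p.suffix.lower() == ".pdf"
    )
    if not pdfs:
        raise FileNotFoundError(f"No PDF files found in {folder}")
    return pdfs
```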
The vectorize_documentation.py file converts PDF data into a vector database using:
- UnstructuredPDFLoader from langchain_community for loading PDF documents.
- RecursiveCharacterTextSplitter from langchain_text_splitters for splitting text.
- HuggingFaceEmbeddings from langchain_huggingface for generating embeddings.
- Chroma from langchain_chroma for managing the vector database.
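The four components listed above typically combine into a load, split, embed, and store pipeline. The sketch below shows one plausible arrangement and is not the actual contents of vectorize_documentation.py. The function name `build_vector_db`, the `vector_db` output directory, and the chunk sizes are assumptions, and the embedding model is left at the library default because the script's choice is not stated.

```python
from pathlib import Path


def build_vector_db(pdf_paths, persist_dir="vector_db", chunk_size=1000, chunk_overlap=200):
    """Load PDFs, split them into chunks, embed the chunks, and store them in Chroma.

    persist_dir, chunk_size and chunk_overlap are illustrative defaults.
    """
    pdf_paths = [Path(p) for p in pdf_paths]
    if not pdf_paths:
        raise ValueError("No PDF files were given")

    # Imported here so the input check above runs before the heavy dependencies load.
    from langchain_chroma import Chroma
    from langchain_community.document_loaders import UnstructuredPDFLoader
    from langchain_huggingface import HuggingFaceEmbeddings
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    # 1. Load every PDF into LangChain Document objects.
    documents = []
    for path in pdf_paths:
        documents.extend(UnstructuredPDFLoader(str(path)).load())

    # 2. Split the documents into overlapping chunks.
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size, chunk_overlap=chunk_overlap
    )
    chunks = splitter.split_documents(documents)

    # 3. Embed the chunks (library default model; an assumption) and
    # 4. write them to a Chroma collection persisted on disk.
    embeddings = HuggingFaceEmbeddings()
    return Chroma.from_documents(
        documents=chunks, embedding=embeddings, persist_directory=persist_dir
    )
```

A call such as `build_vector_db(Path("streamlit_run/data").glob("*.pdf"))` would index every PDF in the data folder, assuming the packages from requirements.txt are installed.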
For interactive development and testing, use the Business_QA_Chatbot.pynb Jupyter Notebook. It is recommended to run this notebook in Google Colab.
For any questions or feedback, please contact pgywww@gmail.com.