Chatbot-Vestibular-Unicamp

Chatbot for the Unicamp 2024 Admissions Process utilizing the OpenAI API
Architecture

Application architecture diagram (see the docs folder).

Document Processing:

  • PDF Loading: The app reads the file vestibular-data.pdf and extracts its text.

  • Text Chunking: The extracted text is divided into smaller chunks that can be processed effectively.

  • Large Language Model: The application uses an LLM to generate vector representations (embeddings) of the text chunks.

  • Embeddings Storage: The generated embeddings are stored in a Vector Database (FAISS). A sketch of this indexing pipeline follows this list.
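
A minimal sketch of the indexing pipeline, assuming a LangChain-style stack (pypdf for extraction, RecursiveCharacterTextSplitter for chunking, OpenAIEmbeddings and FAISS for storage); the file path, chunk sizes, and parameters here are illustrative, not necessarily the repository's exact code.

# Indexing sketch: PDF text -> chunks -> embeddings -> FAISS (illustrative only)
from pypdf import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# 1. PDF loading: extract the raw text from every page.
reader = PdfReader("dataset/vestibular-data.pdf")  # assumed location of the PDF
raw_text = "\n".join(page.extract_text() or "" for page in reader.pages)

# 2. Text chunking: split the text into overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(raw_text)

# 3. Embeddings + storage: embed each chunk and persist the FAISS index.
vectordb = FAISS.from_texts(chunks, embedding=OpenAIEmbeddings())
vectordb.save_local("vectordb")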

Query Processing:

  • Similarity Search: When a user asks a question, that question is transformed into an embedding by the same LLM. This embedding is compared with those stored in the Vector Database, identifying and ranking chunks with the most similar content.

  • Response Generation: The selected chunks are passed to the LLM, which generates a response based on the content of vestibular-data.pdf.

  • Conversation Chain: The question and the generated response are stored in a conversation chain structure that gives the model memory, so it can answer follow-up questions that refer to earlier ones. A sketch of this query flow follows this list.
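
A minimal sketch of the query flow, assuming LangChain's ConversationalRetrievalChain with a buffer memory; the chain type, model name, and retrieval parameters actually used in app.py may differ.

# Query sketch: question -> similarity search -> LLM answer with chat memory
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Load the persisted FAISS index (newer langchain versions may also require
# allow_dangerous_deserialization=True here).
vectordb = FAISS.load_local("vectordb", OpenAIEmbeddings())

# Similarity search: the retriever embeds the question and returns the
# most similar chunks from the Vector Database.
retriever = vectordb.as_retriever(search_kwargs={"k": 4})

# Conversation chain: the memory keeps previous turns so follow-ups work.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
    retriever=retriever,
    memory=memory,
)

result = chain.invoke({"question": "Quantas fases tem o Vestibular Unicamp 2024?"})
print(result["answer"])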

Directory Description

  • dataset - This folder contains the datasets utilized in the process of creating and testing the model.
  • docs - This folder contains the application architecture image and the notebook that outlines the method for testing and validating the model.
  • vectordb - This folder contains the FAISS Vector Store.
  • app.py - This file contains the Streamlit web application that connects the ChatBot interface to the model (a sketch of this wiring follows this list).
  • data_processing.py - This file consists of functions used to extract and process the content from vestibular-data.pdf.
  • generate_testset.py - This script automates the process of creating the test set.
  • model_testing.py - This script automates the process of testing the model.
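
A rough illustration of how app.py might wire the conversation chain into Streamlit; the actual widget layout, session handling, and the build_conversation_chain helper below are assumptions, not the repository's code.

# Hypothetical Streamlit wiring; build_conversation_chain() is an assumed helper.
import streamlit as st

st.title("Chatbot Vestibular Unicamp 2024")

# Keep the conversation chain alive across Streamlit reruns.
if "chain" not in st.session_state:
    st.session_state.chain = build_conversation_chain()  # hypothetical helper

question = st.chat_input("Pergunte sobre o Vestibular Unicamp 2024")
if question:
    result = st.session_state.chain.invoke({"question": question})
    st.chat_message("user").write(question)
    st.chat_message("assistant").write(result["answer"])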

Dependencies and Installation

To install the Chatbot-Vestibular-Unicamp application, follow these steps:

  1. Clone the repository to your local machine.
git clone https://github.com/vitorpaziam/Chatbot-Vestibular-Unicamp.git
  2. Create a new virtual environment with the required dependencies (using Conda).
conda create --name <your environment> --file requirements.txt
  3. Activate the virtual environment.
conda activate <your environment>
  4. Create a .env file in the project directory with your OpenAI API key (a loading sketch follows these steps).
OPENAI_API_KEY=<your API key>
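
The key in .env is typically loaded at startup; below is a small sketch assuming python-dotenv is used, though the repository's own loading code may differ.

# Load the OpenAI key from .env (assumes python-dotenv is installed).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project directory
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; check your .env file.")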

How to run the ChatBot

To start the Chatbot locally, run the app.py file using the Streamlit CLI. The application will be launched in your default web browser, displaying the user interface.

streamlit run app.py

Model Validation

The process of creating the test set, testing the model, the accuracy analysis, and other discussions are presented in detail in the model-validation notebook (in the docs folder).
