The deloitte-insightbot is a question-answering system that provides insights drawn from Deloitte's weekly
economic updates. These updates give a brief overview of the global political and economic situation and summarize key
impacts and trends.
- Data Ingestion: Fetches content from Deloitte's weekly economic update URL.
- Embeddings Storage: Stores embeddings of the content in a VectorDB.
- Retrieval-Augmented Generation: Retrieves relevant passages to generate answers for user queries using an LLM.
- Data Ingestion: A module to scrape and parse content from the specified URL.
  - Uses the `UnstructuredURLLoader` class to fetch and parse the content from the URL.
- Embeddings Model: Utilizes an embedding model to convert content into vector representations.
  - Uses the `OpenAIEmbeddings` model with the model name `text-embedding-3-large`.
- VectorDB: Stores the embeddings for efficient retrieval.
  - The `Chroma` class from `langchain_chroma` is used to interact with ChromaDB.
- LLM: Generates answers based on the retrieved passages.
  - Uses the `ChatOpenAI` class with the model name `gpt-3.5-turbo`.
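
As a rough illustration of how the four components listed above might be wired together, here is a minimal sketch. The class names and model names come from this README. The collection name, the host and port constants, the `temperature` setting, and the `build_components` helper are assumptions made for the example and may not match what `src/main.py` actually does. The sketch also assumes `OPENAI_API_KEY` is set in the environment and that the `unstructured` and `chromadb` packages are installed.

```python
"""Sketch: constructing the loader, vector store, and LLM named above."""

# Model names are taken from the README.
EMBEDDING_MODEL = "text-embedding-3-large"
CHAT_MODEL = "gpt-3.5-turbo"

# Assumptions for this example only.
COLLECTION_NAME = "deloitte_weekly_updates"  # hypothetical collection name
CHROMA_HOST = "localhost"
CHROMA_PORT = 8000


def build_components(url: str):
    """Return (loader, vector_store, llm) for the given update URL."""
    # Third-party imports stay inside the function so that importing this
    # module does not require the packages to be installed.
    import chromadb
    from langchain_chroma import Chroma
    from langchain_community.document_loaders import UnstructuredURLLoader
    from langchain_openai import ChatOpenAI, OpenAIEmbeddings

    # Data ingestion: fetch and parse the page at `url`.
    loader = UnstructuredURLLoader(urls=[url])

    # Embeddings model: turns text into vectors.
    embeddings = OpenAIEmbeddings(model=EMBEDDING_MODEL)

    # VectorDB: talk to the ChromaDB server over HTTP.
    vector_store = Chroma(
        client=chromadb.HttpClient(host=CHROMA_HOST, port=CHROMA_PORT),
        collection_name=COLLECTION_NAME,
        embedding_function=embeddings,
    )

    # LLM: generates the final answer from retrieved passages.
    llm = ChatOpenAI(model=CHAT_MODEL, temperature=0)
    return loader, vector_store, llm
```

With these objects, one plausible flow is `vector_store.add_documents(loader.load())` to index the page, followed by `vector_store.similarity_search(question, k=4)` to fetch passages for the LLM prompt. The value `k=4` is an arbitrary choice for illustration.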
- Ingest Data: Run the data ingestion script to fetch and parse the content.
- Store Embeddings: Use the embeddings model to convert the content into vectors and store them in the VectorDB.
- Query System: Input a user query to retrieve relevant passages and generate an answer using the LLM.
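
To make the three steps above concrete without any external services, the following toy sketch mimics the flow using only the Python standard library. It is not the project's implementation: `embed` is a word-count stand-in for a real embedding model, `InMemoryStore` is a stand-in for the VectorDB, the passages are invented placeholders rather than real update content, and the final prompt is only built, not sent to an LLM.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Toy stand-in for the VectorDB: keeps (vector, passage) pairs in a list."""

    def __init__(self) -> None:
        self._rows: list[tuple[Counter, str]] = []

    def add(self, passages: list[str]) -> None:
        for passage in passages:
            self._rows.append((embed(passage), passage))

    def search(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [passage for _, passage in ranked[:k]]


def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# 1. Ingest: placeholder passages standing in for the fetched page.
passages = [
    "Placeholder passage A: inflation eased in the latest reading.",
    "Placeholder passage B: job growth stayed strong.",
    "Placeholder passage C: oil prices rose after supply cuts.",
]

# 2. Store: embed each passage and keep it for retrieval.
store = InMemoryStore()
store.add(passages)

# 3. Query: retrieve the closest passage and build the LLM prompt.
question = "What happened to oil prices?"
top = store.search(question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In the real system the same three roles are filled by the URL loader, the embeddings model with ChromaDB, and the chat model respectively.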
Install the required packages:

    pip install -r requirements.txt

Start the ChromaDB container:

    docker compose up -d

Ping the ChromaDB container to check if it is running:

    curl localhost:8000/api/v1/heartbeat

Run the application:

    python src/main.py
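
If it is more convenient to check the ChromaDB container from Python than with `curl`, a small helper along these lines could be used. This is a sketch only: the host, port, and heartbeat path are copied from the `curl` command above, while the function names and the two-second timeout are assumptions for the example.

```python
import urllib.request


def heartbeat_url(host: str = "localhost", port: int = 8000) -> str:
    """Build the heartbeat URL used in the curl command above."""
    return f"http://{host}:{port}/api/v1/heartbeat"


def chroma_is_alive(host: str = "localhost", port: int = 8000,
                    timeout: float = 2.0) -> bool:
    """Return True if the heartbeat endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(heartbeat_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error responses,
        # since urllib's URLError and HTTPError both derive from OSError.
        return False


if __name__ == "__main__":
    print("ChromaDB reachable:", chroma_is_alive())
```

Running this before `python src/main.py` gives a quick yes or no on whether the container started by `docker compose up -d` is accepting requests.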