Document Chat Service

This project is a chat service for asking questions about your own documents. It uses a modular architecture to provide context-aware answers through Retrieval-Augmented Generation (RAG), and it can call external tools such as web search and a calculator. It supports several Large Language Model (LLM) backends and wraps calls to external services in retry and circuit-breaker mechanisms.

Development Status

Note: This project is under active development. The API server implementation and Docker containerization are currently works in progress and not yet production-ready.

Features

  • Retrieval-Augmented Generation (RAG): Implemented as a core tool, it ingests documents to provide answers based on their content.
  • Multi-LLM Support: A pluggable architecture allows for using various LLM providers, including OpenAI, Google AI, Hugging Face, Together AI, and a local Llama.cpp client, configured via config.yaml.
  • Tool Integration: Extensible agent-like capabilities with tools for document retrieval, web search, and calculations.
  • Robust Engineering: Includes resilience features such as a circuit breaker, error recovery, and retry mechanisms for external API calls.
  • Flexible Configuration: Easily configure the application, including the LLM client, model parameters, and tools, via a central config.yaml file managed with Pydantic.
  • Containerization: (In Progress) Dockerfiles are provided but are currently under development.
  • API Server: (In Progress) A FastAPI application is being developed but is not yet complete.

Architecture and Technologies

The service is built using a modern Python stack, centered around a modular architecture that facilitates Retrieval-Augmented Generation and tool use.

  • API Layer: (In Development) The service will include a FastAPI application (documentchatter/controller.py) to provide a robust and high-performance asynchronous API for handling user chat requests.

  • RAG Pipeline: The RAG capability is implemented through a dedicated retrieval_tool and vector_store manager:

    1. Document Loading & Processing: Documents are loaded, split into chunks, and converted into numerical vector embeddings.
    2. Vector Storage: The embeddings and their corresponding text chunks are stored and managed by vector_store/manager.py, which supports vector databases such as ChromaDB and FAISS.
    3. Retrieval as a Tool: When a user asks a question, the retrieval_tool is invoked. It embeds the user's query and searches the vector store to find the most relevant text chunks from the documents.
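
The three steps above can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the `embed` function is a toy word-count embedding standing in for a real embedding model, and `InMemoryVectorStore` is a hypothetical stand-in for the ChromaDB or FAISS backends, not the project's vector_store/manager.py.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a word-count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical store that keeps (embedding, chunk) pairs in a list."""

    def __init__(self):
        self.entries = []

    def add(self, chunks):
        # Steps 1 and 2: embed each chunk and store it with its text.
        for chunk in chunks:
            self.entries.append((embed(chunk), chunk))

    def search(self, query, k=2):
        # Step 3: embed the query and return the k most similar chunks.
        query_vec = embed(query)
        scored = [(cosine(query_vec, vec), chunk) for vec, chunk in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:k]]


store = InMemoryVectorStore()
store.add([
    "The circuit breaker stops calls to a failing service.",
    "Embeddings map text chunks to numerical vectors.",
    "Invoices are due within thirty days.",
])
top_chunks = store.search("when are invoices due", k=1)
```

In a real deployment the retrieved chunks would be passed to the LLM as context; the brute-force scan here is only practical for small collections, which is one reason a dedicated vector database is used instead.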
  • LLM and Agent Orchestration:

    • An orchestration layer within the controller manages the interaction between the user query, the available tools, and the selected LLM.
    • It uses an agent-like framework to decide whether to answer directly, use the retrieval_tool to get document context, or use other tools like the web_search_tool or calculator_tool.
    • The Multi-LLM support is achieved through an abstraction layer (documentchatter/base/base_llm_client.py) with specific implementations for each provider in documentchatter/llm_clients/.
  • Resilience:

    • The documentchatter/utils/ module contains the resilience components described below.
    • Retries: Network requests to external services are wrapped in a retry mechanism (retry_config.py) to handle transient failures.
    • Circuit Breaker: A circuit breaker pattern (circuit_breaker.py) prevents the application from repeatedly calling a failing service, improving system stability.
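
As a rough illustration of how retries and a circuit breaker can work together, here is a minimal sketch. It is not the code in retry_config.py or circuit_breaker.py; the class and function names, the thresholds, and the exponential backoff policy are all assumptions chosen for the example.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: let one trial call through (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def retry(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func, retrying with exponential backoff on failure."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # retrying is pointless while the breaker rejects calls
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A caller might combine the two as `retry(lambda: breaker.call(fetch))`, where `fetch` is a hypothetical function that calls an external API. Retries absorb brief glitches, while the breaker stops further attempts once the service looks persistently down.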

Project Structure

The project is organized within the documentchatter Python package for modularity and scalability.

document-chat-service/
├── documentchatter/          # Main application source code
│   ├── controller.py         # FastAPI application entry point (in development)
│   ├── bin/main_v2.py        # Application runner script
│   ├── base/                 # Abstract base classes for core components
│   ├── config/               # Configuration loading (pydantic_config.py)
│   ├── llm_clients/          # Implementations for different LLM providers
│   ├── tools/                # Implementations for tools (retrieval, web search, etc.)
│   ├── utils/                # Resilience and utility modules (circuit breaker, retry)
│   ├── vector_store/         # Vector database management for RAG
│   ├── prompt_templates/     # Prompt templates for the LLM
│   ├── Dockerfile            # Docker configuration (in development)
│   └── requirements.txt      # Python dependencies
├── documents/                # Directory for storing your documents (needs to be created)
└── README.md                 # This file

Prerequisites

  • Python 3.11+
  • Docker (optional, for containerized deployment - not yet fully implemented)
  • API keys for the LLM providers and tools you plan to use.

Environment Setup

  1. Clone the repository:

    git clone https://github.com/singultek/document-chat-service.git
    cd document-chat-service
  2. Install the required Python packages:

    pip install -r documentchatter/requirements.txt
  3. Create a documents directory in the project root and place your files (e.g., PDF, TXT) inside it.

    mkdir documents

Configuration

The application is configured through the config.yaml file in the project root. Here you can set:

  • The LLM provider (openai, google, huggingface, together, local)
  • Model parameters (e.g., temperature, max_tokens)
  • Tool settings (e.g., enable/disable web search)
  • Vector store and embedding model configurations
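
To make the link between configuration and the pluggable LLM clients more concrete, the sketch below validates a few settings and picks a client class from the provider name. It is an illustration under several assumptions: the project uses Pydantic, whereas this stand-in uses only standard-library dataclasses; the field names, default values, and allowed temperature range are hypothetical; and `BaseLLMClient`, `EchoClient`, and `build_client` are illustrative names, not the contents of documentchatter/base/base_llm_client.py.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

# Provider names as listed in this README.
SUPPORTED_PROVIDERS = {"openai", "google", "huggingface", "together", "local"}


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical settings; field names are not the project's real schema."""

    provider: str
    temperature: float = 0.7
    max_tokens: int = 512

    def __post_init__(self):
        if self.provider not in SUPPORTED_PROVIDERS:
            raise ValueError(f"unsupported provider: {self.provider!r}")
        if not 0.0 <= self.temperature <= 2.0:  # assumed range
            raise ValueError("temperature must be between 0.0 and 2.0")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")


class BaseLLMClient(ABC):
    """Common interface that every provider-specific client implements."""

    def __init__(self, settings):
        self.settings = settings

    @abstractmethod
    def generate(self, prompt):
        """Return the model's completion for the prompt."""


class EchoClient(BaseLLMClient):
    """Placeholder client that makes no network call."""

    def generate(self, prompt):
        return f"[{self.settings.provider}] {prompt}"


# Only one provider is wired up in this sketch.
CLIENT_REGISTRY = {"local": EchoClient}


def build_client(raw_config):
    """Validate a config mapping and instantiate the matching client."""
    settings = LLMSettings(**raw_config)
    client_cls = CLIENT_REGISTRY.get(settings.provider)
    if client_cls is None:
        raise NotImplementedError(f"no client registered for {settings.provider!r}")
    return client_cls(settings)


client = build_client({"provider": "local", "temperature": 0.2, "max_tokens": 256})
reply = client.generate("hello")
```

With this kind of registry, adding a provider means writing one new subclass and registering it, without touching the calling code.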

Running the Application

Note: The API server functionality is still under development. The current implementation provides basic capabilities but is not yet feature-complete.

Run the application by executing the main script from the project's root directory:

python documentchatter/bin/main_v2.py

The application will process documents in the documents directory and initialize the necessary components based on your configuration.
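
The document processing step can be pictured with the following sketch, which reads plain-text files from a directory and splits each into overlapping character chunks. It is an assumption-laden illustration: the function names, chunk size, and overlap are hypothetical, only .txt files are handled (PDF parsing needs an additional library), and the project's actual loader may work differently.

```python
from pathlib import Path


def load_text_documents(directory):
    """Read every .txt file in the directory into a {filename: text} dict."""
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(Path(directory).glob("*.txt"))
    }


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into chunks of chunk_size characters that share an overlap."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks
```

Overlapping chunks reduce the chance that a sentence relevant to a query is cut in half at a chunk boundary, at the cost of storing some text twice.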

To stop the application, press CTRL+C in your terminal.
