ZEXLLM: The All-in-One AI Application You've Been Looking For
Chat with your documents and use AI agents: highly configurable, multi-user, with no complicated setup required.
This is a full-stack application that can transform documents, resources (such as web links, audio, video), or content fragments into context, so that any large language model (LLM) can reference them during chats. The application allows users to select a specific LLM or vector database to use, and it supports multi-user management with customizable permissions.
ZEXLLM is a full-stack application where you can leverage commercial LLMs or popular open-source LLMs in combination with vector database solutions to build a fully private ChatGPT, free from external restrictions. You can choose to run it locally or host it remotely and interact with any document you provide.
ZEXLLM organizes your documents into "workspaces." Workspaces function like threads, but add document containerization: documents can be shared between workspaces, while the content of each workspace stays isolated so context from one never interferes with or contaminates another.
- 👤Supports multi-user instances and permission management.
- 🦾AI agents within workspaces can perform tasks such as browsing websites and running code.
- 🖼️ Customizable chat window for embedding on your website.
- Supports various document formats (e.g., PDF, TXT, DOCX).
- 🆕 Simple user interface for managing documents within the vector database.
- Two conversation modes: Chat mode (retains conversation history) and Query mode (for simple Q&A).
- Automatically provides relevant document content references during chats.
- Fully cloud-deployment ready.
- 💬 Offers the option to deploy your own LLM.
- 📖 Efficiently manages large documents with low resource consumption. A large document is embedded only once, cutting costs by up to 90% compared with other document-chat solutions.
- Provides a full set of developer APIs for custom integrations.
ZEXLLM is an efficient, flexible, and powerful document and conversation management platform, ideal for businesses and developers in a wide range of applications.
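As an illustration of how the developer APIs might be consumed, the sketch below builds a chat request for a workspace. The base URL, port, endpoint path, payload fields, and the `ZEXLLM_API_KEY` variable are all assumptions for illustration, not the documented API surface; consult the API docs generated by your own instance.

```javascript
// Hypothetical sketch: endpoint path, default port, and payload fields are
// assumptions -- check your instance's developer API docs for the real shape.
const BASE_URL = "http://localhost:3001/api/v1";

// Build the URL and fetch() options for a workspace chat request.
function buildChatRequest(workspaceSlug, message, mode = "query") {
  return {
    url: `${BASE_URL}/workspace/${encodeURIComponent(workspaceSlug)}/chat`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.ZEXLLM_API_KEY ?? ""}`,
      },
      body: JSON.stringify({ message, mode }),
    },
  };
}

const { url, options } = buildChatRequest("contracts", "Summarize clause 4.");
// To actually send it: fetch(url, options).then(r => r.json()).then(console.log);
console.log(url);
```

Switching `mode` between `"query"` and `"chat"` would mirror the two conversation modes described above.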
Supported Large Language Models (LLMs):
- Any open-source llama.cpp compatible model
- OpenAI
- OpenAI (Generic)
- Azure OpenAI
- AWS Bedrock
- Anthropic
- NVIDIA NIM (chat models)
- Google Gemini Pro
- Hugging Face (chat models)
- Ollama (chat models)
- LM Studio (all models)
- LocalAI (all models)
- Together AI (chat models)
- Fireworks AI (chat models)
- Perplexity (chat models)
- OpenRouter (chat models)
- DeepSeek (chat models)
- Mistral
- Groq
- Cohere
- KoboldCPP
- LiteLLM
- Text Generation Web UI
- Apipie
- xAI
- Novita AI (chat models)
Embedder models:
- ZEXLLM Native Embedder (default)
- OpenAI
- Azure OpenAI
- LocalAI (all)
- Ollama (all)
- LM Studio (all)
- Cohere
Audio Transcription models:
- ZEXLLM Built-in (default)
- OpenAI
TTS (text-to-speech) support:
- Native Browser Built-in (default)
- PiperTTSLocal - runs in browser
- OpenAI TTS
- ElevenLabs
- Any OpenAI-compatible TTS service.
STT (speech-to-text) support:
- Native Browser Built-in (default)
Vector Databases:
This monorepo consists of six main sections:
- `frontend`: A viteJS + React frontend that you can run to easily create and manage all the content the LLM can use.
- `server`: A NodeJS express server to handle all the interactions and do all the vectorDB management and LLM interactions.
- `collector`: A NodeJS express server that processes and parses documents from the UI.
- `docker`: Docker instructions and build process + information for building from source.
- `embed`: Submodule for generation & creation of the web embed widget.
- `browser-extension`: Submodule for the Chrome browser extension.
- `yarn setup` To fill in the required `.env` files you'll need in each of the application sections (from the root of the repo). Go fill those out before proceeding. Ensure `server/.env.development` is filled out or else things won't work right.
- `yarn dev:server` To boot the server locally (from the root of the repo).
- `yarn dev:frontend` To boot the frontend locally (from the root of the repo).
- `yarn dev:collector` To then run the document collector (from the root of the repo).
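A minimal `server/.env.development` might look like the sketch below. The variable names and values shown are hypothetical placeholders; run `yarn setup` to generate the real templates and see the full list of options.

```
# Hypothetical example values -- the actual variable names may differ
# in the template generated by `yarn setup`.
SERVER_PORT=3001
LLM_PROVIDER='openai'
OPEN_AI_KEY=sk-...
```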
These are apps that are not maintained by ZexAiLabs but are compatible with ZEXLLM. A listing here is not an endorsement.
- Midori AI Subsystem Manager - A streamlined and efficient way to deploy AI systems using Docker container technology.
- Coolify - Deploy ZEXLLM with a single click.
- GPTLocalhost for Microsoft Word - A local Word Add-in for you to use ZEXLLM in Microsoft Word.
We use this information to help us understand how ZEXLLM is used, to help us prioritize work on new features and bug fixes, and to help us improve ZEXLLM's performance and stability.
Set `DISABLE_TELEMETRY` in your server or docker `.env` settings to `"true"` to opt out of telemetry. You can also do this in-app by going to the sidebar > `Privacy` and disabling telemetry.
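For example, in `server/.env` (or the docker `.env`):

```
DISABLE_TELEMETRY="true"
```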
We will only track usage details that help us make product and roadmap decisions, specifically:
- Type of your installation (Docker or Desktop)
- When a document is added or removed. No information about the document. Just that the event occurred. This gives us an idea of use.
- Type of vector database in use. Lets us know which vector database provider is the most used, so we can prioritize changes when updates arrive for that provider.
- Type of LLM in use. Lets us know the most popular choice, so we can prioritize changes when updates arrive for that provider.
- Chat is sent. This is the most regular "event" and gives us an idea of the daily activity of this project across all installations. Again, only the event is sent; we have no information on the nature or content of the chat itself.
You can verify these claims by finding all locations where `Telemetry.sendTelemetry` is called. Additionally, these events are written to the output log, so you can also see the specific data that was sent, if enabled. No IP or other identifying information is collected. The telemetry provider is PostHog, an open-source telemetry collection service.
- Create an issue.
- Create a PR with a branch name in the format `<issue number>-<short name>`.
- Get an LGTM from the core team.
Copyright © 2025 [ZexAi Labs][profile-link].
This project is MIT licensed.