⚠️ X-Talk is in active prototyping. Interfaces and functions are subject to change, though we will try to keep them as stable as possible.
X-Talk is an open-source full-duplex cascaded spoken dialogue system framework featuring:
- ⚡ Low-Latency, Interruptible, Human-Like Speech Interaction
  - The speech flow is optimized for low latency
  - Users can naturally interrupt the system mid-response
  - Paralinguistic information (e.g., environmental noise, emotion) is encoded in parallel to support in-depth understanding and empathy
- 🧪 Researcher Friendly
  - New models and relevant logic can be added within one Python script and seamlessly integrated with the default pipeline.
- 🧩 Super Lightweight
  - The framework backend is pure Python; nothing to build or install beyond pip install.
- 🏠 Production Ready
  - Concurrency is ensured through an asynchronous backend
  - The WebSocket-based implementation enables deployment from web browsers to edge devices.
- Demo
- Installation
- Quickstart
- Tutorial
- Design Philosophy
- Supported Models
- Contributing
- Acknowledgements
- License
This demo runs on a 4090 cluster with 8-bit quantized SenseVoice as the speech recognizer, IndexTTS 1.5 as the speech generator, and 4-bit quantized Qwen3-30B-A3B as the language model. It demonstrates low latency, though the relatively small language model trades away some intelligence.
|  |  |
| --- | --- |
| tour-guide-en.mp4 | tour-guide-zh.mp4 |
| twenty-questions-en.mp4 | word-chain-game-zh.mp4 |
| web-search-en.mp4 | web-search-zh.mp4 |
| noisy-scene-en.mp4 | noisy-scene-zh.mp4 |
| multi-speaker-en.mp4 | multi-speaker-zh.mp4 |
The tour-guide demos use Qwen3-Next-80B-A3B-Instruct as the language model, while the other eight demos follow the online demo setting. Larger language models are more intelligent but add latency.
pip install git+https://github.com/xcc-zach/xtalk.git@main
We will use APIs from AliCloud to demonstrate the basic capability of X-Talk.
First, install dependencies for AliCloud and server script:
pip install "xtalk[ali] @ git+https://github.com/xcc-zach/xtalk.git@main"
pip install jinja2 'uvicorn[standard]'
Then, obtain an API key from the AliCloud Bailian Platform. We will be using AliCloud's free-tier service.
The online service may be unstable and have high latency. We recommend locally deployed models for a better user experience. See the server config tutorial and supported models for details.
After that, create a JSON config specifying the models to use, and fill in <API_KEY> with the key you obtained:
{
"asr": {
"type": "Qwen3ASRFlashRealtime",
"params": {
"api_key": "<API_KEY>"
}
},
"llm_agent": {
"type": "DefaultAgent",
"params": {
"model": {
"api_key": "<API_KEY>",
"model": "qwen-plus-2025-12-01",
"base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1"
}
}
},
"tts": {
"type": "CosyVoice",
"params": {
"api_key": "<API_KEY>"
}
}
}
If you find Qwen3ASRFlashRealtime not working properly, you can instead use
"asr": "SenseVoiceSmallLocal",
which is a ~1GB local model. You can also try the local speech generation model IndexTTS (setup tutorial):
"tts": { "type": "IndexTTS", "params": { "port": 6006 } },
If you want all models deployed locally, see here.
The next step is to compose the startup script. Since the demo also needs the frontend webpage and scripts, a ready-made startup script is provided at examples/sample_app/configurable_server.py. We simply start the server with the config file (fill in <PATH_TO_CONFIG>.json with the path to the config we just created) and a custom port:
git clone https://github.com/xcc-zach/xtalk.git
cd xtalk
python examples/sample_app/configurable_server.py --port 7635 --config <PATH_TO_CONFIG>.json
Finally, our demo is ready at http://localhost:7635. View it in the browser!
Note
See examples/sample_app/configurable_server.py, frontend/src and examples/sample_app/templates for details.
X-Talk keeps most models and execution on the server side; the client is responsible for interacting with the microphone, transmitting audio and WebSocket messages, and handling lightweight operations like Voice Activity Detection (VAD).
On the client side, you can start with the snippet in examples/sample_app/templates/index.html and track where convo is used to see how the frontend API works:
<script type="module">
import { createConversation } from "/static/js/index.js";
const convo = createConversation();
...
</script>
The client-side API mainly comes from frontend/src/js/index.js; if interested, you can check the core code to see how different WebSocket messages are handled:
switch (json.action) {
case 'queue_status': {...}
case 'queue_granted': {...}
...
}
We plan to improve the client-side API in the near future.
On the server side, the core logic is to connect an X-Talk instance to a WebSocket endpoint of a FastAPI instance:
from fastapi import FastAPI, WebSocket
from xtalk import Xtalk
app = FastAPI(title="Xtalk Server")
xtalk_instance = Xtalk.from_config("path/to/config.json")
@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    await xtalk_instance.connect(websocket)
Then you can check examples/sample_app/configurable_server.py for how to mount client-side scripts and pages.
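If you are not using the provided startup script, one minimal way to serve the app above is with uvicorn (installed earlier in the quickstart); this is only a sketch, and the bundled examples/sample_app/configurable_server.py additionally mounts static files and templates:

```python
# Minimal sketch: serve the FastAPI app defined above with uvicorn.
# The port is arbitrary; configurable_server.py also mounts the demo frontend.
import uvicorn

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=7635)
```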
Note
See examples/sample_app/configurable_server.py and frontend/src/js/index.js for details.
X-Talk can understand uploaded documents through embedding search. To enable embeddings, you need langchain_openai.OpenAIEmbeddings in the config:
"embeddings": {
"type": "OpenAIEmbeddings",
"params": {
"api_key": "<API_KEY>",
"base_url": "<URL LIKE http://127.0.0.1:8002/v1>",
"model": "<MODEL LIKE Qwen/Qwen3-Embedding-0.6B>"
}
},
Then you can fetch the text and session_id from the client side and notify the X-Talk instance through embed_text:
from fastapi import File, Form, HTTPException, UploadFile

@app.post("/api/upload")
async def upload_file(
session_id: str = Form(...),
file: UploadFile = File(...),
):
# Check file type
content_type = (file.content_type or "").lower()
filename = (file.filename or "").lower()
is_text = content_type.startswith("text/") if content_type else False
if content_type and not is_text:
raise HTTPException(status_code=400, detail="Only text files are supported.")
# Read file content and embed
text = (await file.read()).decode("utf-8", errors="ignore")
await xtalk_instance.embed_text(session_id=session_id, text=text)
return {"status": "ok"}Note that client side should save session_id and send it in the request. Search 'session_info' and uploadFile in frontend/src/js/index.js for how session_id is saved and used.
Note
See examples/sample_app/mental_consultant_server.py for details.
X-Talk supports textual tool customization through add_agent_tools:
xtalk_instance.add_agent_tools([build_mental_questionnaire_tool])
Here the tool should be a LangChain tool:
from langchain.tools import tool

@tool
def search_database(query: str, limit: int = 10) -> str:
    """Search the customer database for records matching the query.
    Args:
        query: Search terms to look for
        limit: Maximum number of results to return
    """
    return f"Found {limit} results for '{query}'"
To maintain separate states for a tool in each agent, you can also use a tool factory that keeps internal state (see build_mental_questionnaire_tool in examples/sample_app/mental_consultant_server.py), as sketched below.
Note
See source code under src/xtalk/llm_agent/tools for all built-in tools.
Built-in tools include agent-scope ones like web_search and get_time, and pipeline-control ones for the emotion, timbre, and speed of speech. DefaultAgent has these built-in tools registered by default.
Note
To enable the web_search tool, SERPER_API_KEY needs to be set. See SerpAPI.
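For example, assuming the key is read from the environment, you could set it at the top of your startup script (or export it in your shell before launching the server):

```python
# Assumption: the web_search tool reads SERPER_API_KEY from the environment.
import os

os.environ.setdefault("SERPER_API_KEY", "<YOUR_SERPER_API_KEY>")
```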
As mentioned before, an X-Talk instance can be created from a JSON config, which customizes the models used and controls concurrency behavior.
For model config, each entry should match the model's Python class name and init args. For example, the definition of DefaultAgent lives in src/xtalk/llm_agent/default.py:
class DefaultAgent(Agent):
def __init__(
self,
model: BaseChatModel | dict,
system_prompt: str = _BASE_PROMPT,
voice_names: Optional[List[str]] = None,
emotions: Optional[List[str]] = None,
tools: Optional[List[Union[BaseTool, Callable[[], BaseTool]]]] = None,
):
...
To match the init args, the config item should look like:
"llm_agent": {
"type": "DefaultAgent",
"params": {
"model": {
"api_key": "none",
"base_url": "http://127.0.0.1:8000/v1",
"model": "cpatonn/Qwen3-30B-A3B-Instruct-2507-AWQ-4bit"
},
"voice_names": [
"Man",
"Woman",
"Child"
],
"emotions": [
"happy",
"angry",
"sad",
"fear",
"disgust",
"depressed",
"surprised",
"calm",
"normal"
]
}
},
Optional keys like voice_names, emotions, and tools (not yet supported in config) can be omitted.
See below for the full list of model types (slots), their optional dependencies, and where their adapters live in the source code.
Note
Most model implementations are client-side adaptors. You may need to start the model instance following the corresponding instructions.
Also, you can restrict concurrency through:
"max_connections": 1Below is an example config file for X-Talk when you want to have all models hosted locally. SherpaOnnxASR is used for speech recognition, and you can see here to set up the server. For LLM agent and embeddings, any model adhering to OpenAI protocol is fine. You should provide api_key, base_url and model. IndexTTS is used for speech generation, and see here for server setup. Reference voices can be downloaded here. The captioner is hard to set up, but you can refer to the tutorial here. Finally, remember to look into each model type in Supported Models for how to install the optional dependencies of X-Talk for that model.
{
"asr": {
"type": "SherpaOnnxASR",
"params": {
"port": 6006,
"mode": "offline"
}
},
"llm_agent": {
"type": "DefaultAgent",
"params": {
"model": {
"api_key": "none",
"base_url": "http://127.0.0.1:8000/v1",
"model": "cpatonn/Qwen3-30B-A3B-Instruct-2507-AWQ-4bit"
},
"voice_names": [
"Man",
"Woman",
"Child"
],
"emotions": [
"happy",
"angry",
"sad",
"fear",
"disgust",
"depressed",
"surprised",
"calm",
"normal"
]
}
},
"embeddings": {
"type": "OpenAIEmbeddings",
"params": {
"api_key": "none",
"base_url": "http://127.0.0.1:8002/v1",
"model": "Qwen/Qwen3-Embedding-0.6B"
}
},
"tts": {
"type": "IndexTTS",
"params": {
"port": 11996,
"voices": [
{
"name": "Man",
"path": "ReferenceVoice/Man"
},
{
"name": "Woman",
"path": "ReferenceVoice/Woman"
},
{
"name": "Child",
"path": "ReferenceVoice/Child"
}
]
}
},
"speaker_encoder": "PyannoteSpeakerEncoder",
"captioner": {
"type": "Qwen3OmniCaptioner",
"params": {
"base_url": "http://localhost:8901/v1",
"api_key": "none"
}
},
"caption_rewriter": {
"type": "DefaultCaptionRewriter",
"params": {
"model": {
"api_key": "none",
"model": "cpatonn/Qwen3-30B-A3B-Instruct-2507-AWQ-4bit",
"base_url": "http://127.0.0.1:8000/v1"
}
}
},
"thought_rewriter": {
"type": "DefaultThoughtRewriter",
"params": {
"model": {
"api_key": "none",
"model": "cpatonn/Qwen3-30B-A3B-Instruct-2507-AWQ-4bit",
"base_url": "http://127.0.0.1:8000/v1"
}
}
},
"speech_speed_controller": "RubberbandSpeedController"
}
Note
See examples/sample_app/custom_model.py and examples/sample_app/echo_agent.py for details.
Note
See Recipe for adding a model of existing types.
You may want to introduce a new model of an existing type (e.g., text-to-speech) or add a model of a new type (e.g., a model that handles backchannels). This can be achieved with register_model_search_spec before an xtalk_instance is created from config:
from xtalk import Xtalk
Xtalk.register_model_search_spec(
slot="llm_agent",
spec=Path(__file__).parent / "echo_agent.py",
)
xtalk_instance = Xtalk.from_config(args.config)
Here slot matches the name of the corresponding init arg in Pipeline. You can check Xtalk.MODEL_REGISTRY for existing slots, or use a new slot to represent a new type of model (see examples/sample_app/custom_service.py, where llm_output_refactor_model serves as the new slot).
spec is the path to the model implementation; an example implementation in echo_agent.py looks like this:
from xtalk.model_types import Agent
class EchoAgent(Agent):
"""A simple agent that echoes user input."""
def generate(self, input) -> str:
if isinstance(input, dict):
return input["content"]
return input
def clone(self) -> "EchoAgent":
        return EchoAgent()
Then you can use the custom model in the config file:
{
"asr": {
"type": "Qwen3ASRFlashRealtime",
"params": {
"api_key": "<API_KEY>"
}
},
"llm_agent": "EchoAgent",
"tts": {
"type": "CosyVoice",
"params": {
"api_key": "<API_KEY>"
}
}
}
Recipes for the major model customizations are listed below. You can read the source code for the interfaces of other model types. We will update these interfaces from time to time.
Note
See src/xtalk/model_types.py for all available model types.
Important
X-Talk provides default asynchronous implementations for the sync versions, usually via run_in_executor (e.g., async_recognize wraps recognize for ASR). However, to achieve the best concurrency in production, we recommend implementing the async versions yourself.
Your ASR class must inherit from xtalk.speech.interfaces.ASR and implement the following methods:
- recognize(audio: bytes) -> str: Recognize audio in a single pass.
- reset() -> None: Reset internal recognition state.
- clone() -> ASR: Return a new instance for use in new or concurrent sessions. Sharing weights/connections (e.g., _shared_model) is allowed, but you can't share state.

The methods below are optional:
- recognize_stream(audio: bytes, *, is_final: bool = False) -> str: Interface for streaming incremental recognition. Returns the current cumulative recognition result up to this point.
- async_recognize(audio: bytes)
- async def async_recognize_stream(self, audio: bytes, *, is_final: bool = False)
Important
Input for recognize and recognize_stream is raw PCM bytes (16-bit, mono, 16 kHz). You may need to do the conversion yourself.
Note
X-Talk has a default implementation of recognize_stream via a MockStreamRecognizer, so non-streaming ASR models work out of the box.
Note
You can refer to existing implementations (e.g., src/xtalk/speech/asr/zipformer_local.py) when building your own ASR class. We recommend deploying ASR as a separate service and invoking it via API calls within the ASR class, referencing the implementation of src/xtalk/speech/asr/sherpa_onnx_asr.py.
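Putting these pieces together, below is a minimal sketch of an ASR adapter that follows this recipe. The HTTP endpoint, its JSON schema, and the use of httpx are assumptions for illustration; only the method names and the PCM 16-bit/16 kHz input format come from the interface described above.

```python
import httpx
import numpy as np

from xtalk.speech.interfaces import ASR


class MyRemoteASR(ASR):
    """Sketch of an ASR adapter that calls a separately deployed recognition service."""

    def __init__(self, base_url: str = "http://127.0.0.1:6006"):
        self._base_url = base_url  # hypothetical endpoint of your ASR service

    def recognize(self, audio: bytes) -> str:
        # Input is raw PCM 16-bit mono 16 kHz; convert to float32 in [-1, 1]
        # if your backend expects normalized samples.
        samples = np.frombuffer(audio, dtype=np.int16).astype(np.float32) / 32768.0
        resp = httpx.post(f"{self._base_url}/recognize", json={"samples": samples.tolist()})
        return resp.json().get("text", "")

    async def async_recognize(self, audio: bytes) -> str:
        # Optional native-async version for better production concurrency.
        samples = np.frombuffer(audio, dtype=np.int16).astype(np.float32) / 32768.0
        async with httpx.AsyncClient() as client:
            resp = await client.post(
                f"{self._base_url}/recognize", json={"samples": samples.tolist()}
            )
        return resp.json().get("text", "")

    def reset(self) -> None:
        pass  # this sketch keeps no per-utterance state

    def clone(self) -> "MyRemoteASR":
        # New instance per session; the remote service itself is shared.
        return MyRemoteASR(self._base_url)
```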
Your new TTS class must inherit from xtalk.speech.interfaces.TTS and implement the following methods:
- synthesize(self, text: str) -> bytes
  - Input: the text to synthesize.
  - Output: raw audio bytes in PCM 16-bit, mono, 48000 Hz.
- clone(self) -> TTS
  - Return a new TTS instance. It should have isolated runtime state to avoid cross-session interference, and it may share read-only resources if your backend supports that.

Optional methods:

- synthesize_stream(self, text: str, **kwargs) -> Iterable[bytes]
  - If your backend supports streaming synthesis, you can override this method.
- set_voice(self, voice_names: list[str])
  - Works with the TTSVoiceChange event in TTSManager to switch voices via language model tool calls.
  - Usually there is only one element in voice_names, which matches the current tool call result. However, some TTS models may support mixing multiple reference voices, so voice_names is a list.
- set_emotion(self, emotion: str | list[float])
  - Works with the TTSEmotionChange event in TTSManager to switch emotions via language model tool calls.
  - The current tool call result only carries emotion as a str; however, you may also want to accept list[float] as an emotion vector for future use.
- async def async_synthesize(self, text: str, **kwargs: Any)
- async def async_synthesize_stream(self, text: str, **kwargs: Any)
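Analogously, here is a minimal sketch of a TTS adapter. The MyLocalTTS class and its silent placeholder audio are made up; only the method names and the PCM 16-bit/mono/48 kHz output contract come from the interface described above.

```python
import numpy as np

from xtalk.speech.interfaces import TTS


class MyLocalTTS(TTS):
    """Sketch of a TTS adapter; replace the body of synthesize with a real backend call."""

    def __init__(self, voice: str = "default"):
        self._voice = voice

    def synthesize(self, text: str) -> bytes:
        # Placeholder output: 0.5 s of silence in the required format
        # (PCM 16-bit, mono, 48000 Hz). A real adapter would call your backend
        # and, if needed, resample/convert its output to this format.
        samples = np.zeros(24000, dtype=np.int16)
        return samples.tobytes()

    def set_voice(self, voice_names: list[str]) -> None:
        # Usually a single name; some backends may mix several reference voices.
        if voice_names:
            self._voice = voice_names[0]

    def clone(self) -> "MyLocalTTS":
        # Fresh per-session instance with isolated runtime state.
        return MyLocalTTS(self._voice)
```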
Note
See examples/sample_app/custom_service.py for details. A dummy LLMOutputRefactorModel is added to X-Talk to prepend "Assistant response: " to the model's response text.
If you want to add new functionality, you can follow the procedures below:
First, you may want to define a new model. Here is a model that prepends some text to its input:
# Define a custom model
class LLMOutputRefactorModel:
def refactor(self, llm_output: str) -> str:
# Custom logic to refactor LLM output
return "Assistant response: " + llm_output
# If custom model has internal state, implement clone method with concrete state
def clone(self):
        return LLMOutputRefactorModel()
Note that clone is necessary when your model has internal state that should be distinct across user sessions, like the recognition cache of a streaming speech recognition model. A sketch of such a stateful variant follows.
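For contrast, here is a hypothetical stateful variant where clone must hand each session a clean cache (the caching behavior itself is made up for illustration):

```python
# Hypothetical stateful variant: clone() returns an instance with an empty cache
# so concurrent sessions never see each other's partial results.
class CachingRefactorModel:
    def __init__(self):
        self._cache: dict[str, str] = {}

    def refactor(self, llm_output: str) -> str:
        if llm_output not in self._cache:
            self._cache[llm_output] = "Assistant response: " + llm_output
        return self._cache[llm_output]

    def clone(self) -> "CachingRefactorModel":
        return CachingRefactorModel()
```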
If you define a new model, or want to add a new function to the Pipeline, the second step is to define a custom Pipeline:
@dataclass(init=False)
class CustomPipeline(DefaultPipeline):
llm_output_refactor_model: Optional["LLMOutputRefactorModel"] = field(
default=None,
metadata={"init_key": "llm_output_refactor_model", "clone": True},
)
def __init__(
self,
asr: ASR,
llm_agent: Agent,
tts: TTS,
captioner: Optional[Captioner] = None,
punt_restorer_model: Optional[PuntRestorer] = None,
caption_rewriter: Optional[Rewriter | BaseChatModel] = None,
thought_rewriter: Optional[Rewriter | BaseChatModel] = None,
vad: Optional[VAD] = None,
speech_enhancer: Optional[SpeechEnhancer] = None,
speaker_encoder: Optional[SpeakerEncoder] = None,
speech_speed_controller: Optional[SpeechSpeedController] = None,
embeddings: Optional[Embeddings] = None,
llm_output_refactor_model: Optional["LLMOutputRefactorModel"] = None,
**kwargs,
):
super().__init__(
asr=asr,
llm_agent=llm_agent,
tts=tts,
captioner=captioner,
punt_restorer_model=punt_restorer_model,
caption_rewriter=caption_rewriter,
thought_rewriter=thought_rewriter,
vad=vad,
speech_enhancer=speech_enhancer,
speaker_encoder=speaker_encoder,
speech_speed_controller=speech_speed_controller,
embeddings=embeddings,
**kwargs,
)
self.llm_output_refactor_model = llm_output_refactor_model
def get_llm_output_refactor_model(
self,
) -> Optional["LLMOutputRefactorModel"]:
        return self.llm_output_refactor_model
Note that **kwargs is necessary in __init__ to swallow shadowed parameters from DefaultPipeline. If you add a new arg to __init__, you will need to register it as a field, specifying its clone behavior (True/False).
Building on X-Talk's event-bus mechanism, you can then add a new Manager that subscribes to an existing Event and implements the custom functionality you need. You can also create a new Event if needed.
For example:
LLMOutputRefactoredFinal = create_event_class(
name="LLMOutputRefactoredFinal", fields={"text": "", "turn_id": 0} # key: default_value
)
class LLMOutputRefactorManager(Manager):
def __init__(
self,
event_bus: EventBus,
session_id: str,
pipeline: Pipeline,
config: dict[str, Any],
):
self.event_bus = event_bus
self.pipeline = pipeline
@Manager.event_handler(LLMAgentResponseFinish)
async def handle_llm_response_finish(self, event: LLMAgentResponseFinish):
refactor_model = self.pipeline.get_llm_output_refactor_model()
if refactor_model:
refactored_output = refactor_model.refactor(event.text)
new_event = LLMOutputRefactoredFinal(
session_id=event.session_id,
text=refactored_output,
turn_id=event.turn_id,
)
await self.event_bus.publish(new_event)
async def shutdown(self):
pass
custom_service = DefaultService(pipeline=pipeline)
custom_service.register_manager(LLMOutputRefactorManager)
Then you can optionally use unsubscribe_event and subscribe_event to switch other components (such as OutputGateway) from subscribing to the old event to the new one. For the new event, you also need to implement the handling method.
custom_service.unsubscribe_event(
event_listener_cls=OutputGateway, event_type=LLMAgentResponseFinish
)
async def output_gateway_llm_output_refactored_final_handler(
self: OutputGateway,
event,
):
await self.send_signal(
{
"action": "finish_resp", # you can find "finish_resp" in frontend/src/js/index.js
"data": {"text": event.text, "turn_id": event.turn_id},
}
)
custom_service.subscribe_event(
event_listener_cls=OutputGateway,
event_type=LLMOutputRefactoredFinal,
method_or_handler=output_gateway_llm_output_refactored_final_handler,
)
Prospective Data Flow of X-Talk
X-Talk follows a modular, stage-wise functional flow, progressing from noisy speech input, through frontend speech interaction, speech understanding, and an LLM-driven conversational agent, to speech generation. This logical pipeline is realized through a layered, event-driven, and loosely-coupled architecture, which forms the core of the system.
This design systematically addresses the key challenges of real-time speech-to-speech dialogue systems:
- Controlling sub-second end-to-end latency
- Orchestrating multiple heterogeneous components
- Enabling flexible integration and swapping of backend models and services
The entire system is built around a centralized event bus. All layers communicate asynchronously through event publishing and subscribing, enabling efficient management of complex conversational state and data flow.
The Frontend Layer serves as the user-facing interface and directly handles browser-based interaction. It is responsible for:
- Rendering the conversational user interface
- Performing client-side Voice Activity Detection (VAD)
- Applying audio denoising and enhancement
- Displaying real-time latency metrics to the user
This layer packages audio streams, VAD markers, and contextual information for transmission to the backend.
The Event Center Layer acts as the system’s communication hub and network boundary, unifying event routing and protocol translation. It consists of two tightly integrated components:
-
Gateways
- The Input Gateway converts frontend streams into typed internal events
- The Output Gateway delivers processed events back to the frontend
-
Event Bus
- Provides the asynchronous messaging fabric
- Routes events between all components in the system
Together, these components decouple all other layers by handling protocol adaptation, event distribution, and lifecycle isolation, forming the extensible backbone of the architecture.
The Managers Layer orchestrates the core conversational workflow through specialized, capability-specific managers. Each manager:
- Subscribes to relevant events
- Executes its dedicated logic (e.g., ASR, LLM inference, TTS)
- Publishes new events to drive the dialogue forward
This event-driven orchestration enables fine-grained control over execution order, concurrency, and latency.
The Agents Layer functions as the system’s task-planning and execution engine. It integrates structured inputs from upstream models—such as ASR outputs, voice captions, and contextual signals—into a coherent speech understanding.
Based on this understanding, the agent orchestrates tool usage, including:
- Web search
- Local retrieval
- Audio control
- External API calls
Finally, it synthesizes retrieved or processed information into a context-aware natural language response.
The Models Layer provides a unified, interface-driven abstraction for core speech-to-speech dialogue capabilities, including:
- Speech understanding
- LLM-based conversational agents
- Speech generation
By defining stable and modular contracts for each capability, this layer allows compliant implementations to be seamlessly integrated, swapped, or scaled without impacting other system components.
Slot: asr
SherpaOnnx is recommended for its wide support of models and optimized inference performance.
SherpaOnnx
Dependency: pip install "xtalk[sherpa-onnx-asr] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/asr/sherpa_onnx_asr.py
A high-performance speech recognition framework and beyond.
Qwen3ASRFlashRealtime
Dependency: pip install "xtalk[ali] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/asr/qwen3_asr_flash_realtime.py
Zipformer
Dependency: pip install "xtalk[zipformer-local] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/asr/zipformer_local.py
ElevenLabs
Dependency: pip install "xtalk[elevenlabs] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/asr/elevenlabs.py
Slot: tts
IndexTTS
Dependency: pip install "xtalk[index-tts] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path:
src/xtalk/speech/tts/index_tts.py
src/xtalk/speech/tts/index_tts2.py
GPT-SoVITS
Dependency: pip install "xtalk[gpt-sovits] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/tts/gpt_sovits.py
CosyVoice
Dependency: pip install "xtalk[ali] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/tts/cosyvoice.py
ElevenLabs
Dependency: pip install "xtalk[elevenlabs] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/tts/elevenlabs.py
Slot: vad
X-Talk has VAD on client side, so you may not need one.
Silero VAD
Dependency: pip install "xtalk[silero-vad] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/vad/silero_vad.py
Slot: speech_enhancer
FastEnhancer
Dependency: pip install onnxruntime
Path: src/xtalk/speech/speech_enhancer/speech_enhancer.py
Slot: speaker_encoder
Wespeaker-Voxceleb-Resnet34-LM
Dependency: pip install "xtalk[pyannote] @ git+https://github.com/xcc-zach/xtalk.git@main"
Path: src/xtalk/speech/speaker_encoder/pyannote_embedding.py
Slot: captioner
Captioners give you a description of an audio clip.
Qwen3-Omni-30B-A3B-Captioner
Dependency: None
Path: src/xtalk/speech/captioner/qwen3_omni_captioner.py
We express sincere gratitude to:
- LangChain as the backbone of LLM agents
- vLLM for deployment of most models
- All model providers mentioned in Supported Models
All of you provide the solid foundation of X-Talk!
This project is licensed under the Apache License 2.0 if you do not install optional dependencies; some optional dependencies may be under incompatible licenses.