Build real-time voice AI agents powered by LiveKit Agent, Small Language Models (SLMs), and WebRTC.
This project is a quickstart template to run locally or with 3rd party integrations. It showcases how to combine WebRTC, LiveKit’s Agent framework, and open-source tools like Whisper and Llama to prototype low-latency voice assistants for real-time applications.
- 🌐 WebRTC + LiveKit: Real-time media transport with WebRTC powered by LiveKit.
- 🤖 LiveKit Agent: Modular plugin-based framework for voice AI agents.
- 🗣️ STT + TTS Support: Plug in Whisper, Deepgram, ElevenLabs, or others.
- 💬 LLM Integration: Use local LLaMA models or connect to AWS, OpenAI, or Anthropic APIs.
- 🧪 Local Dev: Run everything locally with Docker Compose or Python virtual env.
There are two implementations of the AI agent:
- fast-preresponse.py, which uses third-party services and has complete metrics capture in place.
- fast-preresponse-ollama.py, which uses only open-source software and can run locally without an internet connection.

Update the Dockerfile to use one or the other. More info here.
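The "fast pre-response" idea behind both agents can be sketched without LiveKit at all: speak a cheap filler line the moment the user finishes, while the slower LLM reply is generated in the background. A minimal asyncio sketch; the function names and filler text are illustrative, not the project's actual API:

```python
import asyncio

async def quick_filler(user_text: str) -> str:
    # Cheap, instant acknowledgement spoken while the real answer is prepared.
    return "Let me think about that..."

async def full_llm_reply(user_text: str) -> str:
    # Stand-in for a slow LLM call (local Llama via Ollama, or a hosted API).
    await asyncio.sleep(0.2)  # simulated model latency
    return f"Here is a considered answer to: {user_text!r}"

async def handle_turn(user_text: str) -> list[str]:
    """Speak a filler immediately, then the real reply once it is ready."""
    spoken = []
    llm_task = asyncio.create_task(full_llm_reply(user_text))  # start early
    spoken.append(await quick_filler(user_text))  # played back right away
    spoken.append(await llm_task)                 # follows when generated
    return spoken

if __name__ == "__main__":
    print(asyncio.run(handle_turn("What is WebRTC?")))
```

The design point is that perceived latency is dominated by time-to-first-audio, so overlapping the filler with LLM inference makes the agent feel much faster even though total work is unchanged.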
- Clone the repo:

```sh
git clone https://github.com/agonza1/webrtc-agent-livekit.git
cd webrtc-agent-livekit
```
- Install dependencies: Docker and Docker Compose.
- If you also want to run the example frontend, copy `.env.example` to `.env.local` and fill in the necessary environment variables. You can also update the YML files to configure the different services. For example, agents-playground.yml:

```sh
LIVEKIT_API_KEY=<your API key>        # change it in livekit.yaml
LIVEKIT_API_SECRET=<your API secret>  # change it in livekit.yaml
NEXT_PUBLIC_LIVEKIT_URL=ws://localhost:7880  # wss://<Your Cloud URL>
```
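On the agent side, the same variables can be read from the environment. A minimal sketch; the variable names match the snippet above, and the fallback URL assumes the local docker-compose default:

```python
import os

def livekit_config() -> dict:
    """Collect LiveKit connection settings from the environment.

    Falls back to the local docker-compose default URL when unset.
    """
    return {
        "url": os.environ.get("NEXT_PUBLIC_LIVEKIT_URL", "ws://localhost:7880"),
        "api_key": os.environ.get("LIVEKIT_API_KEY", ""),
        "api_secret": os.environ.get("LIVEKIT_API_SECRET", ""),
    }

if __name__ == "__main__":
    print(livekit_config()["url"])
```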
- Run docker-compose:

```sh
docker compose up --build
```

Make sure that at least the services "agent-playground", "agent-worker", "livekit", and "redis" in the docker-compose file are uncommented and that their environment variables are updated.
- Open http://localhost:3000 in your browser to see the result.
- Connect to a room.
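Joining a room requires an access token: a JWT signed with the LiveKit API secret (HS256) that names the room and participant. A self-contained sketch of that token structure, using only the standard library; in practice the LiveKit server SDKs generate these for you, and the key, room, and identity values here are placeholders:

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def livekit_token(api_key: str, api_secret: str, room: str, identity: str) -> str:
    """Build a LiveKit-style access token: header.claims.signature (HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,                  # API key identifies the issuer
        "sub": identity,                 # participant identity
        "exp": int(time.time()) + 3600,  # 1 hour validity
        "video": {"room": room, "roomJoin": True},  # room grant
    }
    signing_input = f"{b64url(json.dumps(header).encode())}.{b64url(json.dumps(claims).encode())}"
    sig = hmac.new(api_secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"

if __name__ == "__main__":
    print(livekit_token("devkey", "secret", "my-room", "alice"))
```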
Monitoring is built using Prometheus and Grafana. The end-to-end flow is: agent worker → writes metrics to a shared temp folder → agent_metrics exposes them → Prometheus scrapes them → Grafana displays them.
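The first hop of that flow can be sketched as the worker dropping metrics in the Prometheus text exposition format into the shared folder, where the exporter picks them up and serves them on /metrics. The file name and metric names below are illustrative, not the project's actual ones:

```python
import tempfile
from pathlib import Path

def write_metrics(shared_dir: Path, ttfb_seconds: float, turns: int) -> Path:
    """Write agent metrics in Prometheus text exposition format.

    An exporter (here, the agent_metrics service) would read this file
    and serve its contents on its /metrics endpoint for Prometheus.
    """
    lines = [
        "# HELP agent_ttfb_seconds Time to first byte of agent speech.",
        "# TYPE agent_ttfb_seconds gauge",
        f"agent_ttfb_seconds {ttfb_seconds}",
        "# HELP agent_turns_total Conversation turns handled.",
        "# TYPE agent_turns_total counter",
        f"agent_turns_total {turns}",
    ]
    out = shared_dir / "agent.prom"
    out.write_text("\n".join(lines) + "\n")
    return out

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(write_metrics(Path(d), 0.42, 7).read_text())
```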
Agent Worker live metrics are exposed on port 9100 and can be accessed at:
http://localhost:9100/metrics
Grafana is available at http://localhost:3001 with the default user/password admin/admin. A default dashboard is set up to visualize basic real-time voice agent information.
This project is built on top of amazing open-source tools and services:
- LiveKit and LiveKit Agents - WebRTC Framework for building voice AI agents
- Ollama - Local LLM inference engine
- Llama - Open-source large language models by Meta
- Kokoro TTS - Open-source text-to-speech model
- Prometheus and Grafana - Metrics collection, monitoring and visualization