Ultra-low latency voice AI conversations in your browser, about 50% faster than ElevenLabs.
RunAnywhere is building the future of local AI inference, making powerful AI models run efficiently on any device. This voice pipeline demonstrates our commitment to high-performance, privacy-preserving AI that runs directly in your browser.
Check out our SDKs and tools for more ways to run AI locally.
A complete end-to-end voice AI pipeline with ultra-fast response times:
- Moonshine STT → OpenAI LLM → Kokoro TTS
- Fully local speech recognition and synthesis
- ~50% faster than the ElevenLabs cloud pipeline
- Side-by-side comparison mode for benchmarking
- Comprehensive performance metrics at every stage
- Ultra-low latency conversational AI
- WebGPU/WebAssembly acceleration
- Voice Activity Detection with echo prevention
- 16 voice options (Kokoro) or native browser TTS
- Progressive Web App (works offline)
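
The STT → LLM → TTS chain and its per-stage metrics can be sketched as follows. This is a minimal illustration, not the project's actual implementation: `Stage`, `timed`, and `runVoiceTurn` are hypothetical names, and the real Moonshine, OpenAI, and Kokoro calls are assumed to be passed in as async functions rather than shown here.

```typescript
// Hypothetical stage shape: any async function from input to output.
type Stage<I, O> = (input: I) => Promise<O>;

interface StageTiming {
  stage: string;
  ms: number;
}

interface TurnResult {
  transcript: string;
  reply: string;
  audio: Float32Array;
  timings: StageTiming[];
  totalMs: number;
}

// Run one stage and record its wall-clock duration in milliseconds.
async function timed<I, O>(
  name: string,
  stage: Stage<I, O>,
  input: I,
  timings: StageTiming[],
): Promise<O> {
  const start = performance.now();
  const output = await stage(input);
  timings.push({ stage: name, ms: performance.now() - start });
  return output;
}

// One conversational turn: audio in, transcript, LLM reply, synthesized audio out.
async function runVoiceTurn(
  audioIn: Float32Array,
  stt: Stage<Float32Array, string>,
  llm: Stage<string, string>,
  tts: Stage<string, Float32Array>,
): Promise<TurnResult> {
  const timings: StageTiming[] = [];
  const transcript = await timed("stt", stt, audioIn, timings);
  const reply = await timed("llm", llm, transcript, timings);
  const audio = await timed("tts", tts, reply, timings);
  // Sum of the stage durations; overhead between stages is not counted.
  const totalMs = timings.reduce((sum, t) => sum + t.ms, 0);
  return { transcript, reply, audio, timings, totalMs };
}
```

Because the stages are injected, the same function could drive both sides of a side-by-side comparison: call it once with the local stages and once with cloud-backed ones, then compare the two `timings` arrays.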
Built on top of amazing open source projects:
- Original Whisper Web by Xenova
- Enhanced fork by Pierre Mesure
- Transformers.js for WebAssembly ML
- Moonshine STT models
- Kokoro TTS by hexgrad
For technical details, see Voice Pipeline Architecture.
- Clone the repo and install dependencies:

      git clone https://github.com/PierreMesure/whisper-web.git
      cd whisper-web
      npm install

- Run the development server:

      npm run dev

- Open the link (e.g., http://localhost:5173/) in your browser.