This package aims to provide an accurate, user-friendly voice activity detector that runs in the browser. Currently, it runs Silero VAD [1] in the browser using ONNX Runtime Web.
A demo is hosted at vad-demo-script.vercel.app. The source code for the demo can be found here. A separate demo showing how to use the VAD with a bundler like webpack can be found here.
The API works as follows:
- Create the VAD object with a line such as `const myvad = await vad.MicVAD.new(options)`. `options` can include any of the parameters defined here. It essentially consists of callbacks that run on every audio frame, whenever a speech start is detected, whenever speech ends, etc., as well as parameters that control the voice activity detection algorithm.
- Start and pause the VAD object as needed with `myvad.start()` and `myvad.pause()`. The object starts in the paused state.
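
The two steps above can be combined into a small toggle, which is a common way to wire the VAD to a button. This is a minimal sketch, not part of the package: `createVadToggle` and `exampleOptions` are hypothetical names, and the callback names `onSpeechStart` and `onSpeechEnd` are assumptions that should be checked against the parameter list referenced above. The `vad` object is passed in as an argument so the helper does not depend on how the package was loaded.

```javascript
// Hypothetical helper (not part of the package): creates a MicVAD instance
// and returns a function that alternates between start() and pause().
// `vadLib` is the `vad` object used in `vad.MicVAD.new(options)`.
async function createVadToggle(vadLib, options) {
  const myvad = await vadLib.MicVAD.new(options)
  let listening = false // the VAD object starts in the paused state

  return function toggle() {
    if (listening) {
      myvad.pause()
    } else {
      myvad.start()
    }
    listening = !listening
    return listening // true if the VAD is now running
  }
}

// Example options. The callback names are assumptions; consult the
// package's parameter list for the exact names and arguments.
const exampleOptions = {
  onSpeechStart: () => console.log("speech started"),
  onSpeechEnd: (audio) => console.log("speech ended:", audio.length, "samples"),
}

// Browser usage (assumes `vad` is available and a button element exists):
//   const toggle = await createVadToggle(vad, exampleOptions)
//   button.onclick = () => toggle()
```

Tracking the running state in the helper avoids relying on any state property of the VAD object itself, since only `start()` and `pause()` are described above.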
[1] Silero Team. (2021). Silero VAD: pre-trained enterprise-grade Voice Activity Detector (VAD), Number Detector and Language Classifier. GitHub, GitHub repository, https://github.com/snakers4/silero-vad, hello@silero.ai.