A DBSCAN-Based Automation for Speech Onset Detection
Praditor is a speech onset detector that automatically finds the boundaries between silence and sound.
Now Praditor supports VAD (Voice Activity Detection)! Check out the latest version.
Praditor works for both single-onset and multi-onset audio files without any language limitation. It generates output in .TextGrid format with different tier types based on the selected mode.
- Onset/Offset Detection (Default Mode)
- Voice Activity Detection (VAD Mode)
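To give a rough idea of what such an output file looks like, here is a minimal sketch that writes one point tier in Praat's long text format. It is an illustration only, not Praditor's own writer: the function name `point_tier_textgrid`, the tier name `onset`, and the empty point labels are assumptions, and the tiers Praditor actually writes in each mode may differ.

```python
def point_tier_textgrid(times, duration, tier_name="onset"):
    """Return a Praat long-format TextGrid string holding one point tier.

    times     -- time stamps in seconds (hypothetical detection results)
    duration  -- total length of the audio in seconds
    tier_name -- assumed tier name; Praditor's real tier names may differ
    """
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {duration}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "TextTier"',  # Praat's class name for a point tier
        f'        name = "{tier_name}"',
        "        xmin = 0",
        f"        xmax = {duration}",
        f"        points: size = {len(times)}",
    ]
    # Points are written in chronological order, each with an empty label.
    for i, t in enumerate(sorted(times), start=1):
        lines += [
            f"        points [{i}]:",
            f"            number = {t}",
            '            mark = ""',
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(point_tier_textgrid([0.5, 1.2], duration=2.0))
```

A file written this way can be opened in Praat alongside the audio to inspect the annotated time points.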
Praditor also lets users adjust parameters in the Dashboard to improve performance.
You can try test_audio.wav and test_audio_mp3.mp3 on Praditor. test_audio_mp3.mp3 is from an online resource, while test_audio.wav and test_large_audio.wav are from our own experiments.
Basic understanding is enough. Understanding the algorithm is better.
- Basic knowledge: Go to the first section of Quick Fix.
- Advanced knowledge: Go to the second section of Quick Fix (i.e., Detailed Introduction).
- Expert knowledge: Go to Parameter.
Although I have prepared various buttons in this GUI, you do not have to use them all.
The simplest procedure is to (1) import audio files, (2) hit the ▶︎ button, and
(3) [optional] if you are not happy with the results, fine-tune the parameters and hit the ▶︎ button again.
Repeat steps (2) and (3) until you are happy with the results.
Praditor -> Import your target audio file (Recommend: >= 44.1 kHz; Accept: >= 8 kHz)
VAD Toggle between Onset/Offset Detection mode and Voice Activity Detection mode.
▶︎ Run algorithm and extract onsets/offsets (in default mode) or speech segments (in VAD mode). Wait for a while until the results come out. Onsets are in blue, offsets are in green.
Test Test how many onsets/offsets may be found using the presented parameters. This function does not affect .TextGrid.
If the number meets the expectation, hit ▶︎ to get the final annotation.
F5 to play the audio signal that is currently presented in the window, and Any Key to stop playing.
←/→ Move to the previous/next audio file
🗑️ Temporarily clear the annotations from the screen. This does not delete or change the .TextGrid file, so it is safe.
Show Bring the cleared annotations back. Praditor goes back to the .TextGrid file and presents whatever is in it.
Onset/Offset to hide/show annotations on the screen (also does not change the .TextGrid).
Note: Onsets and offsets are controlled by two DIFFERENT sets of parameters, which means there is no strict guarantee of a 1-to-1 correspondence. Offsets are found by running the onset detection on the reversed audio.
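The reversed-audio idea in the note above can be sketched in a few lines. The detector here is a plain amplitude threshold used as a stand-in, not Praditor's DBSCAN-based algorithm, and the names `first_onset`, `last_offset`, and `threshold` are hypothetical.

```python
def first_onset(samples, threshold=0.1):
    """Index of the first sample whose absolute value exceeds threshold, or None.

    A deliberately simple stand-in detector (assumption), not Praditor's algorithm.
    """
    for i, s in enumerate(samples):
        if abs(s) > threshold:
            return i
    return None


def last_offset(samples, threshold=0.1):
    """Find the offset by running the onset detector on the reversed signal."""
    idx_reversed = first_onset(samples[::-1], threshold)
    if idx_reversed is None:
        return None
    # Map the index in the reversed signal back onto the original timeline.
    return len(samples) - 1 - idx_reversed


signal = [0.0, 0.0, 0.5, 0.8, 0.3, 0.0, 0.0]
onset = first_onset(signal, threshold=0.1)    # onset parameters
offset = last_offset(signal, threshold=0.4)   # a separate set of offset parameters
print(onset, offset)  # prints "2 3"
```

Because the two calls use independent thresholds, nothing forces every onset to be matched by exactly one offset, which is the point of the note.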
Display and save parameters with three priority modes:
- File: Parameters saved with the same name as the audio file (highest priority)
- Folder: Parameters saved in the current folder as params.txt or params_vad.txt
- Default: Parameters saved in the application directory (lowest priority)
Save Save the displayed parameters according to the selected mode:
- Follows priority order: File > Folder > Default
- Saves with a .txt extension (normal mode) or a _vad.txt suffix (VAD mode)
Reset Reset the displayed parameters to the saved values from the current mode:
- Loads parameters following the same priority order
- Applies the loaded parameters to the UI
Backward/Forward Navigate through parameter history:
- Maintains a history of up to 10 parameter sets for each mode
- Backward: Go to the previous parameter set
- Forward: Go to the next parameter set
- Shows current position as "current/total" in the UI
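A bounded history with Backward/Forward navigation could look roughly like the sketch below. The class name `ParamHistory`, the parameter key `threshold`, and the choice to discard "forward" entries when a new set is pushed are assumptions for illustration; Praditor's internal behavior may differ.

```python
class ParamHistory:
    """Keep at most max_size parameter sets and a cursor into them."""

    def __init__(self, max_size=10):
        self.max_size = max_size
        self.items = []
        self.pos = -1  # index of the current set; -1 means the history is empty

    def push(self, params):
        # Assumption: pushing after going backward drops the "forward" entries.
        self.items = self.items[: self.pos + 1]
        self.items.append(dict(params))
        # Keep only the newest max_size sets.
        self.items = self.items[-self.max_size:]
        self.pos = len(self.items) - 1

    def backward(self):
        """Move to the previous set (if any) and return the current set."""
        if self.pos > 0:
            self.pos -= 1
        return self.items[self.pos] if self.items else None

    def forward(self):
        """Move to the next set (if any) and return the current set."""
        if self.pos < len(self.items) - 1:
            self.pos += 1
        return self.items[self.pos] if self.items else None

    def position(self):
        """Current position formatted as "current/total"."""
        return f"{self.pos + 1}/{len(self.items)}"
```

Keeping one such object per mode would give the normal and VAD modes independent histories.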
When VAD (Voice Activity Detection) mode is enabled:
- Parameters are stored separately with a _vad.txt suffix
- Maintains independent parameter history for VAD mode
- All save/reset/history functions work independently for normal and VAD modes
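Putting the priority order and the VAD suffix together, the lookup of a parameter file might be sketched as follows. The function names and the file name used in the application directory (`params.txt` / `params_vad.txt`) are assumptions; only the File > Folder > Default order and the two suffixes come from the description above.

```python
from pathlib import Path


def candidate_param_files(audio_path, app_dir, vad=False):
    """List candidate parameter files from highest to lowest priority."""
    audio = Path(audio_path)
    suffix = "_vad.txt" if vad else ".txt"
    return [
        audio.with_name(audio.stem + suffix),  # File: same name as the audio file
        audio.with_name("params" + suffix),    # Folder: params.txt / params_vad.txt
        Path(app_dir) / ("params" + suffix),   # Default: file name is an assumption
    ]


def resolve_param_file(audio_path, app_dir, vad=False):
    """Return the first candidate that exists on disk, or None."""
    for candidate in candidate_param_files(audio_path, app_dir, vad):
        if candidate.is_file():
            return candidate
    return None
```

For an audio file `data/a.wav` in normal mode, this checks `data/a.txt`, then `data/params.txt`, then the file in the application directory; in VAD mode it checks the `_vad.txt` counterparts instead.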
Wheel ↑/Wheel ↓ to zoom in/out on the amplitude axis
Ctrl/Command+Wheel ↑/Wheel ↓ to zoom in/out on the timeline (Ctrl/Command+I/O also works)
Shift+Wheel ↓/Wheel ↑ to move forward/backward along the timeline
↑✌↑/↓✌↓ to zoom in/out on the amplitude axis
←✌→/→✌← to zoom in/out on the timeline
Timeline zoom might not work on macOS. Use Command + I/O instead.
←←✌/✌→→ to move forward/backward along the timeline
If you would like to download the datasets that were used in developing Praditor, please refer to our OSF storage.
If you use Praditor in your research, please cite the following paper:
Liu, Z., Yu, X., Hu, W.C. et al. Praditor: A DBSCAN-based automation for speech onset detection. Behav Res 57, 247 (2025). https://doi.org/10.3758/s13428-025-02776-2
Alternatively, go to the "About this article" section of our paper's page to download the .ris file and export whatever citation format you need.
Shout out to these remarkable contributors!!
- Thanks to YU Xinqi, Dr. MA Yunxiao, and ZHANG Sifan for their work in validating the effectiveness of Praditor's algorithm.
- Thanks to HU Wing Chung for her work in packaging Praditor for macOS (arm64 and universal2).
- Thanks to Prof. ZHANG Haoyun (University of Macau) and Prof. WANG Ruiming (South China Normal University) for their guidance and support for this project.
Also, the funding:
- This project was funded by the National Natural Science Foundation of China (32200845), the Science and Technology Development Fund, Macao S.A.R (FDCT, 0153/2022/A), and the Multi-Year Research Grant (MYRG2022-00148-ICI) from the University of Macau to Haoyun Zhang.
Praditor is written and maintained by Tony, Liu Zhengyuan from the Centre for Cognitive and Brain Sciences, University of Macau.
If you have any questions about how to use Praditor or about its algorithm, or you want me to help you write additional
scripts (for example, to export audio files or Excel tables),
feel free to contact me at zhengyuan.liu@connect.um.edu.mo or paradeluxe3726@gmail.com.
