Paradeluxe/Praditor

Praditor

A DBSCAN-Based Automation for Speech Onset Detection

Download Praditor | English · 中文 | Our Paper


Features

Praditor is a speech onset detector that automatically finds the boundaries between silence and sound.

Now Praditor supports VAD (Voice Activity Detection)! Check out the latest version.

Praditor works with both single-onset and multi-onset audio files in any language. It generates output in .TextGrid format, with different tier types depending on the selected mode.

  • Onset/Offset Detection (Default Mode)
  • Voice Activity Detection (VAD Mode)

Praditor also lets users adjust parameters in the Dashboard to improve detection results.

You can try Praditor with test_audio.wav and test_audio_mp3.mp3. test_audio_mp3.mp3 comes from an online resource, while test_audio.wav and test_large_audio.wav come from our own experiments.

Video instruction

📚 Fine-tuning guidance

A basic understanding is enough. Understanding the algorithm is better.

🙌 Play with GUI

Although I have prepared various buttons in this GUI, you do not have to use them all.

The simplest procedure is: (1) import audio files; (2) hit the ▶︎ button; (3) [optional] if you are not happy with the results, fine-tune the parameters and hit the ▶︎ button again. Repeat steps (2) and (3) until you are happy with the results.

General

Praditor -> Import your target audio file (recommended: >= 44.1 kHz; accepted: >= 8 kHz)
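The sample-rate guidance above can be checked before importing a file. The following is a minimal sketch, not part of Praditor: it relies on Python's standard `wave` module, so it only reads .wav files, and the function names and category labels are hypothetical.

```python
import wave


def classify_sample_rate(rate_hz: int) -> str:
    """Map a sample rate to the guidance above (labels are hypothetical)."""
    if rate_hz >= 44100:
        return "recommended"  # >= 44.1 kHz
    if rate_hz >= 8000:
        return "accepted"     # >= 8 kHz
    return "too low"


def wav_sample_rate(path: str) -> int:
    """Read the sample rate (in Hz) from a .wav file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()
```

For example, `classify_sample_rate(wav_sample_rate("test_audio.wav"))` would report which category the bundled test file falls into.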

VAD Toggle between Onset/Offset Detection mode and Voice Activity Detection mode.

▶︎ Run the algorithm and extract onsets/offsets (in default mode) or speech segments (in VAD mode). Wait until the results appear. Onsets are shown in blue and offsets in green.

Test Check how many onsets/offsets would be found with the currently displayed parameters. This function does not affect the .TextGrid file.

If the number meets the expectation, hit ▶︎ to get the final annotation.

F5 plays the audio signal currently shown in the window; press any key to stop playback.

/ Move to the next/previous audio file

.TextGrid related

🗑️ Temporarily clears the annotations from the screen. This is safe: it does not delete or change the .TextGrid file.

Show Brings the cleared annotations back. Praditor rereads the .TextGrid file and displays whatever is in it.

Onset/Offset Hides or shows the annotations on the screen (this also does not change the .TextGrid file).

Note: Onsets and offsets are controlled by two DIFFERENT sets of parameters, so a one-to-one correspondence between them is not guaranteed. Offsets are found by running onset detection on the time-reversed audio.
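The reversed-audio idea in the note can be sketched as follows. This is only an illustration under stated assumptions: `toy_onset_detector` is a hypothetical threshold-based stand-in, not Praditor's DBSCAN-based detector, and `detect_offsets` is a hypothetical helper name. The point is the time mapping: an onset at time t in the reversed signal corresponds to an offset at (duration - t) in the original.

```python
from typing import Callable, List, Sequence


def toy_onset_detector(samples: Sequence[float], sample_rate: int,
                       threshold: float = 0.5) -> List[float]:
    """Hypothetical stand-in: time of each silence-to-sound transition."""
    onsets = []
    previous_loud = False
    for i, value in enumerate(samples):
        loud = abs(value) >= threshold
        if loud and not previous_loud:
            onsets.append(i / sample_rate)
        previous_loud = loud
    return onsets


def detect_offsets(
    samples: Sequence[float],
    sample_rate: int,
    detect_onsets: Callable[[Sequence[float], int], List[float]],
) -> List[float]:
    """Find offsets by running an onset detector on the reversed signal."""
    duration = len(samples) / sample_rate
    reversed_onsets = detect_onsets(samples[::-1], sample_rate)
    # Map each time on the reversed axis back to the original time axis.
    return sorted(duration - t for t in reversed_onsets)
```

Because the detector is passed in as an argument, the onset and offset passes could each be given their own parameter set, which is why the two results need not pair up one-to-one.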

Parameters

Save Modes

Display and save parameters with three priority modes:

  • File: Parameters saved with the same name as the audio file (highest priority)
  • Folder: Parameters saved in the current folder as params.txt or params_vad.txt
  • Default: Parameters saved in the application directory (lowest priority)

Buttons

Save Save the displayed parameters according to the selected mode:

  • Follows priority order: File > Folder > Default
  • Saves with .txt extension (normal mode) or _vad.txt (VAD mode)

Reset Reset the displayed parameters to the saved values from the current mode:

  • Loads parameters following the same priority order
  • Applies the loaded parameters to the UI
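The File > Folder > Default lookup described above might be sketched like this. It is an assumption-laden illustration rather than Praditor's actual code: the function names are hypothetical, the file-level name is assumed to be the audio file's stem plus the mode suffix, and the Default file is assumed to reuse the params.txt / params_vad.txt names inside the application directory.

```python
from pathlib import Path
from typing import List, Optional


def candidate_param_paths(audio_path: Path, app_dir: Path,
                          vad: bool = False) -> List[Path]:
    """List parameter files from highest to lowest priority."""
    suffix = "_vad.txt" if vad else ".txt"
    return [
        audio_path.with_name(audio_path.stem + suffix),  # File
        audio_path.parent / ("params" + suffix),         # Folder
        app_dir / ("params" + suffix),                   # Default (name assumed)
    ]


def resolve_param_path(audio_path: Path, app_dir: Path,
                       vad: bool = False) -> Optional[Path]:
    """Return the first existing parameter file, or None if none exists."""
    for path in candidate_param_paths(audio_path, app_dir, vad):
        if path.is_file():
            return path
    return None
```

Under these assumptions, `resolve_param_path(Path("data/test_audio.wav"), Path("app"))` would prefer data/test_audio.txt, then data/params.txt, then app/params.txt.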

Backward/Forward Navigate through parameter history:

  • Maintains a history of up to 10 parameter sets for each mode
  • Backward: Go to the previous parameter set
  • Forward: Go to the next parameter set
  • Shows current position as "current/total" in the UI
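A bounded history with Backward/Forward navigation, as described above, could look roughly like the sketch below. The class name `ParamHistory` and the parameter key `threshold` are hypothetical, and two behaviours are assumptions not stated in this README: saving a new set after going backward discards the later entries, and the oldest set is dropped once the limit of 10 is exceeded.

```python
from typing import List, Optional


class ParamHistory:
    """Keep up to max_size parameter sets and a cursor into them."""

    def __init__(self, max_size: int = 10) -> None:
        self.max_size = max_size
        self._items: List[dict] = []
        self._index = -1  # -1 means the history is empty

    def push(self, params: dict) -> None:
        # Assumption: entries after the cursor are discarded on a new save.
        self._items = self._items[: self._index + 1]
        self._items.append(dict(params))
        # Assumption: the oldest entries are dropped beyond max_size.
        self._items = self._items[-self.max_size:]
        self._index = len(self._items) - 1

    def current(self) -> Optional[dict]:
        return dict(self._items[self._index]) if self._items else None

    def backward(self) -> Optional[dict]:
        if self._index > 0:
            self._index -= 1
        return self.current()

    def forward(self) -> Optional[dict]:
        if self._index < len(self._items) - 1:
            self._index += 1
        return self.current()

    def position(self) -> str:
        """Cursor position formatted as "current/total"."""
        return f"{self._index + 1}/{len(self._items)}"


# One independent history per mode.
histories = {"default": ParamHistory(), "vad": ParamHistory()}
```

Keeping one instance per mode means that navigating the history in one mode never changes the cursor in the other.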

VAD Mode

When VAD (Voice Activity Detection) mode is enabled:

  • Parameters are stored separately with _vad.txt suffix
  • Maintains independent parameter history for VAD mode
  • All save/reset/history functions work independently for normal and VAD modes

Audio signal

Mouse & Keyboard 🖱️⌨️

Wheel ↑/Wheel ↓ to zoom in/out on the amplitude axis

Ctrl/Command+Wheel ↑/Wheel ↓ to zoom in/out on the timeline (Ctrl/Command+I/O also works)

Shift+Wheel ↓/Wheel ↑ to move forward/backward along the timeline

Touchpad 💻

↑✌↑/↓✌↓ to zoom in/out on the amplitude axis

←✌→/→✌← to zoom in/out on the timeline

Timeline zoom might not work on macOS. Use Command+I/O instead.

←←✌/✌→→ to move forward/backward along the timeline

🗃️ Data and Materials

If you would like to download the datasets that were used in developing Praditor, please refer to our OSF storage.

Citation

If you use Praditor in your research, please cite the following paper:

Liu, Z., Yu, X., Hu, W.C. et al. Praditor: A DBSCAN-based automation for speech onset detection. Behav Res 57, 247 (2025). https://doi.org/10.3758/s13428-025-02776-2

Alternatively, go to the "About this article" section of our paper's page and download the .ris file to use with whatever citation format you need.

🙌 Acknowledgments

Shout out to these remarkable contributors!!

  • Thanks to YU Xinqi, Dr. MA Yunxiao, and ZHANG Sifan for their work in validating the effectiveness of Praditor's algorithm.
  • Thanks to HU Wing Chung for her work in packaging Praditor for macOS (arm64 and universal2).
  • Thanks to Prof. ZHANG Haoyun (University of Macau) and Prof. WANG Ruiming (South China Normal University) for their guidance and support for this project.

Also, the funding:

  • This project was funded by the National Natural Science Foundation of China (32200845), the Science and Technology Development Fund, Macao S.A.R (FDCT, 0153/2022/A), and the Multi-Year Research Grant (MYRG2022-00148-ICI) from the University of Macau to Haoyun Zhang.

📨 Contact us

Praditor is written and maintained by Tony (Liu Zhengyuan) from the Centre for Cognitive and Brain Sciences, University of Macau.

If you have any questions about how to use Praditor or about its algorithm, or if you would like help writing additional scripts (for example, to export audio files or Excel tables), feel free to contact me at zhengyuan.liu@connect.um.edu.mo or paradeluxe3726@gmail.com.