---
layout: default
title: Features
nav_order: 2
permalink: /FEATURES
---
SONATA offers a comprehensive suite of audio transcription and analysis features. This document provides details on each major feature.
## Speech Recognition

SONATA uses WhisperX, an enhanced version of Whisper that provides:
- State-of-the-art transcription accuracy across multiple languages
- Word-level timestamps for precise text alignment
- Support for various Whisper models (tiny, base, small, medium, large, large-v2, large-v3)
- Automatic language detection capabilities
- Model optimization for various hardware (CPU, CUDA, MPS)
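The hardware fallback mentioned above (CUDA, then Apple MPS, then CPU) can be sketched as a small selection routine. The function names here are illustrative, not SONATA's actual API; the `float16`/`int8` pairing is a common WhisperX configuration choice, assumed rather than confirmed from SONATA's source.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple MPS, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"

def pick_compute_type(device: str) -> str:
    """Use half precision on GPU backends, int8 quantization on CPU."""
    return "float16" if device in ("cuda", "mps") else "int8"
```

In practice the availability flags would come from the ML framework in use (e.g. a CUDA/MPS capability check) before the Whisper model is loaded.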
## Audio Event Detection

SONATA identifies non-speech sounds, from laughter and crying to ambient noises such as traffic or music. Our system can detect 523 distinct audio events, each with a confidence score.

🔊 See the complete list of detectable audio events
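Confidence scoring means downstream code can filter detections to a desired precision. A minimal sketch of that filtering step, assuming detections are plain dictionaries (the field names `label` and `confidence` are illustrative, not SONATA's output schema):

```python
def filter_events(events, threshold=0.5):
    """Keep detected audio events whose confidence meets the threshold,
    sorted from most to least confident."""
    kept = [e for e in events if e["confidence"] >= threshold]
    return sorted(kept, key=lambda e: e["confidence"], reverse=True)

detections = [
    {"label": "Laughter", "confidence": 0.91},
    {"label": "Traffic noise", "confidence": 0.32},
    {"label": "Music", "confidence": 0.77},
]
```

With the default threshold of 0.5, only the laughter and music detections above survive.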
## Multi-language Support

SONATA supports 10 languages:
- English (en)
- Korean (ko)
- Chinese (zh)
- Japanese (ja)
- French (fr)
- German (de)
- Spanish (es)
- Italian (it)
- Portuguese (pt)
- Russian (ru)
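When a language is specified explicitly rather than auto-detected, it should be validated against this set. A small sketch of that check (the constant and function names are illustrative, not SONATA's API):

```python
SUPPORTED_LANGUAGES = {
    "en": "English", "ko": "Korean", "zh": "Chinese", "ja": "Japanese",
    "fr": "French", "de": "German", "es": "Spanish", "it": "Italian",
    "pt": "Portuguese", "ru": "Russian",
}

def validate_language(code: str) -> str:
    """Normalize an ISO 639-1 code and check it against the supported set."""
    code = code.strip().lower()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {code!r}")
    return code
```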
## Speaker Diarization

- Identification and labeling of different speakers in multi-speaker audio
- Minimum and maximum speaker-count constraints
- Integration with PyAnnote's diarization models
- Speaker-attributed transcripts with formatting options
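Producing a speaker-attributed transcript amounts to matching word timestamps against the speaker turns that diarization emits. A minimal sketch of one common matching rule (assigning each word to the turn containing its midpoint); the dictionary fields here are illustrative, not SONATA's internal representation:

```python
def assign_speakers(words, turns):
    """Attach a speaker label to each word by finding the diarization
    turn that contains the word's midpoint (None if no turn matches)."""
    labeled = []
    for w in words:
        mid = (w["start"] + w["end"]) / 2
        speaker = None
        for t in turns:
            if t["start"] <= mid < t["end"]:
                speaker = t["speaker"]
                break
        labeled.append({**w, "speaker": speaker})
    return labeled
```

In the real pipeline, the turns would come from a PyAnnote diarization pass, which is where the minimum/maximum speaker constraints are applied.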
## Precise Timestamps

- Word-level timestamps for all transcribed content
- Precise timing for audio events
- Multiple output formats with varying levels of timestamp detail
- Support for extracting specific time ranges
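Two of the operations above, rendering a timestamp and extracting a time range, can be sketched in a few lines. Both functions are illustrative helpers, not part of SONATA's documented API:

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"

def extract_range(words, start, end):
    """Return the words whose timestamps fall entirely inside [start, end]."""
    return [w for w in words if w["start"] >= start and w["end"] <= end]
```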
## Audio Processing

- Audio format conversion for maximum compatibility
- Silence detection and trimming to improve transcription quality
- Audio segmentation for long files
- Custom segment length and overlap controls
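Segmenting a long file with a custom length and overlap reduces to computing the window boundaries. A minimal sketch of that planning step (the function name and default values are illustrative, not SONATA's actual parameters):

```python
def plan_segments(duration, segment_len=30.0, overlap=5.0):
    """Compute (start, end) windows covering `duration` seconds, with
    consecutive segments overlapping by `overlap` seconds."""
    if overlap >= segment_len:
        raise ValueError("overlap must be smaller than segment_len")
    step = segment_len - overlap
    segments, start = [], 0.0
    while start < duration:
        segments.append((start, min(start + segment_len, duration)))
        start += step
    return segments
```

The overlap lets transcription of adjacent segments be stitched back together without losing words cut at a boundary.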
## Output Formats

- Concise: simple text with integrated audio event tags
- Default: text with timestamps
- Extended: text with timestamps and confidence scores
- JSON: structured output with comprehensive metadata
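The difference between the concise and default styles can be sketched as a rendering switch over the same word list; this is an illustrative simplification, not SONATA's actual formatter:

```python
def render(words, style="default"):
    """Render word entries as concise text, or as text with
    per-word timestamps in the default style."""
    if style == "concise":
        return " ".join(w["text"] for w in words)
    if style == "default":
        return " ".join(f"[{w['start']:.2f}] {w['text']}" for w in words)
    raise ValueError(f"unknown style: {style!r}")
```

In the concise style, audio event tags (e.g. `[laughter]`) simply appear inline as part of the word stream.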
## Integration and Usage

- Python API for integration into other applications
- Command-line interface for quick usage
- Batch processing capabilities
- Progress indicators for long-running operations
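A command-line surface with batch input could look like the sketch below. The program name, flag names, and defaults are assumptions for illustration; they are not necessarily SONATA's actual CLI:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative CLI: one or more audio files (batch mode),
    an optional language override, and an output format choice."""
    p = argparse.ArgumentParser(prog="sonata")
    p.add_argument("audio", nargs="+", help="one or more audio files")
    p.add_argument("--language", default=None,
                   help="ISO 639-1 code, e.g. ko (auto-detected if omitted)")
    p.add_argument("--format", default="default",
                   choices=["concise", "default", "extended", "json"])
    return p
```

Accepting multiple positional files is what makes batch processing a single invocation rather than a shell loop.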