feat: Implement spectrogram visualization for AudioPlus #7400
This commit adds spectrogram visualization capabilities to the audio editor through a new optional 'spectrogram' property in the AudioPlus component.

Example usage:
<AudioPlus name="audio" value="$audio" height="240" hotkey="space" defaultscale="1" defaultzoom="2" zoom="true" spectrogram="true" sync="group_a" />

Key changes:
- Add new 'spectrogram' boolean property to AudioPlus component
- Extract window functions into a dedicated WindowFunctions module
- Create a new ColorMapper module for spectrogram coloring
- Refactor Visualizer class to use the new modules
- Add support for different window functions and color schemes
- Improve type safety and code organization

The spectrogram visualization allows users to:
- Toggle spectrogram view using the 'spectrogram' property
- View frequency content over time alongside waveform
- Switch between different color schemes
- Configure window functions for FFT analysis
- Adjust visualization parameters (FFT size, dB range)

Configuration:
- spectrogram: boolean (optional) - When set to true, enables spectrogram visualization alongside the waveform

Labels: audio, editor, feature, community:feature-request, community:reviewed

Closes HumanSignal#384
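As a rough illustration of what a window-function helper for FFT analysis might look like, here is a minimal sketch. The names `WindowType`, `makeWindow`, and `applyWindow` are hypothetical and are not taken from the PR's WindowFunctions module; only the Hann and Hamming formulas themselves are standard.

```typescript
// Hypothetical API: the PR's actual WindowFunctions module may look different.
type WindowType = "hann" | "hamming" | "rectangular";

// Build the window coefficients for a frame of `size` samples.
function makeWindow(type: WindowType, size: number): Float32Array {
  const w = new Float32Array(size);
  if (size === 1) {
    w[0] = 1; // a single-sample window passes the sample through unchanged
    return w;
  }
  for (let n = 0; n < size; n++) {
    const phase = (2 * Math.PI * n) / (size - 1);
    if (type === "hann") {
      w[n] = 0.5 - 0.5 * Math.cos(phase);
    } else if (type === "hamming") {
      w[n] = 0.54 - 0.46 * Math.cos(phase);
    } else {
      w[n] = 1; // rectangular: no tapering
    }
  }
  return w;
}

// Multiply a frame by the window, sample by sample, before running the FFT.
function applyWindow(frame: Float32Array, win: Float32Array): Float32Array {
  if (frame.length !== win.length) {
    throw new Error("frame and window must have the same length");
  }
  const out = new Float32Array(frame.length);
  for (let i = 0; i < frame.length; i++) {
    out[i] = frame[i] * win[i];
  }
  return out;
}
```

Tapering each frame toward zero at its edges reduces spectral leakage, which is why the window choice visibly changes how sharp or smeared the spectrogram looks.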
Add spectrogram visualization capabilities to the audio editor component with configurable settings and improved UI controls.

Key changes:
- Extract window functions into separate WindowFunctions module for better code organization
- Create new ColorMapper module for handling spectrogram color schemes
- Add spectrogram property to AudioPlus component (optional boolean to enable/disable)
- Implement FFT-based spectrogram rendering with configurable parameters
- Add UI controls for spectrogram settings (FFT size, color scheme, dB range)
- Fix CSS styling issues in the configuration modal
- Improve section header positioning and spacing

Features:
- Real-time spectrogram visualization
- Configurable FFT window size and type
- Multiple color scheme options
- Adjustable dB range for visualization
- Mel-scale frequency mapping support
- Responsive rendering with performance optimizations

Labels:
- audio
- community:feature-request
- community:reviewed
- editor
- feature

Closes HumanSignal#384
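To make the "dB range" and "mel-scale frequency mapping" items more concrete, the sketch below shows one common way to do both. It is not the PR's ColorMapper code: the function names are hypothetical, the mel formula is the widely used 2595 * log10(1 + f/700) variant (the PR may use another), and the grayscale mapping stands in for whatever color schemes the PR actually ships.

```typescript
// Convert a frequency in Hz to mels (assumed formula; other variants exist).
function hzToMel(hz: number): number {
  return 2595 * Math.log10(1 + hz / 700);
}

// Inverse of hzToMel.
function melToHz(mel: number): number {
  return 700 * (Math.pow(10, mel / 2595) - 1);
}

// Map a linear FFT magnitude into [0, 1] using an adjustable dB range.
// Values below minDb clamp to 0, values above maxDb clamp to 1.
function magnitudeToUnit(magnitude: number, minDb: number, maxDb: number): number {
  const db = 20 * Math.log10(Math.max(magnitude, 1e-12)); // floor avoids log10(0)
  const t = (db - minDb) / (maxDb - minDb);
  return Math.min(1, Math.max(0, t));
}

// Hypothetical stand-in for a color scheme: plain grayscale RGB.
function grayscale(t: number): [number, number, number] {
  const v = Math.round(t * 255);
  return [v, v, v];
}
```

For example, with a range of -80 dB to 0 dB, a magnitude of 0.01 (that is, -40 dB) lands at the midpoint of the scale, so narrowing the dB range increases contrast for quiet content.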
To help visualize the new spectrogram functionality implemented in this PR (#7400), I've recorded a short video demonstration:
Video Demonstration: Spectrogram Feature
What the video shows: The video walks through the spectrogram feature within the Label Studio interface, highlighting:
Hope this provides a helpful overview of the user experience!

Great PR! How well will it work with long audio files around 1-2 hours?

Hey @makseq, TL;DR: Yes, it handles long files (1-2 hours) efficiently. The core strategies implemented are:
This approach balances performance, memory, and visual overview. As you zoom in, the detail naturally increases as fewer samples are represented per pixel.

Separately, the chosen FFT window size affects the computation time per slice (larger FFTs = more detail but slower slice render). This characteristic is independent of total file length. For the most fluid feel, 512 is often a good balance.

To demonstrate this with varied audio content, the video uses a 1-hour file created by concatenating samples from the ESC-50 dataset (https://github.com/karolpiczak/ESC-50). This dataset contains 2000 short environmental sound recordings across 50 categories (like dogs barking, rain, helicopters, etc.), ensuring the test file has diverse spectral characteristics.

Video Demo: Spectrogram Performance & FFT Size Impact (1hr ESC-50 file)
(Video shows loading/panning the long, varied file and the visible speed difference when switching FFT sizes.)

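The two scaling claims above (samples per pixel shrink as you zoom in, and per-slice cost depends on FFT size rather than file length) can be checked with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the function names, the 44.1 kHz sample rate, the 1000 px width, and the idea that zoom simply multiplies the rendered width are all assumptions, not details from the PR. The N * log2(N) figure is the usual operation-count estimate for a radix-2 FFT, used here only as a relative measure.

```typescript
// How many audio samples each horizontal pixel has to represent.
// `zoom` is a hypothetical factor: zoom = 2 means the file is rendered twice as wide.
function samplesPerPixel(
  durationSec: number,
  sampleRate: number,
  widthPx: number,
  zoom: number = 1,
): number {
  return (durationSec * sampleRate) / (widthPx * zoom);
}

// Relative cost of one FFT slice, using the standard N * log2(N) estimate.
// Note that the file duration does not appear here at all.
function relativeFftCost(fftSize: number): number {
  return fftSize * Math.log2(fftSize);
}

// A 1-hour file at 44.1 kHz drawn into 1000 px (assumed numbers).
const overview = samplesPerPixel(3600, 44100, 1000);
const zoomedIn = samplesPerPixel(3600, 44100, 1000, 2);

// How much more work a 2048-point slice is than a 512-point slice.
const costRatio = relativeFftCost(2048) / relativeFftCost(512);
```

Under these assumptions each pixel covers 158,760 samples in the overview and half that at zoom 2, while a 2048-point slice costs a little under five times as much as a 512-point one, regardless of how long the file is.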
@cloudmark please rebase your branch on the latest changes from the repo to include commit 9b0487f. It will fix the failing checks.

Thank you @makseq for the heads-up. I believe they resolve to the same component internally, so no further updates should be needed.

Spectrogram visualization to Audio Component
Reason for change
This PR adds spectrogram visualization support to the audio editor, enabling users to visualize frequency content over time in audio recordings. This feature enhances audio annotation capabilities by providing visual frequency analysis tools, particularly useful for tasks like speech analysis, music transcription, and sound event detection.
The implementation includes:
Screenshots
Shows the labeling interface configuration with the new spectrogram="true" property in the XML configuration, demonstrating how the feature can be enabled through the labeling interface.
Demonstrates the color scheme selection interface with:
Shows interactive features:
Comprehensive control panel featuring:
Detailed configuration options:
Rollout strategy
The feature is implemented with a progressive enhancement approach:
Testing
Comprehensive testing strategy:
Risks
Reviewer notes
Key areas to review:
- Visualizer.ts: Spectrogram rendering logic
- WindowFunctions.ts: Audio processing utilities
- ColorMapper.ts: Color scheme management

General notes
The spectrogram visualization feature provides: