A description of "RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization" [NeurIPS 2024]
Updated Apr 29, 2025 - Python
Python library for handling audio datasets.
A library built for easier audio self-supervised training and downstream task evaluation.
This package simplifies downloading the AudioSet dataset.
Multi-Language Dataset Cleaner/Creator for Mozilla's DeepSpeech Framework
KATube is a tool that automates the creation of datasets for training Text-To-Speech (TTS) and Speech-To-Text (STT) models. From a list of YouTube playlists or channels, KATube generates a dataset of audio clips and their corresponding text.
A powerful and easy-to-use web scraper for collecting data from the web. Supports scraping of images, text, videos, metadata, and more. Ideal for machine learning and deep learning engineers. Download and extract data with just one line of code.