A generative speech model for daily dialogue.
Updated May 23, 2025 - Python
VoxNovel: generate audiobooks giving each character a different voice actor.
Pitch-shift audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.
Time-stretch audio clips quickly with PyTorch (CUDA supported)! Additional utilities for searching efficient transformations are included.
Speech command classification on the Speech Commands v0.02 dataset using PyTorch and torchaudio. In this example, three models have been trained using raw signal waveforms, MFCC features, and MelSpectrogram features.
High fidelity music synthesis using diffusion and UnivNet.
A utility for wrapping the Free Spoken Digit Dataset into PyTorch-ready dataset splits.
Open Translator: a speech-to-speech and speech-to-text translator with voice cloning and other cool features.
TTS (FastPitch) for German (Thorsten voice / emotional)
Speech to Text with Wav2Vec2 using torchaudio
Experiments in neural networks for audio generation.
🤖 Telegram bot powered by deep learning. Automatically assesses the safety of audio clips and voice messages for people suffering from misophonia.
Utilities for preprocessing the Switchboard and WSJ corpora in Python 3.
The Voice Cloner is a Python-based project that leverages Tacotron 2 and WaveGlow models for text-to-speech (TTS) synthesis and basic voice cloning. This project supports 22 official Indian languages, including Sanskrit, making it versatile for multilingual text input.
Mixture of experts architecture for speech-to-text and language identification, built in PyTorch
The unmix model trained to separate guitar playing from audio samples using a custom-built dataset.
CNN-based model for audio, trained on CPU using PyTorch.
Streamlit-based demo of our project on deep learning for music genre classification, part of the Numerical Analysis for Machine Learning course at Politecnico di Milano, A.Y. 2022-2023.
Music genre classification project, part of the Numerical Analysis for Machine Learning course at Politecnico di Milano, A.Y. 2022-2023.
A road sign recognition system for the Russian Federation that uses a pretrained model for real-time object detection and image segmentation to improve road safety.