a897456

Speech enhancement using Wiener filtering and pitch-synchronous STFT phase reconstruction

MATLAB 1 3 Updated Sep 12, 2020

The MOS system combines components from DNSMOS, NISQA, MOSSSL, and SIGMOS, using the librosa library to process audio waveforms.

Jupyter Notebook 12 4 Updated Feb 16, 2024

Denoising Diffusion Probabilistic Models

Python 3,662 362 Updated Aug 29, 2023

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Python 459 71 Updated Aug 1, 2024

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.

Python 756 111 Updated Mar 26, 2024

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,126 102 Updated Jul 11, 2024

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 1 Updated Jun 24, 2024
Python 149 31 Updated Dec 20, 2023

A PyTorch-based Speech Toolkit

Python 8,595 1,363 Updated Sep 19, 2024

An Open-source Streaming High-fidelity Neural Audio Codec

Python 410 21 Updated Jun 15, 2024
Python 9 3 Updated Jul 19, 2024

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 2 Updated Apr 1, 2024

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 4,471 384 Updated Sep 6, 2024

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis and music generation.

Python 346 30 Updated Jan 25, 2024

High-Resolution Image Synthesis with Latent Diffusion Models

Jupyter Notebook 11,510 1,502 Updated Feb 29, 2024

A latent text-to-image diffusion model

Jupyter Notebook 67,594 10,088 Updated Jun 18, 2024

Audio generation using diffusion models, in PyTorch.

Python 1,922 167 Updated Jun 12, 2023

A fully working PyTorch implementation of NaturalSpeech (Tan et al., 2022)

Python 466 67 Updated Feb 7, 2024

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch

Python 1,265 99 Updated Sep 24, 2023

iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

Python 221 47 Updated Mar 14, 2023

Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)

Jupyter Notebook 338 38 Updated Jul 12, 2024

On Variational Learning of Controllable Representations for Text without Supervision https://arxiv.org/abs/1905.11975

Roff 27 7 Updated Nov 18, 2020

Audio Generation model working with GPT-2 and VQVAE compressed representation of MelSpectrograms

Python 18 2 Updated Oct 8, 2023

VQVAE compression for MelSpectrograms

Python 8 2 Updated Feb 9, 2023

Automatic Speaker Recognition (ASR) system using Mel Frequency Cepstral Coefficients (MFCCs) and Vector Quantization (VQ)

MATLAB 1 Updated Dec 30, 2020

Front-end speech processing aims at extracting proper features from short-term segments of a speech utterance, known as frames. It is a prerequisite step toward any pattern recognition problem em…

Python 238 63 Updated Mar 3, 2023

Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"

Python 325 61 Updated Jul 21, 2024