Sampling-frequency-independent convolution for MOS prediction

Python 6 2 Updated Jul 22, 2025

Evaluation framework for speech anonymizers

Python 6 1 Updated Sep 9, 2025

Interpretable anonymizer based on kNN-VC

Jupyter Notebook 6 3 Updated Jul 28, 2025

SCOREQ: Speech COntrastive REgression for Quality Assessment (NeurIPS 2024)

Python 89 6 Updated Aug 1, 2025
Python 377 59 Updated Sep 3, 2024

Contrastive Language-Audio Pretraining

Python 1,816 184 Updated May 15, 2025

Metrics for evaluating music and audio generative models – with a focus on long-form, full-band, and stereo generations.

Python 243 20 Updated Jun 17, 2025

LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 614 46 Updated Apr 8, 2025

Spark-TTS Inference Code

Python 10,490 1,115 Updated Apr 9, 2025

Official PyTorch implementation of BigVGAN (ICLR 2023)

Python 1,101 140 Updated Sep 5, 2024

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 217 20 Updated Apr 20, 2024

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,787 340 Updated Jan 4, 2024
Python 10 2 Updated Apr 18, 2025

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 13,229 1,926 Updated Sep 13, 2025
Python 9 1 Updated May 14, 2025

Zero-Shot Foreign Accent Conversion without a Native Reference

Python 34 7 Updated May 1, 2024

[ASRU 2023] Code of paper SALT: Distinguishable Speaker Anonymization Through Latent Space Transformation

Python 20 2 Updated Aug 13, 2024

A Singing Style Conversion Framework Based On Audio Infilling

Python 26 3 Updated Apr 28, 2025

Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'

Python 135 6 Updated Mar 24, 2025

Multi-lingual AudioCaps

11 Updated Nov 20, 2023

Unified automatic quality assessment for speech, music, and sound.

Python 597 39 Updated Jun 5, 2025

List of speech synthesis papers.

1,057 122 Updated Jul 24, 2023

Retrieval-Augmented MOS Prediction with Prior Knowledge Integration

Python 29 3 Updated Mar 23, 2025

Python wrapper for OpenJTalk

Cython 231 79 Updated Apr 8, 2025

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 11,664 1,156 Updated Nov 14, 2024

Baseline Recipe for VoicePrivacy Challenge 2024: anonymization systems and evaluation software

Python 57 11 Updated Jan 30, 2025

Modified transcriptions of YODAS dataset

4 Updated Oct 26, 2024

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 17,750 1,870 Updated Jul 2, 2025

Versatile Evaluation of Speech and Audio

Python 321 39 Updated Sep 10, 2025

Train no-reference speech quality estimators with multiple datasets via learned, per-dataset alignments.

Python 17 Updated Aug 1, 2025