- Ton Duc Thang University
- Ho Chi Minh City, Vietnam
- in/quan-nguyen-0110631b1
- https://huggingface.co/KyS
Stars
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, try Sync Labs.
Intelligent (edge and LLM) proxy for agents. Designed with fast ⚡️ LLMs for task routing, rich observability, and the seamless integration of prompts with your APIs for agentic tasks. Built by the …
Information Retrieval from Audio via Knowledge Graph
Build a neural network from scratch without using an ML framework
Model for MDX23 music separation contest
Noise suppression using deep filtering
Just 1 minute of voice data can be enough to train a good TTS model (few-shot voice cloning)
A family of diffusion models for text-to-audio generation.
The official Python library for the OpenAI API
Implementation of Nougat Neural Optical Understanding for Academic Documents
Rembg is a tool to remove image backgrounds