quziyan

IECAS NUDT INTEL-ILC WoTIan-Capital Baidu-IDL MI Tencent

18 followers · 40 following

Beijing

Stars

Audio

32 repositories

facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 21,310 2,208 Updated Jan 15, 2025

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 36,856 4,559 Updated Aug 16, 2024

voicepaw / so-vits-svc-fork

so-vits-svc fork with realtime support, improved interface and more features.

Python 8,858 1,182 Updated Jan 13, 2025

PlayVoice / whisper-vits-svc

Core Engine of Singing Voice Conversion & Singing Voice Clone

Python 2,712 922 Updated Apr 23, 2024

yl4579 / StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5,237 452 Updated Aug 10, 2024

THUDM / GLM-4-Voice

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,562 207 Updated Dec 5, 2024

facebookresearch / muavic

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Python 374 32 Updated Sep 11, 2023

WhisperSpeech / WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Jupyter Notebook 4,078 225 Updated Dec 12, 2024

fishaudio / Bert-VITS2

vits2 backbone with multilingual-bert

Python 8,182 1,156 Updated Jan 13, 2025

daniilrobnikov / vits2

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Jupyter Notebook 528 53 Updated Sep 11, 2023

fishaudio / fish-speech

SOTA Open Source TTS

Python 18,357 1,375 Updated Jan 12, 2025

rany2 / edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Python 6,808 670 Updated Dec 26, 2024

jianchang512 / stt

Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

Python 2,821 308 Updated Dec 5, 2024

jianchang512 / ChatTTS-ui

A simple local web interface that uses ChatTTS to synthesize text into speech and also exposes an API for external use.

Python 6,529 778 Updated Dec 9, 2024

BytedanceSpeech / seed-tts-eval

Python 1,116 109 Updated Jun 14, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 33,656 3,654 Updated Jan 13, 2025

NTT123 / light-speed

A modified VITS that utilizes phoneme duration's ground truth for better robustness

Python 122 37 Updated Aug 27, 2023

p0p4k / vits2_pytorch

unofficial vits2-TTS implementation in pytorch

Python 504 95 Updated Mar 28, 2024

BasedHardware / omi

AI wearables

C 3,986 524 Updated Jan 15, 2025

ricky0123 / vad

Voice activity detector (VAD) for the browser with a simple API

TypeScript 1,029 159 Updated Jan 9, 2025

FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 9,594 930 Updated Jan 15, 2025

wq2012 / awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

1,668 229 Updated Oct 16, 2024

openai / whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Python 74,356 8,879 Updated Jan 4, 2025

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 7,725 809 Updated Jan 15, 2025

MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook 3,986 360 Updated Dec 18, 2024

suno-ai / bark

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 36,677 4,311 Updated Aug 19, 2024

abus-aikorea / voice-pro

Comprehensive Gradio WebUI for audio processing, powered by Whisper engines (Whisper, Faster-Whisper, Whisper-Timestamped). Features Voice Changer, zero-shot Voice Cloning (E2, F5-TTS), YouTube dow…

Python 2,531 190 Updated Dec 22, 2024

Huanshere / VideoLingo

Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team

Python 9,184 897 Updated Jan 5, 2025

WEIFENG2333 / AsrTools

Python 1,606 141 Updated Nov 13, 2024

jianchang512 / clone-voice

A voice cloning tool with a web interface that uses your own voice or any other voice to record audio

Python 7,867 820 Updated Dec 7, 2024
