
Accelerate CosyVoice2 inference with vLLM

Jupyter Notebook 53 4 Updated Mar 5, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,984 232 Updated Mar 6, 2025

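A small 0.2B-parameter Chinese dialogue model (ChatLM-Chinese-0.2B). Open-sources the complete code for the whole pipeline: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks, with a fine-tuning example for triple information extraction.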

Python 1,443 166 Updated Apr 20, 2024

Implement a small-parameter Chinese large language model from scratch.

Python 516 62 Updated Aug 22, 2024

🚀🚀 [LLM] Train a 26M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 14,558 1,617 Updated Feb 23, 2025

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 7,656 615 Updated Mar 6, 2025

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 11,546 1,145 Updated Mar 6, 2025

Full text of the Contemporary Chinese Dictionary (现代汉语词典), 7th edition, in TXT format

262 43 Updated Jun 22, 2024

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python 264 35 Updated Jan 15, 2025

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Python 334 21 Updated Sep 3, 2024

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 43,220 5,290 Updated Mar 6, 2025

Admin console

Go 132 12 Updated Mar 6, 2025

Official implementation of AnimateDiff.

Python 11,101 903 Updated Jul 31, 2024

Inference and training library for high-quality TTS models.

Python 5,095 536 Updated Dec 10, 2024

An efficient implementation of tree data structure in pure python.

Python 825 187 Updated Mar 2, 2025

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Jupyter Notebook 8,164 782 Updated Jun 24, 2024
Python 5 Updated Aug 24, 2024

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Python 11,915 1,059 Updated Mar 6, 2025

Uses a naive MultiheadAttention implementation to replace nn.MultiheadAttention in PyTorch

Python 33 2 Updated Feb 20, 2025

PyTorch reimplementation of LoRA (with support for nn.MultiheadAttention)

Python 57 5 Updated Dec 5, 2024

Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)

Python 41,807 4,664 Updated Mar 5, 2025

Text-to-speech using an autoregressive transformer and VITS

Python 235 17 Updated Apr 3, 2024

SOTA Open Source TTS

Python 19,756 1,525 Updated Mar 3, 2025

[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis

Python 1,148 140 Updated Feb 28, 2025

Unofficial implementation of Megatts2

Python 278 36 Updated Mar 23, 2024

The official implementation of HierSpeech++

Python 1,210 147 Updated Feb 20, 2024

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,620 116 Updated Jul 5, 2024

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Python 1,300 122 Updated Apr 24, 2024