Free Motion Capture for Everyone 💀✨

Python 3,843 304 Updated Sep 11, 2025

Added vLLM support to IndexTTS for faster inference.

Python 556 72 Updated Sep 14, 2025

Text-audio foundation model from Boson AI

Python 7,259 516 Updated Aug 4, 2025

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 600 73 Updated Sep 12, 2025

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 9,810 938 Updated Sep 15, 2025

[NeurIPS'24 Spotlight] Text2CAD: Generating Sequential CAD Designs from Beginner-to-Expert Level Text Prompts

Python 304 48 Updated May 15, 2025
Python 126 8 Updated Nov 22, 2024

All-In-One Music Structure Analyzer

Python 630 90 Updated May 9, 2024

[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling

Python 1,192 102 Updated Mar 2, 2025

Accelerating CosyVoice2 inference with vLLM

Jupyter Notebook 416 54 Updated Apr 26, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 3,591 279 Updated Sep 15, 2025

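A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B), with all code open-sourced for the full pipeline: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks, with a fine-tuning example for triple information extraction.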

Python 1,601 183 Updated Apr 20, 2024

Implementing a small-parameter Chinese large language model from scratch.

Python 817 90 Updated Aug 22, 2024

🚀🚀 [LLM] Train a 26M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 26,143 3,095 Updated Apr 30, 2025

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 8,905 788 Updated Sep 11, 2025

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 16,390 1,763 Updated Sep 12, 2025

Full text of the Contemporary Chinese Dictionary (《现代汉语词典》), 7th edition, in TXT format

285 46 Updated Jun 22, 2024

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python 404 57 Updated Jul 10, 2025

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Python 359 25 Updated Sep 3, 2024

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 58,270 7,163 Updated Sep 13, 2025

Admin console

Go 140 16 Updated Sep 8, 2025

Official implementation of AnimateDiff.

Python 11,756 1,008 Updated Jul 31, 2024

Inference and training library for high-quality TTS models.

Python 5,413 573 Updated Dec 10, 2024

An efficient implementation of a tree data structure in pure Python.

Python 844 184 Updated Jul 13, 2025

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Jupyter Notebook 8,383 797 Updated Mar 15, 2025
Python 6 1 Updated Aug 24, 2024

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Python 12,021 1,064 Updated Jul 19, 2025

Uses a naive MultiheadAttention implementation to replace nn.MultiheadAttention in PyTorch

Python 37 2 Updated Feb 20, 2025