Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'

Python 79 3 Updated Mar 9, 2025

Retrieval-Augmented Theorem Provers for Lean

Python 258 56 Updated Jan 30, 2025

Fast and memory-efficient exact attention

Python 16,183 1,534 Updated Mar 9, 2025

UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and sound

102 2 Updated Feb 28, 2025

A neural full-band audio codec for general audio sampled at 48 kHz, operating at 7.5 kbps or 4.5 kbps.

Python 101 10 Updated Mar 7, 2025

A Conversational Speech Generation Model

5,597 182 Updated Feb 26, 2025

SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on One GPU in a Day"

Python 171 8 Updated Mar 8, 2025

Spark-TTS Inference Code

Python 2,393 248 Updated Mar 5, 2025

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Python 285 19 Updated Jan 2, 2025

Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction

Python 149 10 Updated Feb 28, 2025

Muon optimizer: +>30% sample efficiency with <3% wallclock overhead

Python 482 25 Updated Mar 9, 2025

VoiceBench: Benchmarking LLM-Based Voice Assistants

Python 139 8 Updated Mar 10, 2025

The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.

Python 110 1 Updated Jan 2, 2025

OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.

Python 324 18 Updated Mar 6, 2025

Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper

Python 519 20 Updated Mar 10, 2025

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis

Python 127 13 Updated Jan 1, 2025
Python 3,879 308 Updated Mar 6, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 18,894 1,357 Updated Mar 3, 2025

High-speed downloader for multiple platforms

Python 777 157 Updated Mar 7, 2025

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …

Python 5,954 615 Updated Mar 5, 2025

CPU support for xcodec2

Python 6 Updated Feb 6, 2025

🚀🚀 "Large model": train a small 26M-parameter GPT entirely from scratch in just 2 hours! 🌏

Python 15,392 1,697 Updated Feb 23, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 40,940 6,167 Updated Mar 10, 2025

Witness the aha moment of VLM with less than $3.

Python 3,113 245 Updated Mar 1, 2025

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…

Python 716 47 Updated Mar 5, 2025

F5-TTS inference acceleration, about 4x faster!

Python 55 6 Updated Jan 6, 2025

Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications

Python 80 4 Updated Dec 20, 2024

Starter code for working with the YouTube-8M dataset.

Python 2,339 848 Updated Oct 25, 2021

LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 445 34 Updated Feb 14, 2025