Stars
Codebase for 'Scaling Rich Style-Prompted Text-to-Speech Datasets'
Retrieval-Augmented Theorem Provers for Lean
Fast and memory-efficient exact attention
UniCodec: a unified audio codec with a single codebook to support multi-domain audio data, including speech, music, and sound
A neural full-band audio codec for general audio sampled at 48 kHz, at 7.5 kbps or 4.5 kbps.
SlamKit is an open-source toolkit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on One GPU in a Day"
✨✨Freeze-Omni: A Smart and Low-Latency Speech-to-Speech Dialogue Model with a Frozen LLM
Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
Muon optimizer: >30% sample-efficiency gain with <3% wall-clock overhead
VoiceBench: Benchmarking LLM-Based Voice Assistants
The official repository of SpeechCraft dataset, a large-scale expressive bilingual speech dataset with natural language descriptions.
OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.
Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …
🚀🚀 Train a 26M-parameter GPT ("large model") completely from scratch in just 2 hours! 🌏
A high-throughput and memory-efficient inference and serving engine for LLMs
Witness the "aha moment" of a VLM for less than $3.
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
Ultra-low-bitrate Speech Codec for Speech Language Modeling Applications
Starter code for working with the YouTube-8M dataset.
LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis