Stars
A PyTorch library for flow matching algorithms, with both continuous and discrete flow matching implementations and practical examples for text and image modalities.
On-device AI across mobile, embedded, and edge devices for PyTorch
Helpful tools and examples for working with flex-attention
Expressive Anechoic Recordings of Speech (EARS)
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Inference and training library for high-quality TTS models.
NVIDIA Linux open GPU kernel modules with P2P support
A scalable and robust tree-based speculative decoding algorithm
Transform datasets at scale. Optimize datasets for fast AI model training.
A list of demo websites for automatic music generation research
Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
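As a quick illustration of the tensor rewrites einops enables, here is a minimal sketch using its `rearrange` and `reduce` API with PyTorch tensors; the shapes are purely illustrative.

```python
import torch
from einops import rearrange, reduce

# a batch of 8 RGB images, 32x32
x = torch.randn(8, 3, 32, 32)

# flatten each image into a feature vector: (8, 3*32*32)
flat = rearrange(x, "b c h w -> b (c h w)")

# global average pooling over the spatial dimensions: (8, 3)
pooled = reduce(x, "b c h w -> b c", "mean")

print(flat.shape, pooled.shape)  # torch.Size([8, 3072]) torch.Size([8, 3])
```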
Machine Learning Engineering Open Book
PyTorch code and models for the DINOv2 self-supervised learning method.
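A minimal sketch of loading a pretrained DINOv2 backbone through `torch.hub`, as described in the repository README; the dummy input and printed shape are assumptions for illustration.

```python
import torch

# load a pretrained DINOv2 ViT-S/14 backbone from torch.hub
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()

# dummy ImageNet-normalized image; height/width must be multiples of 14
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    features = model(x)

print(features.shape)  # expected: torch.Size([1, 384]) for ViT-S/14
```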
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
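A minimal sketch of running a distil-whisper checkpoint through the Hugging Face `transformers` pipeline; the model id and audio path are assumptions for illustration.

```python
import torch
from transformers import pipeline

# load a distil-whisper checkpoint (model id assumed for illustration)
asr = pipeline(
    "automatic-speech-recognition",
    model="distil-whisper/distil-large-v2",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device=0 if torch.cuda.is_available() else -1,
)

# transcribe a local audio file (path is a placeholder)
result = asr("sample.wav")
print(result["text"])
```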
Official inference library for Mistral models
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Fast and memory-efficient exact attention
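A minimal sketch of calling the fused kernel via the `flash_attn` package; the shapes are illustrative, and the fp16/CUDA requirements reflect the documented constraints of the library.

```python
import torch
from flash_attn import flash_attn_func

# q, k, v: (batch, seqlen, nheads, headdim), fp16/bf16 on a CUDA device
q = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
v = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")

# exact causal self-attention, computed without materializing the full score matrix
out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # torch.Size([2, 1024, 8, 64])
```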
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
Music generation with masked transformers!
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation model.
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
🔊 Text-Prompted Generative Audio Model
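A minimal sketch following Bark's documented usage; the prompt text and output filename are placeholders.

```python
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav

# download and cache the model weights
preload_models()

# synthesize audio from a text prompt
audio_array = generate_audio("Hello, this is a text-prompted audio model.")

# save as a WAV file at Bark's native sample rate (24 kHz)
write_wav("bark_out.wav", SAMPLE_RATE, audio_array)
```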
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Audio generation using diffusion models, in PyTorch.
Code for the paper Hybrid Spectrogram and Waveform Source Separation