Stars
Unofficial Implementation of Selective Attention Transformer
Janus-Series: Unified Multimodal Understanding and Generation Models
[ICLR2025] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
[ECCV 2024] GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.
JAX Implementation of Black Forest Labs' Flux.1 family of models
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
PB-LLM: Partially Binarized Large Language Models
Binarized Neural Network (BNN) for PyTorch
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
G. Peyré, L. Chizat, F-X. Vialard, J. Solomon, Quantum Optimal Transport for Tensor Field Processing, arXiv, 2016
Repository for NPHardEval, a quantified-dynamic benchmark of LLMs
Train transformer language models with reinforcement learning.
A machine learning compiler for GPUs, CPUs, and ML accelerators
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Elucidating the Design Space of Diffusion-Based Generative Models (EDM)
A simple, easy-to-understand library for diffusion models using Flax and Jax. Includes detailed notebooks on DDPM, DDIM, and EDM with simplified mathematical explanations. Made as part of my journe…
EVA Series: Visual Representation Fantasies from BAAI
Unofficial JAX implementations of deep learning research papers