- Australian National University
- Canberra
- in/xingjian-leng
Stars
The Gaussian Histogram Loss (HL-Gauss) proposed by Imani et al., with a few convenient wrappers for regression, in PyTorch
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Implementation of Rotary Embeddings, from the RoFormer paper, in PyTorch
A fork to add multimodal model training to open-r1
Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers"
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKT…
Quickly rewrite git repository history (filter-branch replacement)
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Janus-Series: Unified Multimodal Understanding and Generation Models
Fully open reproduction of DeepSeek-R1
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
This repo contains the code for 1D tokenizer and generator
Implementation of TiTok, proposed by ByteDance in "An Image is Worth 32 Tokens for Reconstruction and Generation"
Code for NeurIPS 2024 paper - The GAN is dead; long live the GAN! A Modern Baseline GAN - by Huang et al.
Using Low-rank adaptation to quickly fine-tune diffusion models.
Implementation of Mind Evolution, from the DeepMind paper "Evolving Deeper LLM Thinking"
A suite of image and video neural tokenizers
Unofficial implementation of Titans, SOTA memory for transformers, in PyTorch
A linear estimator on top of CLIP to predict the aesthetic quality of pictures
The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
[arXiv'25] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
High-Resolution Image Synthesis with Latent Diffusion Models