- Whu -> PKU -> University of Amsterdam -> LMU
- Munich
- http://taohu.me
- https://scholar.google.com/citations?user=EchdyZEAAAAJ&hl=en
- in/taohu620
- @vtaohu
Stars
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Unsupervised text tokenizer for Neural Network-based text generation.
[NeurIPS 2024] Boosting the performance of consistency models with PCM!
Does VLM Classification Benefit from LLM Description Semantics? (AAAI 2025)
The official PyTorch implementation of “BAD: Bidirectional Auto-regressive Diffusion for Text-to-Motion Generation”
A framework for few-shot evaluation of language models.
[arXiv:2406.07548] Image and Video Tokenization with Binary Spherical Quantization
Official code for the Diff-Instruct algorithm for one-step diffusion distillation
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
[NeurIPS 2024] Official implementation of "Unlocking the Capabilities of Masked Generative Models for Image Synthesis via Self-Guidance"
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis
Bare-bones diffusion model code
[ICLR 2024] The official implementation of the paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding.
Code for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models
DistillDIFT: Distillation of Diffusion Features for Semantic Correspondence (WACV 2025)
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Official implementation of FIFO-Diffusion: Generating Infinite Videos from Text without Training (NeurIPS 2024)
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
An open-source implementation of the GameNGen paper
Code for Pyramidal Flow Matching for Efficient Video Generative Modeling