Stars
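A repository for pretraining from scratch and then SFT-ing a small-parameter Chinese LLaMa2; a single 24G GPU is enough to run it and obtain a chat-llama2 with basic Chinese question-answering ability.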
The simplest, fastest repository for training/finetuning medium-sized GPTs.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
A Collection of AIGC Research Groups
A curated list of resources about the Hokchew / Foochow language. 閩東語福州話的資源整合列表。
Official repository of In-Context LoRA for Diffusion Transformers
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
A Genetic Algorithm-Based Solver for Jigsaw Puzzles 🌀
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
[NeurIPS D&B Track 2024] Official implementation of HumanVid
Official implementation of FouriScale (ECCV2024)
⚡️Lightning-fast async download tool for bilibili and more
Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
Pandora: Towards General World Model with Natural Language Actions and Video States
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
[ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
A collection of resources and papers on Diffusion Models