Stars
Official PyTorch implementations of "SYNTHIA: Novel Concept Design with Affordance Composition"
Nodes for image juxtaposition for Flux in ComfyUI
Official implementation for KV-Edit: Training-Free Image Editing for Precise Background Preservation
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation
Code Implementation of "PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data"
Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening
ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations
[WWW 2025] Official PyTorch Code for "CTR-Driven Advertising Image Generation with Multimodal Large Language Models"
A framework for high-quality material transfer that allows users to adjust the degree of material application.
[ICLR 2025] The code of Z-Sampling, proposed in our paper "Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection".
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis
Official implementation of the paper "Attentive Eraser: Unleashing Diffusion Model’s Object Removal Potential via Self-Attention Redirection Guidance" (AAAI 2025 Oral)
Official Repo for Paper "AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea"
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
Align Anything: Training All-modality Model with Feedback
Solve Visual Understanding with Reinforced VLMs
Accelerating Diffusion Transformers with Token-wise Feature Caching
[arXiv 2024] Edicho: Consistent Image Editing in the Wild
This custom_node for ComfyUI adds one-click "Virtual VRAM" for any GGUF UNet and CLIP loader, managing the offload of layers to DRAM or VRAM to maximize the latent space of your card. Also includes…
Video Generation Foundation Models: https://saiyan-world.github.io/goku/
[arXiv 2025] Official PyTorch implementation of "FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors"
[CVPR 2025] 🔥 Official implementation of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
Official code for VMix: Improving Text-to-Image Diffusion Model with Cross-Attention Mixing Control
[arXiv 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis