Feature: Add Z-Image-Turbo model support #8671
base: main
Add comprehensive support for Z-Image-Turbo (S3-DiT) models including:

Backend:
- New BaseModelType.ZImage in taxonomy
- Z-Image model config classes (ZImageTransformerConfig, Qwen3TextEncoderConfig)
- Model loader for Z-Image transformer and Qwen3 text encoder
- Z-Image conditioning data structures
- Step callback support for Z-Image with FLUX latent RGB factors

Invocations:
- z_image_model_loader: Load Z-Image transformer and Qwen3 encoder
- z_image_text_encoder: Encode prompts using Qwen3 with chat template
- z_image_denoise: Flow matching denoising with time-shifted sigmas
- z_image_image_to_latents: Encode images to 16-channel latents
- z_image_latents_to_image: Decode latents using FLUX VAE

Frontend:
- Z-Image graph builder for text-to-image generation
- Model picker and validation updates for z-image base type
- CFG scale now allows 0 (required for Z-Image-Turbo)
- Clip skip disabled for Z-Image (uses Qwen3, not CLIP)
- Optimal dimension settings for Z-Image (1024x1024)

Technical details:
- Uses Qwen3 text encoder (not CLIP/T5)
- 16 latent channels with FLUX-compatible VAE
- Flow matching scheduler with dynamic time shift
- 8 inference steps recommended for Turbo variant
- bfloat16 inference dtype
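For reviewers unfamiliar with the "time-shifted sigmas" mentioned for z_image_denoise, here is a minimal sketch of one common way to build such a schedule. It assumes the exponential time-shift formula used by the FLUX reference sampler. The function names, the `mu` parameter, and its value are illustrative assumptions and are not taken from this PR's code.

```python
import math
from typing import List


def time_shift(mu: float, t: float, sigma: float = 1.0) -> float:
    """Shift a timestep t in [0, 1] toward the noisy end (assumed FLUX-style formula)."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return math.exp(mu) / (math.exp(mu) + (1.0 / t - 1.0) ** sigma)


def shifted_sigmas(num_steps: int, mu: float) -> List[float]:
    """Return num_steps + 1 sigmas running from 1.0 (pure noise) down to 0.0."""
    linear = [1.0 - i / num_steps for i in range(num_steps + 1)]
    return [time_shift(mu, t) for t in linear]


def euler_step(x: float, velocity: float, sigma: float, sigma_next: float) -> float:
    """One Euler update of the flow-matching ODE: x moves along the predicted velocity."""
    return x + (sigma_next - sigma) * velocity


# 8 steps matches the step count recommended for the Turbo variant;
# mu = 1.0 is an arbitrary illustrative value.
sigmas = shifted_sigmas(num_steps=8, mu=1.0)
```

With `mu = 0` the schedule stays linear. Larger `mu` keeps more of the steps at high noise levels, which is the usual motivation for shifting at larger resolutions.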
Add comprehensive LoRA support for Z-Image models including:

Backend:
- New Z-Image LoRA config classes (LoRA_LyCORIS_ZImage_Config, LoRA_Diffusers_ZImage_Config)
- Z-Image LoRA conversion utilities with key mapping for transformer and Qwen3 encoder
- LoRA prefix constants (Z_IMAGE_LORA_TRANSFORMER_PREFIX, Z_IMAGE_LORA_QWEN3_PREFIX)
- LoRA detection logic to distinguish Z-Image from Flux models
- Layer patcher improvements for proper dtype conversion and parameter
Very impressive. The model is working with acceptable performance even on my 12 GB RAM card. I notice the following message in the error log:

Would it be possible to add support for the quantized models, e.g. T5B/Z-Image-Turbo-FP8 or jayn7/Z-Image-Turbo-GGUF?
I'll take a look at it and report back.
I tried two Hugging Face LoRAs that claim to be based on Z-Image, but they were detected as Flux LyCORIS models: reverentelusarca/elusarca-anime-style-lora-z-image-turbo
…ntification

Move the Flux layer structure check before the metadata check to prevent misidentifying Z-Image LoRAs (which use `diffusion_model.layers.X`) as the Flux AI Toolkit format. Flux models use the `double_blocks` and `single_blocks` patterns, which are now checked first regardless of metadata presence.
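The ordering described in this commit can be sketched roughly as follows. This is an illustrative stand-in, not the PR's actual detection code: `guess_lora_base`, the marker constants, and the example key names in the usage note are hypothetical. Only the `double_blocks`, `single_blocks`, and `diffusion_model.layers.X` patterns come from the commit message.

```python
from typing import Iterable, Optional

# Key fragments named in the commit message.
FLUX_MARKERS = ("double_blocks", "single_blocks")
Z_IMAGE_MARKER = "diffusion_model.layers."


def guess_lora_base(keys: Iterable[str], metadata: Optional[dict] = None) -> str:
    """Classify a LoRA state dict by layer structure (hypothetical helper).

    `metadata` is accepted but deliberately not consulted here: the point of
    the fix is that the structural checks run first, whether or not metadata
    is present.
    """
    key_list = list(keys)
    # 1. Flux layer structure is checked first.
    if any(marker in key for key in key_list for marker in FLUX_MARKERS):
        return "flux"
    # 2. Only then look for the Z-Image layer pattern.
    if any(Z_IMAGE_MARKER in key for key in key_list):
        return "z-image"
    return "unknown"
```

For example, a state dict containing a key such as `diffusion_model.layers.0.attention.to_q.lora_A.weight` (a made-up key name) would be classified as `"z-image"` even if its metadata mentions AI Toolkit.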
…ibility

Add comprehensive support for GGUF quantized Z-Image models and improve component flexibility:

Backend:
- New Main_GGUF_ZImage_Config for GGUF quantized Z-Image transformers
- Z-Image key detection (_has_z_image_keys) to identify S3-DiT models
- GGUF quantization detection and sidecar LoRA patching for quantized models
- Qwen3Encoder_Qwen3Encoder_Config for standalone Qwen3 encoder models

Model Loader:
- Split Z-Image model
…kuchensack/InvokeAI into feat/z-image-turbo-support
When running upscaling, diffusers 0.36.0.dev0 dies because the

Summary
Related Issues / Discussions
QA Instructions
Merge Plan
Standard merge, no special considerations needed.
Checklist
What's New copy (if doing a release after this PR)