Propagate Qwen-Image attention mask to image generation. by maromri · Pull Request #11966 · Comfy-Org/ComfyUI

maromri · 2026-01-19T10:10:52Z

Why?

For optimization purposes, it is sometimes recommended to run model inference with inputs of a fixed size. This can be supported by padding the text tokens to a fixed length, with the padding information propagated to the diffusion model using an attention mask. This mechanism is implemented in ComfyUI for many models, but not for Qwen-Image.
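As a rough illustration of the padding idea (not ComfyUI's tokenizer code), the sketch below pads a list of token ids to a minimum length and builds the matching binary attention mask. The function name `pad_to_min_length`, the pad id `0`, and the choice of right-padding are all assumptions made for this example.

```python
def pad_to_min_length(token_ids, min_length, pad_id=0):
    """Pad token_ids up to min_length and return (padded_ids, attention_mask).

    The mask uses the binary convention: 1 = real token, 0 = padding.
    pad_id=0 is a placeholder for illustration, not the real Qwen pad token id.
    """
    n_pad = max(0, min_length - len(token_ids))
    padded = list(token_ids) + [pad_id] * n_pad
    mask = [1] * len(token_ids) + [0] * n_pad
    return padded, mask


# Three real tokens padded to a fixed length of six.
ids, mask = pad_to_min_length([101, 7, 42], min_length=6)
print(ids)
print(mask)
```

Without the mask, a downstream model cannot tell the three real tokens from the three padding tokens, which is the gap this PR closes for Qwen-Image.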

What?

Pass the attention mask from the text encoder to the model's forward function. This is similar to the implementation of many other models.
Convert the mask from binary 1/0 format to 0/-∞ format, to be used as an additive mask where attention is calculated as softmax(scores + mask). This is similar to the implementation in hunyuan_video, LTX and Cosmos.
Construct a joint attention mask for both text and image tokens: the text portion is copied from the attention mask passed from the text encoder, while the image portion is always attended to (mask = 0). This is similar to the implementation in hunyuan_video.
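The mask handling described above can be sketched in plain Python. This is a minimal illustration, not the code from this PR: lists stand in for tensors, the helper names (`to_additive_mask`, `build_joint_mask`) are hypothetical, and text tokens are assumed to precede image tokens in the joint sequence.

```python
import math


def to_additive_mask(binary_mask):
    """Map 1 -> 0.0 (attend) and 0 -> -inf (ignore)."""
    return [0.0 if m else -math.inf for m in binary_mask]


def build_joint_mask(text_binary_mask, num_image_tokens):
    """Additive mask over all keys: text part from the encoder mask,
    image part all zeros (image tokens are always attended to)."""
    return to_additive_mask(text_binary_mask) + [0.0] * num_image_tokens


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Two real text tokens, two padding tokens, then two image tokens.
text_mask = [1, 1, 0, 0]
joint = build_joint_mask(text_mask, num_image_tokens=2)

# Made-up attention scores for one query; the padding keys score highest
# on purpose, to show that the additive mask still removes them.
scores = [0.5, 1.0, 3.0, 3.0, 0.2, 0.1]
weights = softmax([s + m for s, m in zip(scores, joint)])
print(joint)
print(weights)
```

Because exp(-∞) is 0, the padded positions receive exactly zero attention weight regardless of their raw scores, while the remaining weights still sum to 1.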

Running example

I used the template workflow for Qwen-Image-2512 (its lower part, with the 4-step Lightning LoRA) and changed only the positive prompt and the seed:

prompt: "A wintery, cloudy, Christmassy, slightly snowy day in England"
seed: 89 (fixed)

To pad the text tokens, I modified the initialization of Qwen25_7BVLITokenizer in text_encoders/qwen_image, changing the min_length argument from 1 to 256.

The following grid shows the results with min_length=1 and min_length=256, without and with the proposed fix. With the existing implementation, which does not pass the attention mask, the model attends to the padding tokens and the image content shifts dramatically. With the fix, the change is much subtler.

[Image grid: columns "without fix" and "with fix"; rows `min_length = 1` and `min_length = 256`]

comfy-pr-bot · 2026-01-22T03:36:37Z

Test Evidence Check

qwen_image: propagate attention mask.

f56eb56

maromri requested review from Kosinkadink, comfyanonymous and guill as code owners January 19, 2026 10:10

comfyanonymous merged commit d7f3241 into Comfy-Org:master Jan 23, 2026
12 checks passed

ahelme mentioned this pull request Jan 31, 2026

Changelogs for Migration: ComfyUI v0.8.1 through to ComfyUI v0.11.1 ahelme/comfy-multi#26

Open

comfyanonymous merged 1 commit into Comfy-Org:master from maromri:feature/qwen-image/propagate-attention-mask
