Preliminary support for sageattention3 #9047

Closed
Panchovix wants to merge 12 commits into Comfy-Org:master from Panchovix:patch-1
Conversation


@Panchovix commented on Jul 25, 2025

Adds preliminary support for SageAttention 3 (https://github.com/thu-ml/SageAttention and https://huggingface.co/jt-zhang/SageAttention3) in the attention code.

Help with the implementation is welcome: when testing txt2img and img2img with SDXL on an RTX 5090 on Linux, I'm getting ~10% slower speeds vs. SageAttention 2, so I'm probably missing something here. EDIT: Thanks to Kijai, I have updated per_block_mean to False, which should make a difference.
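For clarity, the change amounts to a keyword toggle on the SageAttention 3 call. A minimal sketch of collecting those kwargs (only the `per_block_mean` keyword is confirmed by this thread; the `sageattn3` entry-point name in the comment is an assumption, not a confirmed API):

```python
def build_sage3_kwargs(per_block_mean: bool = False) -> dict:
    # per_block_mean=False is the setting Kijai suggested; True was
    # measurably slower in the image-model tests reported above.
    return {"per_block_mean": per_block_mean}

# These kwargs would then be splatted into the (assumed) kernel call:
#   out = sageattn3(q, k, v, **build_sage3_kwargs())
```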

If someone can test with video it would be great.

Also, maybe it would be a good idea to have separate flags in comfy.model_management and comfy.cli_args for the two different versions?

Not sure what the implications of this are, but it seems to make sage3 actually run.
@comfyanonymous
Member

I don't feel like filling out approval forms so I'll wait until it's public before testing this.

As for the line-ending check: if you sync with master, it should be fixed.

@pamparamm
Contributor

@Panchovix I agree with your last point: it's probably better to separate sageattention and sageattention3 into different flags, as they have different signatures and expect different tensor shapes.
Also, I'm getting worse performance with sageattention3 than with sageattention2 on SDXL on Windows, and switching per_block_mean does not help (I'm using the MSVC-compatible version from https://huggingface.co/jt-zhang/SageAttention3/discussions/5).

@Panchovix
Author

Okay, I have separated the SageAttention versions into different flags:

  • --use-sage-attention for sage 1.x/2.x
  • --use-sage-attention3 for sage 3.x

They now work independently (so you can have both installed in the same venv and use the one you want).
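A minimal sketch of how the two independent flags could drive backend selection (the parser layout and the `pick_attention_backend` helper are hypothetical; ComfyUI's actual comfy.cli_args and comfy.model_management wiring differ):

```python
import argparse

# Hypothetical sketch mirroring the two flags listed above; the real
# comfy.cli_args parser defines many more options.
parser = argparse.ArgumentParser()
parser.add_argument("--use-sage-attention", action="store_true",
                    help="Use SageAttention 1.x/2.x kernels.")
parser.add_argument("--use-sage-attention3", action="store_true",
                    help="Use SageAttention 3.x kernels.")

def pick_attention_backend(args: argparse.Namespace) -> str:
    # The flags are independent, so both packages can live in one venv;
    # if both are passed, prefer SageAttention 3.
    if args.use_sage_attention3:
        return "sageattention3"
    if args.use_sage_attention:
        return "sageattention"
    return "default"
```

For example, `pick_attention_backend(parser.parse_args(["--use-sage-attention3"]))` returns `"sageattention3"`, and with neither flag it falls through to `"default"`.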

I also still get lower performance on SDXL for some reason, but I'm not sure whether my implementation is incorrect and causing that regression.

Kijai seems to get better performance on video, as he mentioned at https://huggingface.co/jt-zhang/SageAttention3/discussions/3#6883b71543d2651d281c9cc0, but that appears to be a custom implementation from kijai/ComfyUI-WanVideoWrapper@a35eb7d.

Any ideas are welcome.

@comfy-pr-bot
Member

Test Evidence Check

@Panchovix Panchovix closed this Jan 22, 2026