Preliminary support for sageattention3 #9047
Panchovix wants to merge 12 commits into Comfy-Org:master
Conversation
Not sure what the implications of this are, but it seems to make sage3 actually run.
I don't feel like filling out approval forms, so I'll wait until it's public before testing this. For the line-ending check: if you sync master, it should be fixed.
@Panchovix I agree with your last point: it's probably better to separate sageattention and sageattention3 into different flags, as they have different signatures and expect different tensor shapes as well.
Okay, I have separated the sage attention versions with different flags.
Now they work separately (so you can have both installed in the venv and use whichever you want). I still get lower performance on SDXL for some reason, but I'm not sure whether my implementation is incorrect and causing that regression. Kijai seems to get better performance on video, as he mentioned on https://huggingface.co/jt-zhang/SageAttention3/discussions/3#6883b71543d2651d281c9cc0, but that appears to be a custom implementation from kijai/ComfyUI-WanVideoWrapper@a35eb7d. Any ideas are welcome.
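Since the two libraries reportedly expect different tensor shapes, a thin adapter can normalise the layout before dispatching to whichever backend is selected. This is only a hedged sketch: the layout names ("HND" = (batch, heads, seq, dim), "NHD" = (batch, seq, heads, dim)) and the per-backend expectations below are illustrative assumptions, not the confirmed SageAttention/SageAttention3 contracts.

```python
# Hypothetical layout adapter for dispatching between two attention backends
# that expect differently-ordered axes. Shapes are plain tuples here so the
# sketch stays framework-agnostic; in practice this would be a tensor permute.

def to_layout(shape, src, dst):
    """Permute a (batch, ?, ?, dim) shape tuple between HND and NHD layouts."""
    if src == dst:
        return shape
    b, x, y, d = shape
    # Swapping the two middle axes converts HND <-> NHD in either direction.
    return (b, y, x, d)

# Assumed (not confirmed) per-backend layout expectations, for illustration only.
BACKEND_LAYOUTS = {"sageattention": "HND", "sageattention3": "NHD"}

def normalize_for_backend(q_shape, backend):
    """Given an HND-layout query shape, return the shape the backend expects."""
    return to_layout(q_shape, "HND", BACKEND_LAYOUTS[backend])
```

With a query shaped (1, 8, 1024, 64) in HND layout, the adapter would leave it unchanged for one backend and swap the heads/sequence axes for the other.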
Add preliminary support for sage attention 3 (https://github.com/thu-ml/SageAttention and https://huggingface.co/jt-zhang/SageAttention3) for attention.
Help with the implementation is welcome: when testing txt2img and img2img with SDXL on an RTX 5090 on Linux, I'm getting ~10% slower speeds than with sage attention 2, so I'm probably missing something here. EDIT: Thanks to Kijai, I have updated per_block_mean to False, and it should make a difference.
If someone can test with video it would be great.
Also, maybe it would be a good idea to have different flags in comfy.model_management and comfy.cli_args for the two versions?
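The separate-flags idea could be sketched with argparse, roughly in the style of comfy.cli_args. This is a minimal illustration, not ComfyUI's actual code: the flag names (--use-sage-attention, --use-sage-attention3) and the mutually exclusive grouping are assumptions about how the split might look.

```python
import argparse

def build_parser():
    """Hypothetical CLI sketch: one flag per sage attention version."""
    parser = argparse.ArgumentParser()
    # Mutually exclusive so a user can't request both backends at once,
    # even if both packages are installed in the venv.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--use-sage-attention", action="store_true",
                       help="Use SageAttention (v1/v2) for attention.")
    group.add_argument("--use-sage-attention3", action="store_true",
                       help="Use SageAttention3 for attention.")
    return parser

def attention_backend(args):
    """Map parsed flags to a backend name; fall back to the default path."""
    if args.use_sage_attention3:
        return "sageattention3"
    if args.use_sage_attention:
        return "sageattention"
    return "default"
```

Keeping the flags mutually exclusive makes the selection unambiguous at startup, which matters here because the two libraries have different signatures and can't share one dispatch path.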