Skip to content

Conversation

@wbruna
Copy link

@wbruna wbruna commented Oct 1, 2025

We now have the clip_cpu config parameter for that.

We now have the clip_cpu config parameter for that.
@LostRuins
Copy link
Owner

Hmm if T5 on GPU causes bugs we should still prefer CPU.

T5 is very lightweight on CPU, and runs fast imho so putting it on GPU seems like a waste of vram.

@wbruna
Copy link
Author

wbruna commented Oct 1, 2025

Hmm if T5 on GPU causes bugs we should still prefer CPU.

T5 is very lightweight on CPU, and runs fast imho so putting it on GPU seems like a waste of vram.

On very recent CPUs, perhaps. On mine (a 5-years-old 3400G), it adds ~45s of full CPU load to any Chroma generation, while on the GPU I barely notice it (even the iGPU is something like 10x faster than that). And with --sdoffloadcpu, it's easy enough to dodge the VRAM requirement, if needed.

Note this was needed because we didn't have support for the clip_cpu flag before; this PR simply restores the upstream version. And one of the reasons for that upstream warning is that T5 is often quantized, and the get_rows op doesn't play nice with k quants for most backends (crashes).

If you still prefer to play it safe, IMHO it'd be better to make --sdclipcpu on by default, so it could be turned off for backend+model combinations that don't really need it.

@LostRuins
Copy link
Owner

You make a fair point, alright I'll merge it, but make clip on cpu the default.
Which will be clearer if I simply rename the flag to --sdclipgpu since the flag is not in a release yet.

@LostRuins LostRuins merged commit ac6be8a into LostRuins:concedo_experimental Oct 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants