Conversation

@alexheretic
Contributor

@alexheretic alexheretic commented Apr 23, 2025

Enhance exception handling to treat a RuntimeError with the message "HIP error: out of memory" the same as OOM_EXCEPTION.

Resolves #7761
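
For readers skimming the thread, here is a minimal sketch of the fallback pattern described above. It is not the code from this PR: `OomException` is a hypothetical stand-in for `model_management.OOM_EXCEPTION`, and `decode_with_fallback`, `decode_full`, and `decode_tiled` are illustrative names only.

```python
# Hypothetical stand-in for model_management.OOM_EXCEPTION; the real value
# depends on the installed torch backend.
class OomException(RuntimeError):
    pass


HIP_OOM_MESSAGE = "HIP error: out of memory"


def decode_with_fallback(decode_full, decode_tiled, samples):
    """Try a full decode; retry tiled only when the failure is an OOM."""
    try:
        return decode_full(samples)
    except OomException:
        # Dedicated OOM type: fall back to the tiled path.
        return decode_tiled(samples)
    except RuntimeError as e:
        # ROCm may surface OOM as a plain RuntimeError, so check the message.
        if HIP_OOM_MESSAGE not in str(e):
            raise  # unrelated runtime errors still propagate
        return decode_tiled(samples)
```

The dedicated exception is caught first so that the message check only runs for plain `RuntimeError` instances.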

@alexheretic
Contributor Author

This also means I can use Load Clip/device = default instead of "cpu", since the OOM is now handled and tiled VAE works.

Perhaps WanImageToVideo should have a tiled option somehow too.

@alexheretic
Contributor Author

I've updated to PyTorch 2.7.1+rocm6.3 and this still seems valuable for my gfx1100. It makes sense that all variants of OOM should be handled.

Is there some issue with this approach, or other reason not to merge this?

@alexheretic alexheretic changed the title from "VAE handle HIP OOM exceptions" to "Handle HIP OOM exceptions" Jun 21, 2025
@alexheretic
Contributor Author

This is still free of merge conflicts. Do you want it, or shall I close it? cc @comfyanonymous @Kosinkadink

@alexheretic
Contributor Author

Could I get a comment on this, perhaps a short description of why it isn't suitable? Perhaps I could rework or adapt it in some way that still allows OOM fallbacks to work on ROCm. @comfyanonymous @Kosinkadink

@comfyanonymous
Owner

Why don't you set the model_management.OOM_EXCEPTION to whatever the hip oom exception is?

@alexheretic
Contributor Author

> Why don't you set the model_management.OOM_EXCEPTION to whatever the hip oom exception is?

The error this handles is a generic RuntimeError; only its message identifies it as an OOM. Setting OOM_EXCEPTION to RuntimeError would match every RuntimeError, including non-OOM ones.

RuntimeError: HIP error: out of memory
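
To illustrate the distinction, a rough sketch of a predicate that accepts either a dedicated OOM type or a plain RuntimeError carrying the message above. `DedicatedOom` and `is_oom_error` are hypothetical names for this example, not identifiers from the codebase.

```python
class DedicatedOom(Exception):
    """Hypothetical stand-in for a backend's dedicated OOM exception type."""


def is_oom_error(exc, oom_type=DedicatedOom):
    """Return True for the dedicated OOM type or a HIP OOM RuntimeError."""
    if isinstance(exc, oom_type):
        return True
    # A plain RuntimeError only counts when its message names the HIP OOM.
    return isinstance(exc, RuntimeError) and "HIP error: out of memory" in str(exc)
```

With this shape, an unrelated error such as `RuntimeError("shape mismatch")` is not treated as an OOM, which is the case that matching on the exception type alone cannot rule out.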


Development

Successfully merging this pull request may close these issues.

WanImageToVideo: HIP error: out of memory

2 participants