
[Bug]: Out of VRAM if training layer filter is used for full finetuning #1241

@dxqb

Description


What happened?

requires_grad is set up before the parameters are created:

self.__setup_requires_grad(model, config)

This means requires_grad stays True even for frozen (filtered) parameters during the first training step. They aren't trained, but gradients are still created for them, so as much VRAM is used as would be needed without the "fused backpass".

https://github.com/Nerogar/OneTrainer/blob/3250a32898acc8608a21ac91f7f68f8a1ef542d2/modules/modelSetup/BaseModelSetup.py#L233C13-L234C74
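A minimal sketch of the ordering problem, using hypothetical names rather than OneTrainer's actual API: if the layer filter runs before all parameters exist, parameters created afterwards keep the PyTorch default of requires_grad=True, so gradients (and VRAM) are spent on them until the filter is applied again.

```python
# Hypothetical sketch of the ordering bug; names are illustrative, not OneTrainer's.
class Param:
    def __init__(self, name):
        self.name = name
        self.requires_grad = True  # PyTorch default for new leaf parameters

def setup_requires_grad(params, layer_filter):
    # Freeze every parameter whose name matches no filter entry.
    for p in params:
        p.requires_grad = any(f in p.name for f in layer_filter)

# Buggy order: the filter runs first, a parameter is created afterwards.
params = [Param("unet.down.0.weight")]
setup_requires_grad(params, layer_filter=["down"])
params.append(Param("unet.up.0.weight"))   # created later, never filtered
buggy = [p.name for p in params if p.requires_grad]

# Fixed order: apply the filter once all parameters exist.
params = [Param("unet.down.0.weight"), Param("unet.up.0.weight")]
setup_requires_grad(params, layer_filter=["down"])
fixed = [p.name for p in params if p.requires_grad]

print(buggy)  # the late "up" parameter is still marked trainable
print(fixed)  # only the filtered "down" parameter remains trainable
```

In the buggy order the unfiltered parameter still requires gradients, which is why the first training step allocates gradient memory for layers the user asked to freeze.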

What did you expect would happen?

Relevant log output

Generate and upload debug_report.log

No response

Metadata


Labels

bug (Something isn't working)
