Labels: bug
Description
What happened?
`requires_grad` is set up before the parameters are created:

`self.__setup_requires_grad(model, config)`

This means `requires_grad` is still `True` even for frozen (filtered) parameters during the first training step. They aren't trained, but gradients are created for them anyway, so as much VRAM is used as would be needed without the fused backward pass.
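A minimal PyTorch sketch of the expected behavior (not the project's actual code): when a parameter's `requires_grad` is set to `False` *before* the backward pass, autograd allocates no gradient tensor for it, which is where the VRAM saving comes from. The `Linear` layer and the choice of freezing the bias are illustrative stand-ins for a filtered parameter:

```python
import torch

# Stand-in model; the bias plays the role of a frozen (filtered) parameter.
model = torch.nn.Linear(4, 4)

# Freeze it BEFORE the first backward pass.
model.bias.requires_grad_(False)

loss = model(torch.randn(2, 4)).sum()
loss.backward()

# Trainable parameter gets a gradient tensor; the frozen one does not,
# so no gradient memory is allocated for it.
assert model.weight.grad is not None
assert model.bias.grad is None
```

If the freeze happens only after gradients have already been produced (as in the ordering described above), the first step still pays the full gradient-memory cost.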
What did you expect would happen?
Relevant log output
No response