
RuntimeError when combining FSDP and disable_adapter #1442

@wanghao14

Description

System Info

peft: 0.7.1; torch: 2.3.0.dev20240128+cu121; accelerate: 0.26.1; transformers: 4.37.2; Python: 3.10.12
Using the PyTorch 23.12 container provided by NVIDIA.
The hardware environment is four A100-40G GPUs.

Who can help?

@pacman100 @younesbelkada

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

Hi, I want to use both FSDP and peft in my project. I insert LoRA into the pretrained LLM with peft.get_peft_model and then wrap the whole model with torch.distributed.fsdp.FullyShardedDataParallel, so the only trainable part of the model is the LoRA adapter. Additionally, I need to call the original model via with my_model.disable_adapter():. A rough sketch of this setup is shown below; when running the whole code, I encounter the following error (only the relevant parts are shown):
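
The sketch below is only illustrative: the model name, LoRA settings, and module names are placeholders, not the actual project code.

import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        llm = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
        lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
        # Insert LoRA via peft; only the adapter weights stay trainable.
        self.base_llm = get_peft_model(llm, lora_config)

    def forward(self, input_ids):
        # The original (adapter-free) model is needed here, so the adapter
        # is temporarily disabled through the context manager.
        with self.base_llm.disable_adapter():
            return self.base_llm(input_ids=input_ids)

# The whole model is wrapped with FSDP (torch.distributed is assumed to be initialized).
my_model = FSDP(MyModel().cuda())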

File "/usr/local/lib/python3.10/dist-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 853, in forward
    output = self._fsdp_wrapped_module(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
File "/data4/Projects/CoCap/mm_video/experiments/context_compression/modeling/in_context_autoencoder.py", line 373, in forward
    with self.base_llm.disable_adapter():
File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
File "/usr/local/lib/python3.10/dist-packages/peft/peft_model.py", line 567, in disable_adapter
    self.base_model.disable_adapter_layers()
File "/usr/local/lib/python3.10/dist-packages/peft/tuners/lora/model.py", line 403, in disable_adapter_layers
    self.set_adapter_layers(enabled=False)
File "/usr/local/lib/python3.10/dist-packages/peft/tuners/lora/model.py", line 381, in set_adapter_layers
    module.enable_adapters(enabled)
File "/usr/local/lib/python3.10/dist-packages/peft/tuners/tuners_utils.py", line 403, in enable_adapters
    layer.requires_grad_(False)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2435, in requires_grad_
    p.requires_grad_(requires_grad)
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
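
For reference, the final frame is PyTorch's general restriction that requires_grad can only be changed on leaf tensors; presumably the LoRA parameters are no longer leaf tensors once FSDP has flattened them. A standalone snippet (independent of FSDP and peft) that raises the same error:

import torch

a = torch.ones(3, requires_grad=True)  # leaf tensor: toggling requires_grad is allowed
b = a * 2                              # non-leaf tensor: produced by an operation on a

a.requires_grad_(False)  # fine
b.requires_grad_(False)  # RuntimeError: you can only change requires_grad flags of leaf variables.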

Expected behavior

Using with my_model.disable_adapter(): to call the original model should work, even though the model is wrapped by FSDP.
