RuntimeError when combining FSDP and disable_adapter

### System Info

peft: 0.7.1; torch: 2.3.0.dev20240128+cu121; accelerate: 0.26.1; transformers: 4.37.2; Python: 3.10.12
Using the [Pytorch container 23.12](https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-12.html#rel-23-12) provided by Nvidia. 
<img width="764" alt="image" src="https://github.com/huggingface/peft/assets/31157833/87a8167d-e046-4afb-a04a-80d7b63f2478">
The hardware environment contains four A100-40G graphics cards.


### Who can help?

@pacman100 @younesbelkada 

### Information

- [X] The official example scripts
- [ ] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder
- [X] My own task or dataset (give details below)

### Reproduction

Hi, I want to use both FSDP and peft in my project, and I insert Lora to the pretrained LLM by `peft.get_peft_model` and then wrap the whole model using `torch.distributed.fsdp.FullyShardedDataParallel`. The only trainable part of the model is the Lora adapter. Additionally, I need to call the original model by `with my_model.disable_adapter():`. When running the whole code, I encounter following error(intercepted relevant parts):

> File "/usr/local/lib/python3.10/dist-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 853, in forward
>     output = self._fsdp_wrapped_module(*args, **kwargs)
>   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
>     return self._call_impl(*args, **kwargs)
>   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
>     return forward_call(*args, **kwargs)
>   File "/data4/Projects/CoCap/mm_video/experiments/context_compression/modeling/in_context_autoencoder.py", line 373, in forward
>     with self.base_llm.disable_adapter():
>   File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
>     return next(self.gen)
>   File "/usr/local/lib/python3.10/dist-packages/peft/peft_model.py", line 567, in disable_adapter
>     self.base_model.disable_adapter_layers()
>   File "/usr/local/lib/python3.10/dist-packages/peft/tuners/lora/model.py", line 403, in disable_adapter_layers
>     self._set_adapter_layers(enabled=False)
>   File "/usr/local/lib/python3.10/dist-packages/peft/tuners/lora/model.py", line 381, in _set_adapter_layers
>     module.enable_adapters(enabled)
>   File "/usr/local/lib/python3.10/dist-packages/peft/tuners/tuners_utils.py", line 403, in enable_adapters
>     layer.requires_grad_(False)
>   File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 2435, in requires_grad_
>     p.requires_grad_(requires_grad)
> RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().

### Expected behavior

Using `with my_model.disable_adapter():` to call the original model, even though it is wrapped by FSDP.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

RuntimeError when combining FSDP and disable_adapter #1442

System Info

Who can help?

Information

Tasks

Reproduction

Expected behavior

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

RuntimeError when combining FSDP and disable_adapter #1442

Description

System Info

Who can help?

Information

Tasks

Reproduction

Expected behavior

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions