Feature request
Currently, when training with FSDP, the Trainer expects to receive an fsdp_config argument specifying fsdp_transformer_layer_cls_to_wrap.
transformers/src/transformers/trainer.py
Lines 1394 to 1406 in 66954ea
elif self.args.fsdp_config.get("fsdp_transformer_layer_cls_to_wrap", None) is not None:
    transformer_cls_to_wrap = set()
    for layer_class in self.args.fsdp_config["fsdp_transformer_layer_cls_to_wrap"]:
        transformer_cls = get_module_class_from_name(model, layer_class)
        if transformer_cls is None:
            raise Exception("Could not find the transformer layer class to wrap in the model.")
        else:
            transformer_cls_to_wrap.add(transformer_cls)
    auto_wrap_policy = functools.partial(
        transformer_auto_wrap_policy,
        # Transformer layer class to wrap
        transformer_layer_cls=transformer_cls_to_wrap,
    )
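From the user's side, this currently means spelling out a model-specific class name in the training configuration. A rough illustration (the class name and the other argument values here are assumed for the example, not taken from this issue):

from transformers import TrainingArguments

# Illustrative configuration: the user has to name the model-specific
# decoder-layer class by hand (assumed here to be OPTDecoderLayer).
training_args = TrainingArguments(
    output_dir="out",
    fsdp="full_shard auto_wrap",
    fsdp_config={"fsdp_transformer_layer_cls_to_wrap": ["OPTDecoderLayer"]},
)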
I am wondering if we can set this automatically when the model has a _no_split_modules attribute, e.g.

_no_split_modules = ["OPTDecoderLayer"]
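A minimal sketch of the proposed fallback, reusing the helpers the snippet above already relies on (get_module_class_from_name and torch's transformer_auto_wrap_policy; the import paths and the build_auto_wrap_policy wrapper are my assumptions for illustration, not the actual Trainer code):

import functools

from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy
from transformers.trainer_pt_utils import get_module_class_from_name


def build_auto_wrap_policy(model, fsdp_config):
    # Hypothetical helper sketching the proposed behaviour.
    layer_classes = fsdp_config.get("fsdp_transformer_layer_cls_to_wrap", None)
    if layer_classes is None:
        # Proposed fallback: reuse the model's own _no_split_modules list,
        # which already names the transformer block classes.
        layer_classes = getattr(model, "_no_split_modules", None)
    if not layer_classes:
        return None

    transformer_cls_to_wrap = set()
    for layer_class in layer_classes:
        transformer_cls = get_module_class_from_name(model, layer_class)
        if transformer_cls is None:
            raise Exception("Could not find the transformer layer class to wrap in the model.")
        transformer_cls_to_wrap.add(transformer_cls)

    return functools.partial(
        transformer_auto_wrap_policy,
        transformer_layer_cls=transformer_cls_to_wrap,
    )

With such a fallback, fsdp_config would only need fsdp_transformer_layer_cls_to_wrap for models that do not define _no_split_modules, or to override the default.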
Motivation
It would be a convenient feature to set this automatically: the argument is model-specific, so automating it would let training arguments be defined independently of a specific model type.
Your contribution
Happy to help make a PR. It would be great if you could confirm whether this is desirable or whether I am misunderstanding something. Thanks!