PPOv2Trainer throws AttributeError: 'NoneType' object has no attribute 'modules' because value_model's default is None
#1976
Labels
- ✨ enhancement: New feature or request
- 🧒 good second issue: Good for contributors with basic project familiarity
- 🙋 help from community wanted: Open invitation for community members to contribute
- 🏋 PPO: Related to PPO
System Info
- `transformers` version: 4.44.0
- distributed_type: FSDP
- mixed_precision: bf16
- use_cpu: False
- debug: True
- num_processes: 2
- machine_rank: 0
- num_machines: 1
- rdzv_backend: static
- same_network: True
- main_training_function: main
- enable_cpu_affinity: False
- fsdp_config: {'fsdp_activation_checkpointing': True, 'fsdp_auto_wrap_policy': 'TRANSFORMER_BASED_WRAP', 'fsdp_backward_prefetch': 'BACKWARD_PRE', 'fsdp_cpu_ram_efficient_loading': True, 'fsdp_forward_prefetch': True, 'fsdp_offload_params': True, 'fsdp_sharding_strategy': 'FULL_SHARD', 'fsdp_state_dict_type': 'SHARDED_STATE_DICT', 'fsdp_sync_module_states': True, 'fsdp_use_orig_params': True}
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
- dynamo_config: {'dynamo_backend': 'EAGER'}
Information
Tasks
examples folder
Reproduction
First, note that in `PPOv2Trainer`, `value_model` is `Optional` with a default value of `None`: https://github.com/huggingface/trl/blob/main/trl/trainer/ppov2_trainer.py#L79

Second, note that `PPOv2Trainer` attempts to disable dropout for all 4 models: https://github.com/huggingface/trl/blob/main/trl/trainer/ppov2_trainer.py#L144-L145

If no `value_model` is provided (the default), the error `AttributeError: 'NoneType' object has no attribute 'modules'` is thrown because `value_model` is `None`.

Expected behavior
I'm not sure what solutions the designers have in mind. Some possible solutions:

- Remove `None` as the default for `value_model` and make it non-optional
- [...] `value_model`
- If `value_model` is `None`, then clone the reward model.

I'm open to other solutions.
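To make the clone-the-reward-model idea concrete, here is a minimal sketch that runs without TRL or PyTorch. `Dropout`, `TinyModel`, `disable_dropout`, and `prepare_models` are hypothetical stand-ins written for this illustration, not TRL APIs; the only assumption carried over from the report is that dropout is disabled by iterating over `model.modules()`.

```python
import copy
from typing import Iterator, Optional


class Dropout:
    """Stand-in for a dropout layer: it only carries a drop probability."""

    def __init__(self, p: float = 0.1) -> None:
        self.p = p


class TinyModel:
    """Stand-in for a model that exposes a .modules() iterator."""

    def __init__(self) -> None:
        self.dropout = Dropout()

    def modules(self) -> Iterator[object]:
        yield self
        yield self.dropout


def disable_dropout(model: TinyModel) -> None:
    """Set p = 0 on every Dropout found via model.modules()."""
    for module in model.modules():
        if isinstance(module, Dropout):
            module.p = 0.0


def prepare_models(
    policy: TinyModel,
    ref_policy: TinyModel,
    reward_model: TinyModel,
    value_model: Optional[TinyModel] = None,
) -> TinyModel:
    """Hypothetical guard: clone the reward model when no value model is given."""
    if value_model is None:
        value_model = copy.deepcopy(reward_model)
    for model in (policy, ref_policy, reward_model, value_model):
        disable_dropout(model)
    return value_model


# Failure mode from the report: the dropout loop is handed None.
try:
    disable_dropout(None)  # type: ignore[arg-type]
except AttributeError as err:
    failure_message = str(err)

# With the guard, omitting value_model no longer raises.
reward_model = TinyModel()
value_model = prepare_models(TinyModel(), TinyModel(), reward_model)
```

Whether a cloned reward model is a sensible initial value model is a design question for the maintainers; the sketch only shows that resolving `None` before the dropout loop removes the crash.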