
PPOv2Trainer throws AttributeError: 'NoneType' object has no attribute 'modules' because value_model's default is None #1976

Open
RylanSchaeffer opened this issue Aug 26, 2024 · 5 comments
Labels
✨ enhancement New feature or request 🧒 good second issue Good for contributors with basic project familiarity 🙋 help from community wanted Open invitation for community members to contribute 🏋 PPO Related to PPO

Comments

@RylanSchaeffer
Contributor

RylanSchaeffer commented Aug 26, 2024

System Info

  • transformers version: 4.44.0
  • Platform: Linux-5.4.0-162-generic-x86_64-with-glibc2.31
  • Python version: 3.11.9
  • Huggingface_hub version: 0.23.4
  • Safetensors version: 0.4.3
  • Accelerate version: 0.32.1
  • Accelerate config: - compute_environment: LOCAL_MACHINE
    - distributed_type: FSDP
    - mixed_precision: bf16
    - use_cpu: False
    - debug: True
    - num_processes: 2
    - machine_rank: 0
    - num_machines: 1
    - rdzv_backend: static
    - same_network: True
    - main_training_function: main
    - enable_cpu_affinity: False
    - fsdp_config: {'fsdp_activation_checkpointing': True, 'fsdp_auto_wrap_policy': 'TRANSFORMER_BASED_WRAP', 'fsdp_backward_prefetch': 'BACKWARD_PRE', 'fsdp_cpu_ram_efficient_loading': True, 'fsdp_forward_prefetch': True, 'fsdp_offload_params': True, 'fsdp_sharding_strategy': 'FULL_SHARD', 'fsdp_state_dict_type': 'SHARDED_STATE_DICT', 'fsdp_sync_module_states': True, 'fsdp_use_orig_params': True}
    - downcast_bf16: no
    - tpu_use_cluster: False
    - tpu_use_sudo: False
    - tpu_env: []
    - dynamo_config: {'dynamo_backend': 'EAGER'}
  • PyTorch version (GPU?): 2.4.0+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: No
  • Using GPU in script?: Yes
  • GPU type: NVIDIA A100-SXM4-80GB

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

First, note that in PPOv2Trainer, value_model is Optional with a default value of None: https://github.com/huggingface/trl/blob/main/trl/trainer/ppov2_trainer.py#L79

Second, note that PPOv2Trainer attempts to disable dropout in all four models: https://github.com/huggingface/trl/blob/main/trl/trainer/ppov2_trainer.py#L144-L145

If no value_model is provided (the default), the dropout-disabling helper is called on None and throws AttributeError: 'NoneType' object has no attribute 'modules'.
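
A minimal sketch of the failing path (assuming disable_dropout_in_model from trl.trainer.utils is the helper invoked at the linked lines; no full trainer setup is needed to trigger it):

# Hypothetical minimal repro: PPOv2Trainer.__init__ loops over
# [policy, ref_policy, value_model, reward_model] and calls this helper
# on each entry without checking for None.
from trl.trainer.utils import disable_dropout_in_model

value_model = None  # the documented default
disable_dropout_in_model(value_model)
# AttributeError: 'NoneType' object has no attribute 'modules'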

Expected behavior

I'm not sure what solutions the designers have in mind. Some possibilities:

  1. Remove None as the default for value_model and make it non-optional.
  2. Change the default for value_model.
  3. If value_model is None, then clone the reward model.

I'm open to other solutions.
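
For illustration, a guard at the crash site might look like the sketch below (the attribute names are assumed, not taken from the source; note this would only defer the failure, since PPO still needs a value model once training starts):

# Hypothetical guard (sketch only): skip None entries instead of assuming
# all four models are present. Avoids the immediate AttributeError, but
# PPO still requires a value model later in training.
for model in [self.policy, self.ref_policy, self.value_model, self.reward_model]:
    if model is not None:
        disable_dropout_in_model(model)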

@RylanSchaeffer RylanSchaeffer added the 🐛 bug Something isn't working label Aug 26, 2024
@RylanSchaeffer RylanSchaeffer changed the title PPOv2Trainer throws error with value_model: Optional[nn.Module] = None PPOv2Trainer throws AttributeError: 'NoneType' object has no attribute 'modules' because value_model's default is None Aug 26, 2024
@qgallouedec
Member

qgallouedec commented Aug 26, 2024

We avoid using downloadable files for security reasons. Please share your solution directly on GitHub as a code snippet or a pull request, so the community can review it.

@RylanSchaeffer
Contributor Author

@qgallouedec can I please bump this?

@qgallouedec
Member

For the record, there's a common misunderstanding about the use of Optional[Type] in type hints. "Optional" doesn't mean that the argument is truly optional (i.e., that it can be omitted); it means the argument can be either None or of the specified type (Type).

Examples:

def example(x: int = 0):        # x is optional, and can't be None
def example(x: Optional[int]):  # x is not optional, and can be None

On the other hand, even when a type hint allows a value, that doesn't guarantee the value is valid at runtime. For example:

def sqrt(x: float):  # the type hint allows any float, but the function is only meaningful for non-negative floats
    return x ** 0.5

In this specific case with PPOv2Trainer, value_model is technically allowed to be None based on its type hint (Optional), but the code doesn't handle None. So while the type hint suggests that None is a valid value, it isn't a supported case in practice.
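
One way to make that contract explicit (a hypothetical sketch, not the project's actual fix) is to validate eagerly in __init__ and fail with a clear message:

# Hypothetical fail-fast check (sketch only): raise a readable error instead
# of letting None propagate to a call on .modules().
if value_model is None:
    raise ValueError("PPOv2Trainer requires a `value_model`; got None.")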

> Remove None as the default for value_model and make it non-optional

This is a possible solution, but there's a practical reason for keeping None as the default: parameters without defaults must come before parameters with defaults, so a None default lets us maintain a clean and logical grouping of related arguments. For example, keeping train_dataset (which is required) next to eval_dataset (which is optional) makes the API easier to read and understand.

@qgallouedec
Member

> If value_model is None, then clone the reward model.

This is probably an interesting improvement.
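
A minimal sketch of that fallback (hypothetical; in the PPOv2 example scripts both reward_model and value_model are sequence-classification models with a scalar head, so a deep copy yields a compatible starting point):

# Hypothetical default (sketch only): fall back to a copy of the reward model
# when no value model is passed. deepcopy keeps the two models independent,
# so PPO updates to the value head don't alter the reward model.
import copy

if value_model is None:
    value_model = copy.deepcopy(reward_model)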

@qgallouedec qgallouedec reopened this Oct 7, 2024
@qgallouedec qgallouedec added ✨ enhancement New feature or request 🧒 good second issue Good for contributors with basic project familiarity 🏋 PPO Related to PPO and removed 🐛 bug Something isn't working labels Oct 7, 2024
@qgallouedec qgallouedec added the 🙋 help from community wanted Open invitation for community members to contribute label Oct 20, 2024
