
[REQUEST] Add automatic logging of parallelism and ZeRO config to WandbMonitor #7494

@WoosungMyung

Description

Currently, WandbMonitor in deepspeed/monitor/wandb.py initializes W&B and supports metric logging.
I would like to suggest adding a simple method (or automatic hook) to WandbMonitor that updates the W&B config with the core distributed-training settings (parallelism ranks and the ZeRO config) once engine initialization is complete.

Proposed approach

Add a method such as monitor.update_config(engine) (or update_config_once()) that extracts these values from the DeepSpeed engine (or another getter) and updates the W&B config once the engine is fully initialized; see the sketch below.

This should keep overhead minimal while adding significant value for users running distributed training, since the key experiment settings would be recorded alongside each run.
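A minimal sketch of what this could look like. The engine getters used here (`zero_optimization_stage()`, `train_batch_size()`, `train_micro_batch_size_per_gpu()`, `world_size`) and the monitor's `enabled` flag are assumptions about the current DeepSpeed API and may need adjusting to the actual engine interface; `wandb.config.update()` is the standard W&B call for merging values into a run's config.

```python
import wandb


class WandbMonitor:
    # ... existing __init__ that calls wandb.init() ...

    def update_config(self, engine):
        """Push core distributed-training settings into the W&B run config.

        Sketch only: the engine getters below are assumptions and may be
        named differently depending on the DeepSpeed version.
        """
        if not self.enabled:  # assumes the monitor keeps an `enabled` flag per rank
            return
        cfg = {
            "zero_stage": engine.zero_optimization_stage(),
            "train_batch_size": engine.train_batch_size(),
            "micro_batch_size_per_gpu": engine.train_micro_batch_size_per_gpu(),
            "world_size": engine.world_size,
        }
        # wandb.config.update() merges keys into the run config;
        # allow_val_change avoids errors if a key was already set at init time.
        wandb.config.update(cfg, allow_val_change=True)
```

Usage would then be a single call after initialization, e.g.:

```python
model_engine, optimizer, _, _ = deepspeed.initialize(args=args, model=model)
monitor.update_config(model_engine)  # call once, after the engine is fully built
```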


Labels

enhancement (New feature or request)
