Currently, WandbMonitor in deepspeed/monitor/wandb.py initializes W&B and supports logging.
I would like to suggest adding a simple method (or an automatic hook) to WandbMonitor that updates the W&B config with the core distributed training settings once initialization is complete (e.g., parallelism ranks/sizes and the ZeRO configuration).
Proposed approach
Add a method such as monitor.update_config(engine) (or update_config_once()) that extracts these values from the DeepSpeed engine (or via existing getter methods) and updates the W&B config once the engine is fully initialized.
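A minimal sketch of what this could look like, written here as a standalone helper for illustration (it could equally live on WandbMonitor as update_config(engine)). The engine accessor names used below (world_size, zero_optimization_stage(), train_batch_size(), etc.) are assumptions about the DeepSpeedEngine API and should be adjusted to whatever getters the engine actually exposes:

```python
import wandb


def update_wandb_config_from_engine(engine) -> None:
    """Push core distributed-training settings into the W&B run config.

    Intended to be called once, right after deepspeed.initialize() returns.
    The accessor names below are assumptions about the DeepSpeedEngine API.
    """
    if wandb.run is None:
        # W&B was never initialized (e.g., the monitor is disabled), so skip.
        return

    dist_settings = {
        "world_size": engine.world_size,                                       # assumed attribute
        "zero_stage": engine.zero_optimization_stage(),                        # assumed getter
        "train_batch_size": engine.train_batch_size(),                         # assumed getter
        "micro_batch_size_per_gpu": engine.train_micro_batch_size_per_gpu(),   # assumed getter
        "gradient_accumulation_steps": engine.gradient_accumulation_steps(),   # assumed getter
    }
    # allow_val_change avoids errors if any of these keys were already set at init time.
    wandb.config.update(dist_settings, allow_val_change=True)
```

If this were implemented as monitor.update_config(engine), a simple `_config_updated` flag would make the call idempotent, so repeated calls (or an automatic hook) remain a one-time, cheap update.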
This should keep overhead minimal (a one-time config update) while adding significant value, since the key experiment settings become visible in W&B for users running distributed training.