fix PPO #2377

hjh0119 · 2024-11-04T01:49:19Z

PR type

Bug Fix
New Feature
Document Updates
More Models or Datasets Support

PR information

related issue:
#2316
#2296
#2330
#2285

Experiment results

Paste your experiment result here (if needed).

Jintao-Huang · 2024-11-04T12:55:49Z

Is it still in WIP status, or can it be reviewed now?

hjh0119 · 2024-11-05T15:34:45Z

> Is it still in WIP status, or can it be reviewed now?

It's still under testing.

hjh0119 · 2024-11-10T03:20:46Z

swift/llm/utils/argument.py

+        from contextlib import nullcontext, contextmanager
+
+        @contextmanager
+        def ppocontext():
+            from transformers.integrations.deepspeed import HfTrainerDeepSpeedConfig
+            from transformers.utils import is_sagemaker_mp_enabled
+            if is_sagemaker_mp_enabled():
+                import smdistributed.modelparallel.torch as smp
+                smp.init()
+            old_trainer_config_process = HfTrainerDeepSpeedConfig.trainer_config_process
+
+            def trainer_config_process(self, args, auto_find_batch_size=False):
+                if args.world_size is None:
+                    if args.distributed_state is not None:
+                        return args.distributed_state.num_processes
+                    elif is_sagemaker_mp_enabled():
+                        return smp.dp_size() if not smp.state.cfg.prescaled_batch else smp.rdp_size()
+                    return 1
+                return old_trainer_config_process(self, args, auto_find_batch_size)
+
+            HfTrainerDeepSpeedConfig.trainer_config_process = trainer_config_process
+            yield
+            HfTrainerDeepSpeedConfig.trainer_config_process = old_trainer_config_process
+
+        context = nullcontext if self.rlhf_type != 'ppo' else ppocontext
+        with context():
+            super().__post_init__()


issue #2296
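
For readers following the diff above: it swaps a method on a class for the duration of `super().__post_init__()` and then restores it. Below is a minimal sketch of that temporary-patch pattern, not the code merged in this PR. `Config`, `process`, `patched_process`, and `run` are hypothetical stand-ins, and the `try`/`finally` is an addition that the diff does not have; it restores the original method even if the wrapped call raises.

```python
from contextlib import contextmanager, nullcontext


class Config:
    """Hypothetical stand-in for the class whose method gets patched."""

    def process(self, world_size):
        return world_size * 2


@contextmanager
def patched_process():
    """Temporarily replace Config.process, restoring it on exit."""
    original = Config.process

    def process(self, world_size):
        # Fallback for a missing value (assumed behavior, for illustration only).
        if world_size is None:
            return 1
        return original(self, world_size)

    Config.process = process
    try:
        yield
    finally:
        # Runs on normal exit and when the body raises.
        Config.process = original


def run(use_patch, world_size):
    # Same selection idiom as the diff: pick a context manager factory, then call it.
    context = patched_process if use_patch else nullcontext
    with context():
        return Config().process(world_size)
```

Without the `try`/`finally`, an exception raised inside the `with` body would leave the patched method installed for the rest of the process.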

hjh0119 added 2 commits October 23, 2024 23:41

fix

5c90c79

update

7968586

hjh0119 added 4 commits November 7, 2024 23:40

fix ds

c79ece9

Merge remote-tracking branch 'origin/main' into fix-1023

7ac251e

num_sample_generations

f5b1709

update

7b7c024

hjh0119 changed the title ~~[WIP] fix PPO~~ fix PPO Nov 10, 2024

hjh0119 requested review from Jintao-Huang and tastelikefeet November 10, 2024 03:19

hjh0119 commented Nov 10, 2024


rm unused import

15bebfe

Jintao-Huang approved these changes Nov 11, 2024


Jintao-Huang merged commit 1e878b9 into modelscope:main Nov 11, 2024
2 checks passed
