feature(pu): unizero and muzero multitask ddp pipeline #300

puyuan1996 · 2024-11-29T08:56:19Z

Add UniZero and MuZero multitask DDP pipeline.


PaParaZz1 · 2024-12-19T09:08:28Z

zoo/atari/envs/atari_lightzero_env.py

@@ -156,6 +158,7 @@ def step(self, action: int) -> BaseEnvTimestep:
        observation = self.observe()
        if done:
            info['eval_episode_return'] = self._eval_episode_return
+            print(f'one episode of {self.cfg.env_id} done')


use logging or remove this line
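A minimal sketch of what the logging variant could look like, assuming a module-level logger. The helper name `record_episode_end` and its parameters are hypothetical and only stand in for the relevant part of `step`; they are not part of the PR.

```python
import logging

# Module-level logger; handlers and levels are left to the application.
logger = logging.getLogger(__name__)


def record_episode_end(info: dict, done: bool, episode_return: float, env_id: str) -> dict:
    """Hypothetical helper mirroring the `if done:` branch in the diff above."""
    if done:
        info['eval_episode_return'] = episode_return
        # %-style arguments are only formatted if INFO is enabled for this logger.
        logger.info('one episode of %s done', env_id)
    return info
```

With this shape the message can be silenced or redirected through the logging configuration instead of editing the environment code.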

PaParaZz1 · 2024-12-19T09:08:52Z

zoo/atari/config/atari_unizero_multitask_segment_finetune_config.py

+
+        configs = generate_configs(env_id_list, action_space_size, collector_env_num, n_episode, evaluator_env_num, num_simulations, reanalyze_ratio, batch_size, num_unroll_steps, infer_context_length, norm_type, seed, buffer_reanalyze_freq, reanalyze_batch_size, reanalyze_partition, num_segments, total_batch_size)
+
+        pretrained_model_path = '/mnt/afs/niuyazhe/code/LightZero/data_unizero_mt_ddp-8gpu_1127/8games_brf0.02_nlayer8-nhead24_seed1/8games_brf0.02_1-encoder-LN-res2-channel256_gsl20_8-pred-head_lsd768-nlayer8-nh24_mbs-512-bs64_upc80_seed1/Pong_unizero-mt_seed1/ckpt/iteration_200000.pth.tar'


remove the real file path
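One way to keep a machine-specific checkpoint path out of the committed config is to read it from the environment. This is only a sketch: the variable name `PRETRAINED_MODEL_PATH` and the helper `resolve_pretrained_model_path` are assumptions for illustration, not names used in the PR.

```python
import os
from typing import Optional


def resolve_pretrained_model_path(default: Optional[str] = None) -> Optional[str]:
    """Return the checkpoint path from the environment, or `default` if unset.

    PRETRAINED_MODEL_PATH is a hypothetical variable name chosen for this sketch.
    """
    return os.environ.get('PRETRAINED_MODEL_PATH', default)


# In the config file, the hard-coded path would then become:
pretrained_model_path = resolve_pretrained_model_path()
```

A plain placeholder string in the config would address the review comment equally well; the environment lookup just avoids editing the file per machine.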

PaParaZz1 · 2024-12-19T09:09:09Z

zoo/atari/config/atari_unizero_multitask_segment_finetune_config.py

+    from ding.utils import DDPContext
+    from easydict import EasyDict
+
+    env_id_list = ['PongNoFrameskip-v4']  # Debug setup


remove debug settings

PaParaZz1 · 2024-12-19T09:11:02Z

lzero/worker/muzero_collector.py

        """
+        self.task_id = task_id


Rename this to self._task_id
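The rename presumably follows the usual Python convention of a leading underscore for internal attributes. A small sketch of that convention, using a hypothetical `CollectorSketch` class rather than the real `MuZeroCollector`:

```python
class CollectorSketch:
    """Hypothetical stand-in for a collector; not the class from the PR."""

    def __init__(self, task_id: int = 0) -> None:
        # Leading underscore marks the attribute as internal to the class.
        self._task_id = task_id

    @property
    def task_id(self) -> int:
        # Optional read-only accessor for callers that still need the value.
        return self._task_id
```

Whether a public accessor is wanted at all is a design choice for the authors; the sketch only shows that the rename does not have to break read access.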

PaParaZz1 · 2024-12-19T09:12:53Z

lzero/entry/train_unizero_multitask_segment_ddp.py

+    """
+    try:
+        print(f"=========评估开始 Rank {rank}/{world_size}===========")
+        # 重置 stop_event，确保每次评估前都处于未设置状态


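Please write comments and log messages in English (the print above reads "evaluation start", and the comment says "reset stop_event so that it is in the unset state before every evaluation").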
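An English rendering of the two lines might look like the sketch below. It assumes `stop_event` is a `threading.Event` and that "reset" means calling `clear()`; the diff excerpt does not show either, and the function name `begin_evaluation` is hypothetical.

```python
import logging
import threading

logger = logging.getLogger(__name__)


def begin_evaluation(stop_event: threading.Event, rank: int, world_size: int) -> None:
    """Hypothetical helper covering the first lines of the `try:` block above."""
    logger.info("========= Evaluation start, Rank %d/%d ===========", rank, world_size)
    # Reset stop_event so that it is in the unset state before every evaluation.
    stop_event.clear()
```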


puyuan1996 and others added 30 commits July 5, 2024 16:32

feature(pu): add UniZero multitask related pipeline

dd2c95c

polish(pu): polish unizero_multitask config

8769a5c

fix(pu): fix empty_keys_values in init_infer

c342ce1

feature(pu): add softmoe head option in unizero_multitask

6eb772a

fix(pu): fix unizero reset in muzero_collector

71f55b4

polish(pu): polish unizero-multitask config

445fd70

fix(pu): fix replay ratio

4954581

feature(pu): add moe option of feedforward in transformer backbone

44304bf

feature(pu): add value_priority in unizero_multitask

d6be21a

polish(pu): polish value_priority in unizero_multitask

fde51cc

sync code

b460d2f

fix(pu): fix moe in feedforward layer of transformer and polish configs

5117459

feature(pu): add mistralai moe in transformer feedforward and head of unizero

2495d60

polish(pu): polish quantize_state_hash and deepcopy

95886bd

fix(pu): fix np.array dtype bug in buffer

0e49a30

polish(pu): use 0 deepcopy in kv_cache operation in collect/eval phase of unizero

00147f4

polish(pu): use custom deepcopy for kv_cache

b40c71b

polish(pu): use value_array rather than value_list in compute_target_value

2cc81be

polish(pu): optimize compute_target_policy_non_re

bc5332f

polish(pu): optimize kv_caching update()

a6c6a8e

polish(pu): kv_cache_dict no to_cpu

b5dcdcc

polish(pu): optimize custom kv_cache copy

5b0cbd4

polish(pu): kv_cache_dict no to_cpu

0035829

feature(pu): add unizero ddp config

043727b

fix(pu): fix unizero ddp

d568008

sync code

d349137

polish(pu): use kv_cache deepcopy only in recur_infer load

3a344aa

Merge branch 'dev-efficiency' of https://github.com/opendilab/LightZero into dev-efficiency

40053f7

sync code

61a1139

polish(pu): polish suz dmc config

0e545c7

dyyoungg and others added 20 commits November 15, 2024 20:00

fix(pu): use stop_event to quit eval() when timeout in eval_async

eeaf986

polish(pu): polish unizero_mt configs

0fb4263

sync code

aaa2793

sync code

13fbe4c

feature(pu): add muzero multitask (and its ddp version) pipeline

d5842f1

polish(pu): polish configs

7723f13

polish(pu): polish config

f1e8d8c

polish(pu): polish config

62c8a96

Merge branch 'dev-mz-multitask-ddp' of https://github.com/opendilab/LightZero into dev-mz-multitask-ddp

99c08e2

fix(pu): fix embed dim in uz_multitask pipeline

1edcba3

feature(pu): add uz finetune config

4b195eb

feature(pu): add uz eval-tsne config

13183e7

fix(pu): add uz eval-tsne config

1e37cae

polish(pu): polish tsne-plot legend

29298e6

tmp: sync code

bb208b8

polish(pu): polish atari multitask related configs

ffdf4db

polish(pu): polish unizero/muzero multitask related entry

2b0af34

polish(pu): delete unused files

0bd688e

Merge remote-tracking branch 'origin/main' into dev-mz-multitask-ddp

69a1842

polish(pu): polish policy/model in multitask settings

d06ce61

puyuan1996 changed the title from "TMP: feature(pu): unizero and muzero multitask ddp pipeline" to "feature(pu): unizero and muzero multitask ddp pipeline" Dec 18, 2024

puyuan1996 added the enhancement, efficiency optimization, config, and research labels Dec 18, 2024

PaParaZz1 requested changes Dec 19, 2024


This was referenced Dec 20, 2024

feature(pu): add UniZero multitask related pipeline #241

Closed

TMP: feature(pu): unizero multitask ddp pipeline #299

Closed

puyuan1996 added 2 commits December 26, 2024 15:08

Merge remote-tracking branch 'origin/dev-mz-multitask-ddp' into dev-mz-multitask-ddp

2ebeff1

feature(pu): add eval_offline option in unizero multitask pipeline

c007917
