feature(pu): unizero and muzero multitask ddp pipeline #300

Open
wants to merge 135 commits into
base: main

puyuan1996
Collaborator

@puyuan1996 puyuan1996 commented Nov 29, 2024

  • Add UniZero and MuZero multitask DDP pipeline

puyuan1996 and others added 30 commits July 5, 2024 16:32
@puyuan1996 changed the title from "TMP: feature(pu): unizero and muzero multitask ddp pipeline" to "feature(pu): unizero and muzero multitask ddp pipeline" on Dec 18, 2024
@puyuan1996 added the enhancement, efficiency optimization, config, and research labels on Dec 18, 2024
@@ -156,6 +158,7 @@ def step(self, action: int) -> BaseEnvTimestep:
observation = self.observe()
if done:
    info['eval_episode_return'] = self._eval_episode_return
    print(f'one episode of {self.cfg.env_id} done')
Member

Use logging instead of print here, or remove this line.
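
A minimal sketch of the logging option, assuming a module-level logger from the standard library. The helper name `log_episode_done` and its parameters are hypothetical and are not part of the PR; in the reviewed code the call would sit inside `step()` under `if done:`.

```python
import logging

# Module-level logger; handlers and levels are configured by the application.
logger = logging.getLogger(__name__)


def log_episode_done(env_id: str, episode_return: float) -> None:
    """Hypothetical helper: report a finished episode through logging."""
    # %-style arguments defer string formatting until the record is emitted.
    logger.info("one episode of %s done (return=%s)", env_id, episode_return)
```

With this shape, the message can be silenced or redirected per process through the logging configuration, which matters when many DDP ranks run the same environment code.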


configs = generate_configs(env_id_list, action_space_size, collector_env_num, n_episode, evaluator_env_num, num_simulations, reanalyze_ratio, batch_size, num_unroll_steps, infer_context_length, norm_type, seed, buffer_reanalyze_freq, reanalyze_batch_size, reanalyze_partition, num_segments, total_batch_size)

pretrained_model_path = '/mnt/afs/niuyazhe/code/LightZero/data_unizero_mt_ddp-8gpu_1127/8games_brf0.02_nlayer8-nhead24_seed1/8games_brf0.02_1-encoder-LN-res2-channel256_gsl20_8-pred-head_lsd768-nlayer8-nh24_mbs-512-bs64_upc80_seed1/Pong_unizero-mt_seed1/ckpt/iteration_200000.pth.tar'
Member

Remove the real file path; do not commit a machine-specific checkpoint location.
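
One possible way to address this, sketched under assumptions: read the checkpoint path from a command-line flag, falling back to an environment variable. The flag `--pretrained-model-path`, the variable `PRETRAINED_MODEL_PATH`, and the helper name are all hypothetical and do not come from the PR.

```python
import argparse
import os
from typing import Optional, Sequence


def resolve_pretrained_model_path(argv: Optional[Sequence[str]] = None) -> Optional[str]:
    """Hypothetical helper: take the checkpoint path from the CLI or environment."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--pretrained-model-path",
        # Assumed environment variable name; None means "no checkpoint given".
        default=os.environ.get("PRETRAINED_MODEL_PATH"),
        help="Path to a checkpoint; omit to start without one.",
    )
    # parse_known_args ignores unrelated flags such as those added by a launcher.
    args, _ = parser.parse_known_args(argv)
    return args.pretrained_model_path
```

The config script would then assign `pretrained_model_path = resolve_pretrained_model_path()` in place of the literal string.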

from ding.utils import DDPContext
from easydict import EasyDict

env_id_list = ['PongNoFrameskip-v4'] # Debug setup
Member

Remove the debug settings.

"""
self.task_id = task_id
Member

Use self._task_id (a private attribute) instead.
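
A minimal sketch of the naming convention this comment points at, assuming callers still need read access to the task id. The class `TaskScopedComponent` is a hypothetical stand-in for the class under review, and the read-only property is an optional addition that the reviewer did not ask for.

```python
class TaskScopedComponent:
    """Hypothetical stand-in for the class that stores the task id."""

    def __init__(self, task_id: int) -> None:
        # Leading underscore marks the attribute as internal by convention.
        self._task_id = task_id

    @property
    def task_id(self) -> int:
        """Read-only public view of the task id."""
        return self._task_id
```

Because the property has no setter, code outside the class can read `task_id` but an accidental assignment raises `AttributeError`.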

"""
try:
    print(f"=========Evaluation start Rank {rank}/{world_size}===========")
    # Reset stop_event so that it is in the unset state before each evaluation
Member

Please write comments and log messages in English.

Labels
config (New or improved configuration), efficiency optimization (Efficiency optimization: time, memory and so on), enhancement (New feature or request), research (Research work in progress)
4 participants