
[RLlib; DreamerV3] How to evaluate the model during training. #48533

Open
liqiangsz opened this issue Nov 4, 2024 · 2 comments
Assignees
Labels
docs An issue or change related to documentation P2 Important issue, but not time-critical rllib RLlib related issues rllib-algorithms An RLlib algorithm/Trainer is not learning. rllib-evaluation Bug affecting policy evaluation with RLlib.

Comments

@liqiangsz

liqiangsz commented Nov 4, 2024

I can obtain the episode reward mean from the training results, but it fluctuates heavily, which makes it difficult to judge when to stop training. I would therefore like to use evaluation results instead.
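One stopgap that is independent of RLlib's evaluation machinery is to smooth the noisy per-iteration `episode_return_mean` with a moving average before using it as a stopping signal. The sketch below is a generic helper, not an RLlib API; the window size is an arbitrary assumption:

```python
from collections import deque

def smoothed(window: int):
    """Return a closure that yields the moving average of the last
    `window` values pushed into it. Useful for damping large
    per-iteration fluctuations in a training metric before using it
    as a stopping criterion."""
    buf = deque(maxlen=window)

    def push(value: float) -> float:
        buf.append(value)
        return sum(buf) / len(buf)

    return push

# Usage: feed in the raw per-iteration return and act on the average.
avg = smoothed(window=3)
print(avg(1.0))  # 1.0
print(avg(3.0))  # 2.0
print(avg(5.0))  # 3.0
```

This only masks the variance; it does not replace a proper evaluation run, but it can make the stopping decision less erratic while the evaluation path is broken.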

I tried two methods, but both failed (Ray 2.38.0).
Method 1 uses the evaluation config.
**Error message (same as #47527?):**

  File "/home/leeg/anaconda3/envs/tsc_rl_lp_dreamer/lib/python3.10/site-packages/ray/rllib/algorithms/dreamerv3/utils/env_runner.py", line 555, in set_state
    self.module.set_state(state[COMPONENT_RL_MODULE][DEFAULT_MODULE_ID])
KeyError: 'rl_module'

**Code:**

    alg_config = (
        DreamerV3Config()
        # Set the env to the pre-registered string.
        .environment("tsc_dreamer")
        .env_runners(
            num_env_runners=22,
            sample_timeout_s=600,
            rollout_fragment_length=144,
        )
        # Play around with the insanely high number of hyperparameters for DreamerV3 ;)
        .training(
            model_size=args.model_size,
            training_ratio=args.training_ratio,
            batch_size_B=16,
        )
        .learners(
            num_learners=0,
            num_gpus_per_learner=1,
        )
        .api_stack(
            enable_rl_module_and_learner=True,
            enable_env_runner_and_connector_v2=True,
        )
        .evaluation(
            evaluation_num_env_runners=0,
            evaluation_interval=5,
            evaluation_parallel_to_training=False,
            evaluation_sample_timeout_s=600,
            evaluation_duration=22,
        )
    )

Method 2 uses the evaluate function.
**Error message:**

  File "/home/leeg/anaconda3/envs/tsc_rl_lp_dreamer/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 1426, in _env_runner_remote
    num_episodes=(num[worker.worker_index] if unit == "episodes" else None),
AttributeError: 'DreamerV3EnvRunner' object has no attribute 'worker_index'

**Code:**

    for i in range(cfg['training']['num_epoch']):
        result_train = algo.train()  # Train for one iteration.

        if 'env_runners' in result_train and 'episode_return_mean' in result_train['env_runners']:
            episode_reward_mean = result_train['env_runners']['episode_return_mean']
            logger.info("Iter: %d, episode_reward_mean=%.3f" % (i + 1, episode_reward_mean))

            result_eval = algo.evaluate()
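As an aside, the exact key layout of RLlib's result dicts varies between versions and API stacks, so chained `in` checks like the ones above get verbose quickly. A defensive nested lookup (a hypothetical helper, not part of RLlib) keeps the loop body short and avoids `KeyError`s:

```python
def get_nested(d, *keys, default=None):
    """Walk a nested dict defensively, returning `default` as soon
    as any key along the path is missing."""
    for k in keys:
        if not isinstance(d, dict) or k not in d:
            return default
        d = d[k]
    return d

# Usage with a result-dict-shaped example (keys assumed from the
# new API stack's training results):
result = {"env_runners": {"episode_return_mean": 12.5}}
print(get_nested(result, "env_runners", "episode_return_mean"))  # 12.5
print(get_nested(result, "evaluation", "env_runners",
                 "episode_return_mean", default=None))  # None
```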
@liqiangsz liqiangsz added docs An issue or change related to documentation triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Nov 4, 2024
@liqiangsz liqiangsz changed the title [<Ray component: Core|RLlib|etc...>] [DreamerV3] How to evaluate the model during training. [<Ray component: DreamerV3>] How to evaluate the model during training. Nov 4, 2024
@jcotant1
Member

jcotant1 commented Nov 4, 2024

Hi @liqiangsz, can you please share a bit more info on your issue? This will help ensure we triage it to the correct team. Thanks!

@jcotant1 jcotant1 added the rllib RLlib related issues label Nov 4, 2024
@liqiangsz
Author

Thank you for your prompt feedback. I have updated my issue description.

@sven1977 sven1977 changed the title [<Ray component: DreamerV3>] How to evaluate the model during training. [RLlib; DreamerV3] How to evaluate the model during training. Dec 5, 2024
@simonsays1980 simonsays1980 added rllib-algorithms An RLlib algorithm/Trainer is not learning. rllib-evaluation Bug affecting policy evaluation with RLlib. P2 Important issue, but not time-critical and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Dec 5, 2024