
[RLlib; DreamerV3] How to evaluate the model during training. #48533

Open
liqiangsz opened this issue Nov 4, 2024 · 2 comments
Assignees
Labels
docs An issue or change related to documentation P2 Important issue, but not time-critical rllib RLlib related issues rllib-algorithms An RLlib algorithm/Trainer is not learning. rllib-evaluation Bug affecting policy evaluation with RLlib.

Comments

@liqiangsz

liqiangsz commented Nov 4, 2024

I can obtain the episode reward mean from the training results, but it fluctuates heavily, which makes it difficult to judge when to stop training. I would therefore like to use evaluation results instead.
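One stopgap that is independent of RLlib's evaluation machinery is to smooth the noisy per-iteration `episode_return_mean` with a moving average before using it as a stopping signal. The sketch below is a generic helper, not an RLlib API; the window size is an arbitrary assumption:

```python
from collections import deque

def smoothed(window: int):
    """Return a closure that yields the moving average of the last
    `window` values pushed into it. Useful for damping large
    per-iteration fluctuations in a training metric before using it
    as a stopping criterion."""
    buf = deque(maxlen=window)

    def push(value: float) -> float:
        buf.append(value)
        return sum(buf) / len(buf)

    return push

# Usage: feed in the raw per-iteration return and act on the average.
avg = smoothed(window=3)
print(avg(1.0))  # 1.0
print(avg(3.0))  # 2.0
print(avg(5.0))  # 3.0
```

This only masks the variance; it does not replace a proper evaluation run, but it can make the stopping decision less erratic while the evaluation path is broken.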

I tried two methods, but both failed (Ray 2.38.0).
Method 1 uses the evaluation config.
**Error message (same as #47527?):**

  File "/home/leeg/anaconda3/envs/tsc_rl_lp_dreamer/lib/python3.10/site-packages/ray/rllib/algorithms/dreamerv3/utils/env_runner.py", line 555, in set_state
    self.module.set_state(state[COMPONENT_RL_MODULE][DEFAULT_MODULE_ID])
KeyError: 'rl_module'

**Code:**

    alg_config = (
        DreamerV3Config()
        # Set the env to the pre-registered string.
        .environment("tsc_dreamer")
        .env_runners(
            num_env_runners=22,
            sample_timeout_s=600,
            rollout_fragment_length=144,
        )
        # Play around with the insanely high number of hyperparameters for DreamerV3 ;)
        .training(
            model_size=args.model_size,
            training_ratio=args.training_ratio,
            batch_size_B=16,
        )
        .learners(
            num_learners=0,
            num_gpus_per_learner=1,
        )
        .api_stack(
            enable_rl_module_and_learner=True,
            enable_env_runner_and_connector_v2=True,
        )
        .evaluation(
            evaluation_num_env_runners=0,
            evaluation_interval=5,
            evaluation_parallel_to_training=False,
            evaluation_sample_timeout_s=600,
            evaluation_duration=22,
        )
    )

Method 2 uses the evaluate function.
**Error message:**

  File "/home/leeg/anaconda3/envs/tsc_rl_lp_dreamer/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 1426, in _env_runner_remote
    num_episodes=(num[worker.worker_index] if unit == "episodes" else None),
AttributeError: 'DreamerV3EnvRunner' object has no attribute 'worker_index'

**Code:**

    for i in range(cfg['training']['num_epoch']):
        result_train = algo.train()  # Train for one iteration.

        if 'env_runners' in result_train and 'episode_return_mean' in result_train['env_runners']:
            episode_reward_mean = result_train['env_runners']['episode_return_mean']
            logger.info("Iter: %d, episode_reward_mean=%.3f" % (i + 1, episode_reward_mean))

            result_eval = algo.evaluate()
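As an aside, the exact key layout of RLlib's result dicts varies between versions and API stacks, so chained `in` checks like the ones above get verbose quickly. A defensive nested lookup (a hypothetical helper, not part of RLlib) keeps the loop body short and avoids `KeyError`s:

```python
def get_nested(d, *keys, default=None):
    """Walk a nested dict defensively, returning `default` as soon
    as any key along the path is missing."""
    for k in keys:
        if not isinstance(d, dict) or k not in d:
            return default
        d = d[k]
    return d

# Usage with a result-dict-shaped example (keys assumed from the
# new API stack's training results):
result = {"env_runners": {"episode_return_mean": 12.5}}
print(get_nested(result, "env_runners", "episode_return_mean"))  # 12.5
print(get_nested(result, "evaluation", "env_runners",
                 "episode_return_mean", default=None))  # None
```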
@liqiangsz liqiangsz added docs An issue or change related to documentation triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Nov 4, 2024
@liqiangsz liqiangsz changed the title [<Ray component: Core|RLlib|etc...>] [DreamerV3] How to evaluate the model during training. [<Ray component: DreamerV3>] How to evaluate the model during training. Nov 4, 2024
@jcotant1
Member

jcotant1 commented Nov 4, 2024

Hi @liqiangsz, can you please share a bit more info on your issue? This will help ensure we triage it to the correct team. Thanks!

@jcotant1 jcotant1 added the rllib RLlib related issues label Nov 4, 2024
@liqiangsz
Author

Thank you for your prompt feedback. I have updated my issue description.

@sven1977 sven1977 changed the title [<Ray component: DreamerV3>] How to evaluate the model during training. [RLlib; DreamerV3] How to evaluate the model during training. Dec 5, 2024
@simonsays1980 simonsays1980 added rllib-algorithms An RLlib algorithm/Trainer is not learning. rllib-evaluation Bug affecting policy evaluation with RLlib. P2 Important issue, but not time-critical and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Dec 5, 2024