Integrating hydra with DQN #201
Conversation
examples/ddpg/ddpg.py (Outdated)

    actor_model_explore = model[0]
    if args.ou_exploration:
    if args.gSDE:
    if cfg.ou_exploration:
Can we delete ou_exploration and gSDE from DQN? They make no sense here. Ideally they shouldn't even be in the config.
…v_dqn

# Conflicts:
#	examples/ddpg/ddpg.py
#	examples/dqn/dqn.py
#	examples/redq/redq.py
#	examples/sac/sac.py
@@ -10,7 +10,6 @@

    import hydra
I'm not sure where those changes in PPO come from in this PR.
torchrl/trainers/helpers/models.py (Outdated)

@@ -1351,3 +1351,5 @@ class DiscreteModelConfig:
    # whether a distributional loss should be used.
    atoms: int = 51
    # number of atoms used for the distributional loss
    gSDE: bool = False
Yeah let's not do that :) No gSDE for discrete actions
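For reference, dropping the flag leaves the discrete model config with only the distributional-loss fields visible in the diff above. A minimal sketch as a plain dataclass (the `distributional` field is an assumption inferred from the "whether a distributional loss should be used" comment, not confirmed from the actual file):

```python
from dataclasses import dataclass

# Sketch of DiscreteModelConfig without the gSDE flag.
# `atoms` and its comment come from the diff; `distributional` is assumed.
@dataclass
class DiscreteModelConfig:
    distributional: bool = False
    # whether a distributional loss should be used.
    atoms: int = 51
    # number of atoms used for the distributional loss
```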
I removed the epsilon greedy wrapper, which removed the error with action_value being None. Is this OK?
Let's see what we can do about gSDE
EGreedyWrapper should still be used
examples/dqn/dqn.py (Outdated)

@@ -103,17 +101,8 @@ def main(cfg: "DictConfig"):
    )

    loss_module, target_net_updater = make_dqn_loss(model, cfg)
    model_explore = EGreedyWrapper(model, annealing_num_steps=cfg.annealing_frames).to(
I think we still want that; epsilon-greedy is the typical exploration technique for DQN.
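For context, the wrapper referenced in the diff performs linearly-annealed epsilon-greedy action selection. A dependency-free sketch of the idea (illustrative only, not TorchRL's actual EGreedyWrapper implementation; function and parameter names are assumptions):

```python
import random

def epsilon_greedy(q_values, step, annealing_num_steps,
                   eps_start=1.0, eps_end=0.05):
    """Return (action, eps): a random action with probability eps,
    otherwise the greedy (argmax-Q) action. Epsilon is linearly
    annealed from eps_start to eps_end over annealing_num_steps."""
    frac = min(step / annealing_num_steps, 1.0)
    eps = eps_start + frac * (eps_end - eps_start)
    if random.random() < eps:
        # explore: uniform random action
        return random.randrange(len(q_values)), eps
    # exploit: index of the largest Q-value
    return max(range(len(q_values)), key=lambda a: q_values[a]), eps
```

Setting `eps_start == eps_end == 0.0` makes the policy purely greedy, which is one way to see why removing the wrapper can silently change behavior rather than just silencing the `action_value is None` error.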
@@ -124,7 +113,7 @@ def main(cfg: "DictConfig"):

    collector = make_collector_offpolicy(
        make_env=create_env_fn,
        actor_model_explore=model_explore,
Let's keep model_explore.
Integrated hydra with DQN using structured configs.
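At a high level, Hydra's structured configs are nested dataclasses handed to the entry point as one config object. A stdlib-only sketch of that composition pattern (the real example uses `@hydra.main` and receives a `DictConfig`; group and field names here are illustrative, not the exact ones in examples/dqn/dqn.py):

```python
from dataclasses import dataclass, field, asdict

# Each concern gets its own config group; names below are illustrative.
@dataclass
class ModelConfig:
    distributional: bool = False
    atoms: int = 51

@dataclass
class CollectorConfig:
    total_frames: int = 1_000_000
    annealing_frames: int = 250_000

@dataclass
class DQNConfig:
    model: ModelConfig = field(default_factory=ModelConfig)
    collector: CollectorConfig = field(default_factory=CollectorConfig)

def main(cfg: DQNConfig) -> dict:
    # With Hydra this function would be decorated with @hydra.main and
    # receive a DictConfig; here we only show the shape of the config.
    return asdict(cfg)
```

The benefit over a flat argparse namespace is that unrelated flags (e.g. continuous-control exploration options) can be kept out of the DQN config group entirely, which is the cleanup requested in the comments above.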