Open
Description
Hi all, I ran DDPG with the default hyperparameters on the MuJoCo Swimmer-v2 environment, but the reward converges to a very low value, only about 4 or 5, so the swimmer does not learn to swim at all. I did not change the code. I don't know what is wrong. Thank you for your help. The command I ran was:
python -m baselines.run --alg=ddpg --env=Swimmer-v2 --num_timesteps=1e6