Closed
Description
I trained DDPG with the command given in the README: "python -m baselines.run --alg=ddpg --env=HalfCheetah-v2 --num_timesteps=1e6".
After 1000000 steps, the reward is still negative.
I also tried other environments, such as "Hopper", and none of them gave the expected result.
I am using the latest code on the master branch, and my tensorflow-gpu version is 1.8.0.
Has anyone trained DDPG successfully?