
I couldn't get good results for GAIL in any environment except HalfCheetah. #204

Open
slee01 opened this issue Aug 30, 2019 · 3 comments

@slee01

slee01 commented Aug 30, 2019

Hi, first of all, thank you for sharing your code.

I've been trying to implement GAIL using the expert demonstrations from your Google Drive. I used the hyper-parameters from gail_experts/readme and got a good result on HalfCheetah, but worse results than I expected on the other environments, such as Hopper, Ant, and Walker2d (I couldn't test Reacher; I suspect the expert data, which is only 240 KB, has some problem). I tried again with different hyper-parameters, including the seed, but unfortunately still got the same results. Could you share the parameters you used for the environments where I failed? It would help the comparison tests for my research a lot.

@ikostrikov
Owner

For the moment, the easiest way to fix the problem is to change the reward function and turn normalization off:
https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail/blob/master/a2c_ppo_acktr/algo/gail.py#L98

See the comments here:
https://github.com/openai/imitation/blob/99fbccf3e060b6e6c739bdf209758620fcdefd3c/policyopt/imitation.py#L146

You need to use this reward specifically:

rewards_B = -tensor.log(1.-tensor.nnet.sigmoid(scores_B))
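For reference, a minimal PyTorch translation of that line (a sketch only; the name gail_reward and the argument d_logits are illustrative, not the repository's exact API). Note that -log(1 - sigmoid(d)) is mathematically the same as softplus(d):

import torch.nn.functional as F

def gail_reward(d_logits):
    # Suggested reward: r = -log(1 - sigmoid(d)) == softplus(d),
    # where d_logits are the raw (pre-sigmoid) discriminator scores
    # for the policy's state-action pairs. The reward is always
    # positive, so it also acts as a survival bonus.
    return F.softplus(d_logits)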

@slee01
Author

slee01 commented Sep 2, 2019

This was very helpful to me.

I figured out that the standard deviation of the reward from the discriminator is much higher than that of the reward from the MuJoCo simulators.

I also understand that the reward range should differ depending on the episode-termination option.

I finally got good results after modifying the reward function.

But I'm not sure why the value network can be trained without reward normalization.

And I'm wondering whether there is a reason why you normalize the reward from the discriminator even though its standard deviation is so high.

I think clipping is more appropriate than normalization for the discriminator reward.
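
To make what I mean concrete, here is a rough sketch of the clipping option (purely illustrative; the function name, the 1e-8 epsilon, and the clip range are placeholders I picked, not values from your code):

import torch

def clipped_gail_reward(d_logits, max_reward=10.0):
    # d_logits: raw discriminator scores for the policy's samples.
    raw_reward = -torch.log(1.0 - torch.sigmoid(d_logits) + 1e-8)
    # Instead of dividing by a running statistic (which is noisy when
    # the discriminator reward has a very large spread), bound the
    # scale with a fixed clip:
    return torch.clamp(raw_reward, min=0.0, max=max_reward)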

Could you comment on these questions, please?

Thanks!

@wang88256187

Hi, I've run into a similar problem: my GAIL results are always bad. Could you share your experience with this problem in more detail? Thank you very much!
