AWAC doesn't profit from offline data #166

Open
@im-Kitsch

Hi,

@anair13, thanks for making the code available. Since you seem to answer AWAC questions frequently, I'm tagging you directly.

In the AWAC paper, the main benefit is that there is no "dip" in performance when switching from offline training to online training. But when I run it on the MuJoCo Gym environments, it does not benefit from pre-training on the offline dataset.

  • HalfCheetah: it learns nothing; the episode returns are almost always below zero.
  • Ant: it reaches near-expert performance after switching from offline to online training, but it has a huge dip to nearly zero.
  • Walker2d: it also has a dip.
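
To make "dip" concrete, here is a minimal sketch of how it could be measured from a logged return curve. The names `measure_dip`, `switch_idx`, and `window` are hypothetical and not part of the repo. The sketch assumes one average return per evaluation epoch and that the epoch index at which online training starts is known.

```python
import numpy as np


def measure_dip(returns, switch_idx, window=5):
    """Compare the mean return just before the offline-to-online switch
    with the worst moving-average return after it.

    returns    : sequence with one average return per evaluation epoch (assumed format)
    switch_idx : index of the first online epoch (assumed to be known)
    window     : number of epochs to average over
    """
    returns = np.asarray(returns, dtype=float)
    if not (window <= switch_idx <= len(returns) - window):
        raise ValueError("need at least `window` epochs on each side of the switch")

    # Mean return over the last `window` offline epochs.
    before = returns[switch_idx - window:switch_idx].mean()

    # Moving average over the online phase; its minimum is the worst stretch.
    kernel = np.ones(window) / window
    smoothed = np.convolve(returns[switch_idx:], kernel, mode="valid")
    worst = smoothed.min()

    # A positive dip means performance dropped after the switch.
    return before, worst, before - worst
```

A dip close to the pre-switch return, as I see on Ant, would mean the policy loses almost everything it learned offline before recovering.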

I ran the code in the repo, examples/awac/mujoco/awac1.py, with all default settings, and pre-training on offline data does not seem to help in these experiments. I also found this link in the issues (https://drive.google.com/file/d/1Qy5SYIGNwdeTHAGNjbRfuP5pSiRw8JzJ/view); in that file, the learning process does not seem to profit much from offline learning either.

Do I have to change any hyperparameters? It would be really nice if I could reproduce the paper's results.

Looking forward to your reply.

Best.
