I reproduced the paper's locomotion results but ran into difficulties on maze2d. I would like to ask:
- Is the score given in the original paper normalized?
- The appendix gives the number of diffusion steps only for locomotion and block-stacking. Is it also 100 for maze2d?
- The paper says to use start and goal locations. Is the goal location obtained through the environment’s get_target and reset_to_location? Does the start location refer to the agent's current observation? Is the goal location placed at index horizon - 1 of the plan? With this setup, my reproduction only reached a non-normalized score of 87.
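
To make the setup above concrete, here is a minimal sketch of what I mean. It is my own illustration, not code from the paper or its repository. `normalized_score` follows the usual D4RL-style convention (0 for a random reference return, 100 for an expert reference return), and the reference returns are arguments because I am not certain of the exact maze2d values. `build_conditions` is a hypothetical helper that pins the current observation at index 0 and the goal at index `horizon - 1`. It assumes a maze2d observation is `[x, y, vx, vy]` and that the goal velocity is zero.

```python
from typing import Dict, List, Sequence


def normalized_score(raw_return: float, random_return: float, expert_return: float) -> float:
    """D4RL-style normalization: 0 maps to the random reference, 100 to the expert reference."""
    if expert_return == random_return:
        raise ValueError("expert and random reference returns must differ")
    return 100.0 * (raw_return - random_return) / (expert_return - random_return)


def build_conditions(
    start_obs: Sequence[float],
    goal_xy: Sequence[float],
    horizon: int,
) -> Dict[int, List[float]]:
    """Map plan indices to states that should be held fixed while sampling.

    Assumption: observations are [x, y, vx, vy]; the goal state uses zero velocity.
    """
    if horizon < 2:
        raise ValueError("horizon must be at least 2 to hold both a start and a goal")
    goal_state = [float(goal_xy[0]), float(goal_xy[1]), 0.0, 0.0]
    return {
        0: [float(v) for v in start_obs],  # first step: current observation
        horizon - 1: goal_state,           # last step: goal location
    }


if __name__ == "__main__":
    # Placeholder reference returns, NOT the real maze2d values.
    print(normalized_score(50.0, random_return=0.0, expert_return=200.0))
    # Placeholder horizon and coordinates, for illustration only.
    print(build_conditions([1.0, 1.0, 0.0, 0.0], goal_xy=(3.0, 2.0), horizon=128))
```

If the paper's maze2d numbers are normalized this way, my raw 87 would need to be converted with the environment's actual reference returns before comparing.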