Fixes Sb3VecEnvWrapper to clear buffer on reset (#974)
# Description

In previous versions of the SB3 environment wrapper, the episodic
reward and length buffers were not cleared when `env.reset` was called.
As a result, steps and rewards accumulated before the reset carried
over into the next episode, so the episode lengths and returns reported
in the `infos` returned by `env.step` were overestimated. This commit
clears both buffers on reset.
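
The failure mode can be reproduced with a small toy model. The sketch below is not the Isaac Lab wrapper: `ToyEpisodeTracker`, its `clear_on_reset` flag, and the `run` helper are hypothetical names introduced only for illustration, and plain Python lists stand in for the wrapper's buffers. The `episode` entry with `r` and `l` keys follows the convention Stable-Baselines3 uses for episode statistics.

```python
class ToyEpisodeTracker:
    """Hypothetical stand-in for the wrapper's episodic bookkeeping."""

    def __init__(self, num_envs: int, clear_on_reset: bool) -> None:
        self.num_envs = num_envs
        self.clear_on_reset = clear_on_reset
        # Plain lists stand in for the per-environment reward/length buffers.
        self.ep_rew_buf = [0.0] * num_envs
        self.ep_len_buf = [0] * num_envs

    def reset(self) -> None:
        # The behaviour under discussion: clear the buffers on a manual reset.
        if self.clear_on_reset:
            self.ep_rew_buf = [0.0] * self.num_envs
            self.ep_len_buf = [0] * self.num_envs

    def step(self, rewards: list[float], dones: list[bool]) -> list[dict]:
        infos: list[dict] = [{} for _ in range(self.num_envs)]
        for i in range(self.num_envs):
            self.ep_rew_buf[i] += rewards[i]
            self.ep_len_buf[i] += 1
            if dones[i]:
                # Report episode statistics, then clear this env's buffers.
                infos[i]["episode"] = {"r": self.ep_rew_buf[i], "l": self.ep_len_buf[i]}
                self.ep_rew_buf[i] = 0.0
                self.ep_len_buf[i] = 0
        return infos


def run(clear_on_reset: bool) -> dict:
    tracker = ToyEpisodeTracker(num_envs=1, clear_on_reset=clear_on_reset)
    tracker.reset()
    # Two steps of an episode that is abandoned by a manual reset.
    tracker.step([1.0], [False])
    tracker.step([1.0], [False])
    tracker.reset()
    # A fresh three-step episode that terminates normally.
    tracker.step([1.0], [False])
    tracker.step([1.0], [False])
    infos = tracker.step([1.0], [True])
    return infos[0]["episode"]


stale = run(clear_on_reset=False)
fixed = run(clear_on_reset=True)
print(stale)  # {'r': 5.0, 'l': 5}
print(fixed)  # {'r': 3.0, 'l': 3}
```

Without clearing, the two abandoned steps are counted in the next episode, which then reports a length of 5 instead of 3.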

## Type of change

- Bug fix (non-breaking change which fixes an issue)

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there
EricJin2002 authored Sep 12, 2024
1 parent b4c9050 commit 5444fa3
Showing 2 changed files with 4 additions and 0 deletions.
1 change: 1 addition & 0 deletions CONTRIBUTORS.md
@@ -59,6 +59,7 @@ Guidelines for modifications:
* Shafeef Omar
* Vladimir Fokow
* Xavier Nal
* Yang Jin
* Zhengyu Zhang
* Ziqi Fan
* Qian Wan
@@ -205,6 +205,9 @@ def seed(self, seed: int | None = None) -> list[int | None]: # noqa: D102

    def reset(self) -> VecEnvObs:  # noqa: D102
        obs_dict, _ = self.env.reset()
        # reset episodic information buffers
        self._ep_rew_buf.zero_()
        self._ep_len_buf.zero_()
        # convert data types to numpy depending on backend
        return self._process_obs(obs_dict)
