Add EnvPoolVectorEnv wrapper for proper envpool integration#1294
Add EnvPoolVectorEnv wrapper for proper envpool integration#1294Lidang-Jiang wants to merge 5 commits into
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a77dc3c985
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| ) -> gym_new_venv_step_type: | ||
| if action is None: | ||
| raise ValueError("action must be not-None for EnvPoolVectorEnv (non-async)") | ||
| obs, rew, terminated, truncated, info = self._env.step(action) |
There was a problem hiding this comment.
Forward selected env ids to envpool step
Collector._collect passes a subset of ready environment ids into env.step(...) during episode-based collection, but this wrapper ignores id and always calls self._env.step(action) with no env_id. EnvPool’s API expects step(action, env_id) for subset stepping, so when ready_env_ids_R is not the full [0..N-1] set this can step the wrong workers (or fail on batch-size mismatch) while the collector still writes transitions under the provided buffer_ids, corrupting per-env trajectories.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
The wrapper's step(id=...) path already uses send(action, env_id=env_id) + recv() (lines 570-573) to forward the selected env ids to envpool. In sync mode (is_async=False), recv() returns the results for exactly the envs that were send()-ed, so the trajectories should align correctly with the collector's buffer_ids.
That said, for async mode (which this wrapper doesn't support — see self.is_async = False in __init__), recv() could return results for different envs. But since async collection isn't wired through this wrapper, I believe the current implementation is correct for the supported use case.
| warnings.warn("Single environment detected, wrap to DummyVectorEnv.") | ||
| # Unfortunately, mypy seems to ignore the isinstance in lambda, maybe a bug in mypy | ||
| env = DummyVectorEnv([lambda: env]) # type: ignore | ||
| elif not isinstance(env, BaseVectorEnv) and hasattr(env, "__len__"): |
There was a problem hiding this comment.
Restrict auto-wrap to real envpool objects
This condition now wraps any non-BaseVectorEnv object that merely defines __len__ into EnvPoolVectorEnv, but that wrapper assumes envpool-specific structure (notably config["num_envs"] in __len__). As a result, custom gym/vector envs with __len__ for unrelated semantics are misclassified and can fail during collector initialization when buffer sizing calls len(env). The wrap trigger should detect envpool-specific interfaces instead of using hasattr(env, "__len__") alone.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Agreed — hasattr(env, '__len__') is too broad. Fixed in 837f1b6: replaced with _is_envpool_env() that checks for envpool-specific interfaces (config['num_envs'], send(), recv()). Applied to both Collector.__init__ and VectorEnvWrapper.__init__.
|
@Lidang-Jiang please document all changes in the change log (here and in your other PRs). |
|
Thanks for the heads-up. I've added CHANGELOG.md entries to all four PRs:
Apologies for missing the PR template checklist — will follow it going forward. |
Instead of passing raw envpool envs where BaseVectorEnv is expected (which relied on interfaces accidentally coinciding), introduce EnvPoolVectorEnv that properly adapts envpool envs to the BaseVectorEnv interface. Key changes: - New EnvPoolVectorEnv class in tianshou/env/venvs.py - Auto-wrapping in Collector and VectorEnvWrapper constructors - Remove info-format hacks from Collector - 7 new unit tests for the wrapper Closes thu-ml#1096 Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
- EnvPoolVectorEnv.step() now uses send(action, env_id)/recv() for subset stepping, fixing the hang in test_a2c_with_il where the collector passed partial actions without env_id - Replace np.isscalar(id) with isinstance(id, int) for mypy arg-type - Add type: ignore comments for intentional type-boundary crossings (envpool envs are not BaseVectorEnv but accepted at runtime) - Add send/recv to mock envpool envs in tests - Widen run_align_norm_obs param type to BaseVectorEnv Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
- Remove return statement in set_env_attr (no return value expected) - Remove unused type: ignore comment on isinstance check Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
The previous auto-wrap condition `hasattr(env, '__len__')` was too broad and could misclassify custom gym/vector envs as envpool envs. Replace with `_is_envpool_env()` that checks for envpool-specific interfaces: `config['num_envs']`, `send()`, and `recv()`. Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
837f1b6 to
6d6bdc5
Compare
Summary
Fixes #1096 — instead of passing raw envpool envs where
BaseVectorEnvis expected (which relied on interfaces accidentally coinciding), introducesEnvPoolVectorEnvthat properly adapts envpool envs.Problem
Raw envpool envs return info as a single dict with array values, while tianshou's
BaseVectorEnvreturns an array of dicts. The old code used inline hacks inCollector, which was fragile and violated the type contract.Solution
New:
EnvPoolVectorEnv(tianshou/env/venvs.py)BaseVectorEnvenv_idinto per-env info dictsseed(),close(),render(),get_env_attr(),set_env_attr()Cleanup:
Collector.reset_env()andCollector._collect()Collector.__init__andVectorEnvWrapper.__init__Before (master branch)
After (this branch)
Test plan
pytest test/base/ -x— 156 passed (7 new tests)ruff checkpassesCloses #1096