
[QUESTION] Using LSTMs with vectorized envs without ParallelEnv #1493

Open
@ADebor

Description


Hi there,

As mentioned here, I'm trying to use torchrl with NVIDIA Orbit to train an agent in parallel robot environments. I drew inspiration from your recent IsaacGymEnv class to create a simple OrbitEnv class that inherits directly from GymEnv, since Orbit environments are registered in gym (this made sense to me, but I may be wrong). I can create a torchrl environment and add transforms, but I run into trouble when trying to use ("parallel") RNNs and therefore hidden states.

Since Orbit environments are vectorized, I create a single torchrl environment wrapping the Orbit one and set its batch_size to the number of environments in the Orbit vectorized env. Because I want to use LSTMs, I add the make_tensordict_primer() transform to my environment. If I then try to reset the environment, I get the following error:

RuntimeError: batch dimension mismatch, got self.batch_size=torch.Size([2]) and value.shape[:self.batch_dims]=torch.Size([1]) with value tensor([[0., 0., 0., 0., 0.]])

where 2 is the number of parallel environments and the value tensor is a hidden state (of length 5). I don't fully understand the problem, but it looks like the batch size is not taken into account for the hidden states. With a basic gym environment, I wrap my environment in a ParallelEnv instance and have no issue. In this case, however, ParallelEnv causes a problem: as I understand it, torchrl tries to create multiple separate parallel environments, which conflicts with Orbit, where all environments live in the same Isaac Sim scene.
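For reference, the failing check can be mimicked in plain Python. This is an illustrative sketch, not tensordict's actual implementation: it only shows why a default hidden state of shape (1, 5) is rejected by an environment whose batch_size is (2,).

```python
# Illustrative sketch (not tensordict's real code): a TensorDict with
# batch_size=(2,) requires every stored value's leading dims to equal (2,).
def validate_batch(batch_size, value_shape):
    """Return True if value_shape starts with batch_size."""
    batch_dims = len(batch_size)
    return tuple(value_shape[:batch_dims]) == tuple(batch_size)

# The primer's default hidden state has shape (1, 5): a single leading
# dim of 1, hidden size 5 -- but the env's batch_size is (2,), so the
# validation fails, which matches the RuntimeError above.
assert not validate_batch((2,), (1, 5))

# A hidden state whose leading dim matches the batch size would pass.
assert validate_batch((2,), (2, 5))
```

If this reading is right, the primer's spec would need to carry the env's batch dimension (hidden states shaped (2, 5) rather than (1, 5)) for reset to succeed.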

Can you see a way to properly use LSTMs in torchrl with a vectorized environment from Orbit? Don't hesitate to tell me if the question or context isn't clear. I apologize if I'm out of line here; I may be missing something important about how torchrl is supposed to be used.


Here is the complete stack trace:

Traceback (most recent call last):
  File "test.py", line 61, in <module>
    td = env.reset()
  File "/home/adebor/isaacsim_ws/rl/torchrl/envs/common.py", line 949, in reset
    tensordict_reset = self._reset(tensordict, **kwargs)
  File "/home/adebor/isaacsim_ws/rl/torchrl/envs/transforms/transforms.py", line 651, in _reset
    out_tensordict = self.transform.reset(out_tensordict)
  File "/home/adebor/isaacsim_ws/rl/torchrl/envs/transforms/transforms.py", line 889, in reset
    tensordict = t.reset(tensordict)
  File "/home/adebor/isaacsim_ws/rl/torchrl/envs/transforms/transforms.py", line 3016, in reset
    tensordict.set(key, value)
  File "/home/adebor/isaacsim_ws/tensordict/tensordict/tensordict.py", line 761, in set
    return self._set_tuple(key, item, inplace=inplace, validated=False)
  File "/home/adebor/isaacsim_ws/tensordict/tensordict/tensordict.py", line 4122, in _set_tuple
    return self._set_str(key[0], value, inplace=inplace, validated=validated)
  File "/home/adebor/isaacsim_ws/tensordict/tensordict/tensordict.py", line 4093, in _set_str
    value = self._validate_value(value, check_shape=True)
  File "/home/adebor/isaacsim_ws/tensordict/tensordict/tensordict.py", line 1732, in _validate_value
    f"batch dimension mismatch, got self.batch_size"
RuntimeError: batch dimension mismatch, got self.batch_size=torch.Size([2]) and value.shape[:self.batch_dims]=torch.Size([1]) with value tensor([[0., 0., 0., 0., 0.]])

Versions:
torchrl==0.1.1
torch==1.13.1
