
Commit 3cc6253

Author: Vincent Moens
Commit message: Update
[ghstack-poisoned]
2 parents 7827a41 + 13238ad, commit 3cc6253


64 files changed: +567 / -224 lines

.github/unittest/linux/scripts/environment.yml

Lines changed: 2 additions & 2 deletions
@@ -24,8 +24,8 @@ dependencies:
 - tensorboard
 - imageio==2.26.0
 - wandb
-- dm_control<1.0.21
-- mujoco<3.2.1
+- dm_control
+- mujoco
 - mlflow
 - av
 - coverage

.github/unittest/linux/scripts/run_all.sh

Lines changed: 1 addition & 1 deletion
@@ -91,7 +91,7 @@ echo "installing gymnasium"
 pip3 install "gymnasium"
 pip3 install ale_py
 pip3 install mo-gymnasium[mujoco] # requires here bc needs mujoco-py
-pip3 install "mujoco<3.2.1" -U
+pip3 install "mujoco" -U

 # sanity check: remove?
 python3 -c """

.github/unittest/linux_distributed/scripts/environment.yml

Lines changed: 2 additions & 2 deletions
@@ -23,8 +23,8 @@ dependencies:
 - tensorboard
 - imageio==2.26.0
 - wandb
-- dm_control<1.0.21
-- mujoco<3.2.1
+- dm_control
+- mujoco
 - mlflow
 - av
 - coverage

.github/unittest/linux_examples/scripts/environment.yml

Lines changed: 2 additions & 2 deletions
@@ -21,8 +21,8 @@ dependencies:
 - scipy
 - hydra-core
 - imageio==2.26.0
-- dm_control<1.0.21
-- mujoco<3.2.1
+- dm_control
+- mujoco
 - mlflow
 - av
 - coverage

.github/unittest/linux_libs/scripts_envpool/environment.yml

Lines changed: 2 additions & 2 deletions
@@ -18,6 +18,6 @@ dependencies:
 - expecttest
 - pyyaml
 - scipy
-- dm_control<1.0.21
-- mujoco<3.2.1
+- dm_control
+- mujoco
 - coverage

.github/unittest/linux_olddeps/scripts_gym_0_13/environment.yml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ dependencies:
 - scipy
 - hydra-core
 - dm_control -e git+https://github.com/deepmind/dm_control.git@c053360edea6170acfd9c8f65446703307d9d352#egg={dm_control}
-- mujoco<3.2.1
+- mujoco
 - patchelf
 - pyopengl==3.1.4
 - ray

.github/workflows/benchmarks.yml

Lines changed: 2 additions & 2 deletions
@@ -35,7 +35,7 @@ jobs:
       python3 setup.py develop
       python3 -m pip install pytest pytest-benchmark
       python3 -m pip install "gym[accept-rom-license,atari]"
-      python3 -m pip install "dm_control<1.0.21" "mujoco<3.2.1"
+      python3 -m pip install "dm_control" "mujoco"
       export TD_GET_DEFAULTS_TO_NONE=1
     - name: Run benchmarks
       run: |
@@ -97,7 +97,7 @@ jobs:
       python3 setup.py develop
       python3 -m pip install pytest pytest-benchmark
       python3 -m pip install "gym[accept-rom-license,atari]"
-      python3 -m pip install "dm_control<1.0.21" "mujoco<3.2.1"
+      python3 -m pip install "dm_control" "mujoco"
       export TD_GET_DEFAULTS_TO_NONE=1
     - name: check GPU presence
       run: |

.github/workflows/benchmarks_pr.yml

Lines changed: 2 additions & 2 deletions
@@ -34,7 +34,7 @@ jobs:
       python3 setup.py develop
       python3 -m pip install pytest pytest-benchmark
       python3 -m pip install "gym[accept-rom-license,atari]"
-      python3 -m pip install "dm_control<1.0.21" "mujoco<3.2.1"
+      python3 -m pip install "dm_control" "mujoco"
       export TD_GET_DEFAULTS_TO_NONE=1
     - name: Setup benchmarks
      run: |
@@ -108,7 +108,7 @@ jobs:
       python3 setup.py develop
       python3 -m pip install pytest pytest-benchmark
       python3 -m pip install "gym[accept-rom-license,atari]"
-      python3 -m pip install "dm_control<1.0.21" "mujoco<3.2.1"
+      python3 -m pip install "dm_control" "mujoco"
       export TD_GET_DEFAULTS_TO_NONE=1
     - name: check GPU presence
       run: |

.github/workflows/wheels-legacy.yml

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ on:
   push:
     branches:
       - release/*
+      - main

 concurrency:
   # Documentation suggests ${{ github.head_ref }}, but that's only available on pull_request/pull_request_target triggers, so using ${{ github.ref }}.

docs/requirements.txt

Lines changed: 2 additions & 2 deletions
@@ -14,8 +14,8 @@ docutils
 sphinx_design

 torchvision
-dm_control<1.0.21
-mujoco<3.2.1
+dm_control
+mujoco
 atari-py
 ale-py
 gym[classic_control,accept-rom-license]

docs/source/reference/collectors.rst

Lines changed: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ worker) may also impact the memory management. The key parameters to control are
 :obj:`devices` which controls the execution devices (ie the device of the policy)
 and :obj:`storing_device` which will control the device where the environment and
 data are stored during a rollout. A good heuristic is usually to use the same device
-for storage and compute, which is the default behaviour when only the `devices` argument
+for storage and compute, which is the default behavior when only the `devices` argument
 is being passed.

 Besides those compute parameters, users may choose to configure the following parameters:
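To make the devices/storing_device split above concrete, a minimal sketch, assuming a TorchRL version where SyncDataCollector accepts singular device/storing_device arguments (the Pendulum env and LazyLinear policy are placeholders, and a CUDA device is assumed available):

import torch
from tensordict.nn import TensorDictModule
from torchrl.collectors import SyncDataCollector
from torchrl.envs import GymEnv

# Placeholder policy: reads "observation", writes "action".
policy = TensorDictModule(
    torch.nn.LazyLinear(1), in_keys=["observation"], out_keys=["action"]
)

collector = SyncDataCollector(
    lambda: GymEnv("Pendulum-v1"),
    policy,
    frames_per_batch=64,
    total_frames=256,
    device="cuda:0",       # execution device (policy and env steps)
    storing_device="cpu",  # device holding the rollout data
)
for data in collector:
    assert data.device == torch.device("cpu")  # stored on CPU as requested
    break
collector.shutdown()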

docs/source/reference/data.rst

Lines changed: 1 addition & 1 deletion
@@ -171,7 +171,7 @@ using the following components:
 Storage choice is very influential on replay buffer sampling latency, especially
 in distributed reinforcement learning settings with larger data volumes.
 :class:`~torchrl.data.replay_buffers.storages.LazyMemmapStorage` is highly
-advised in distributed settings with shared storage due to the lower serialisation
+advised in distributed settings with shared storage due to the lower serialization
 cost of MemoryMappedTensors as well as the ability to specify file storage locations
 for improved node failure recovery.
 The following mean sampling latency improvements over using :class:`~torchrl.data.replay_buffers.ListStorage`
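As a hedged sketch of the storage advice in this hunk (the scratch_dir path, sizes, and tensordict keys are arbitrary):

import torch
from tensordict import TensorDict
from torchrl.data import TensorDictReplayBuffer
from torchrl.data.replay_buffers.storages import LazyMemmapStorage

# Tensors live in memory-mapped files on disk, so processes sharing the
# filesystem avoid per-item serialization; files also survive a node restart.
buffer = TensorDictReplayBuffer(
    storage=LazyMemmapStorage(max_size=100_000, scratch_dir="/tmp/rb"),
    batch_size=32,
)

data = TensorDict(
    {"obs": torch.randn(128, 4), "reward": torch.randn(128, 1)}, batch_size=[128]
)
buffer.extend(data)       # writes into the memmap files
sample = buffer.sample()  # 32 transitions read back from disk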

docs/source/reference/envs.rst

Lines changed: 5 additions & 5 deletions
@@ -318,7 +318,7 @@ have on an environment returning zeros after reset:

 We also offer the :class:`~.SerialEnv` class that enjoys the exact same API but is executed
 serially. This is mostly useful for testing purposes, when one wants to assess the
-behaviour of a :class:`~.ParallelEnv` without launching the subprocesses.
+behavior of a :class:`~.ParallelEnv` without launching the subprocesses.

 In addition to :class:`~.ParallelEnv`, which offers process-based parallelism, we also provide a way to create
 multithreaded environments with :obj:`~.MultiThreadedEnv`. This class uses `EnvPool <https://github.com/sail-sg/envpool>`_
@@ -499,7 +499,7 @@ current episode.
 To handle these cases, torchrl provides a :class:`~torchrl.envs.AutoResetTransform` that will copy the observations
 that result from the call to `step` to the next `reset` and skip the calls to `reset` during rollouts (in both
 :meth:`~torchrl.envs.EnvBase.rollout` and :class:`~torchrl.collectors.SyncDataCollector` iterations).
-This transform class also provides a fine-grained control over the behaviour to be adopted for the invalid observations,
+This transform class also provides a fine-grained control over the behavior to be adopted for the invalid observations,
 which can be masked with `"nan"` or any other values, or not masked at all.

 To tell torchrl that an environment is auto-resetting, it is sufficient to provide an ``auto_reset`` argument
@@ -755,10 +755,10 @@ registered buffers:
 >>> TransformedEnv(base_env, third_transform.clone()) # works

 On a single process or if the buffers are placed in shared memory, this will
-result in all the clone transforms to keep the same behaviour even if the
+result in all the clone transforms to keep the same behavior even if the
 buffers are changed in place (which is what will happen with the :class:`CatFrames`
 transform, for instance). In distributed settings, this may not hold and one
-should be careful about the expected behaviour of the cloned transforms in this
+should be careful about the expected behavior of the cloned transforms in this
 context.
 Finally, notice that indexing multiple transforms from a :class:`Compose` transform
 may also result in loss of parenthood for these transforms: the reason is that
@@ -1061,7 +1061,7 @@ the current gym backend or any of its modules:
 Another tool that comes in handy with gym and other external dependencies is
 the :class:`torchrl._utils.implement_for` class. Decorating a function
 with ``@implement_for`` will tell torchrl that, depending on the version
-indicated, a specific behaviour is to be expected. This allows us to easily
+indicated, a specific behavior is to be expected. This allows us to easily
 support multiple versions of gym without requiring any effort from the user side.
 For example, considering that our virtual environment has the v0.26.2 installed,
 the following function will return ``1`` when queried:
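To make the ``@implement_for`` paragraph concrete, a small sketch of the dispatch pattern it describes; the function name is made up and the version bounds are illustrative:

from torchrl._utils import implement_for

@implement_for("gym", "0.26", None)  # picked when gym >= 0.26 is installed
def gym_version_marker():
    return 1

@implement_for("gym", None, "0.26")  # picked when gym < 0.26 is installed
def gym_version_marker():  # noqa: F811
    return 2

print(gym_version_marker())  # -> 1 with gym v0.26.2, matching the doc text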

docs/source/reference/modules.rst

Lines changed: 1 addition & 1 deletion
@@ -62,7 +62,7 @@ Exploration wrappers

 To efficiently explore the environment, TorchRL proposes a series of wrappers
 that will override the action sampled by the policy by a noisier version.
-Their behaviour is controlled by :func:`~torchrl.envs.utils.exploration_mode`:
+Their behavior is controlled by :func:`~torchrl.envs.utils.exploration_mode`:
 if the exploration is set to ``"random"``, the exploration is active. In all
 other cases, the action written in the tensordict is simply the network output.

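A tiny sketch of the switch these wrappers consult, using the exploration_mode/set_exploration_mode pair referenced in the hunk (later releases rename this machinery to exploration_type, so treat the exact names as version-dependent):

from torchrl.envs.utils import exploration_mode, set_exploration_mode

print(exploration_mode())      # current global mode
with set_exploration_mode("random"):
    # inside this block, exploration wrappers inject noise into actions
    print(exploration_mode())  # -> "random"
# outside the block, wrapped policies write the plain network output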

examples/distributed/replay_buffers/distributed_replay_buffer.py

Lines changed: 2 additions & 2 deletions
@@ -150,8 +150,8 @@ def _create_and_launch_data_collectors(self) -> None:

 class ReplayBufferNode(RemoteTensorDictReplayBuffer):
     """Experience replay buffer node that is capable of accepting remote connections. Being a `RemoteTensorDictReplayBuffer`
-    means all of it's public methods are remotely invokable using `torch.rpc`.
-    Using a LazyMemmapStorage is highly advised in distributed settings with shared storage due to the lower serialisation
+    means all of its public methods are remotely invokable using `torch.rpc`.
+    Using a LazyMemmapStorage is highly advised in distributed settings with shared storage due to the lower serialization
     cost of MemoryMappedTensors as well as the ability to specify file storage locations which can improve ability to recover from node failures.

     Args:

setup.py

Lines changed: 1 addition & 1 deletion
@@ -191,7 +191,7 @@ def _main(argv):
     # tag = _run_cmd(["git", "describe", "--tags", "--exact-match", "@"])

     this_directory = Path(__file__).parent
-    long_description = (this_directory / "README.md").read_text()
+    long_description = (this_directory / "README.md").read_text(encoding="utf8")
     sys.argv = [sys.argv[0]] + unknown

     extra_requires = {
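The read_text change above matters because pathlib otherwise decodes with the platform's locale encoding, so non-ASCII characters in README.md can raise UnicodeDecodeError, typically on Windows. The safe pattern, sketched:

from pathlib import Path

# An explicit encoding makes the read deterministic across platforms;
# without it, the result depends on locale.getpreferredencoding(False).
text = Path("README.md").read_text(encoding="utf8")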

sota-implementations/cql/cql_offline.py

Lines changed: 3 additions & 6 deletions
@@ -58,14 +58,14 @@ def main(cfg: "DictConfig"):  # noqa: F821
     device = "cpu"
     device = torch.device(device)

+    # Create replay buffer
+    replay_buffer = make_offline_replay_buffer(cfg.replay_buffer)
+
     # Create env
     train_env, eval_env = make_environment(
         cfg, train_num_envs=1, eval_num_envs=cfg.logger.eval_envs, logger=logger
     )

-    # Create replay buffer
-    replay_buffer = make_offline_replay_buffer(cfg.replay_buffer)
-
     # Create agent
     model = make_cql_model(cfg, train_env, eval_env, device)
     del train_env
@@ -107,9 +107,6 @@ def main(cfg: "DictConfig"):  # noqa: F821

         q_loss = q_loss + cql_loss

-        alpha_loss = loss_vals["loss_alpha"]
-        alpha_prime_loss = loss_vals["loss_alpha_prime"]
-
         # update model
         alpha_loss = loss_vals["loss_alpha"]
         alpha_prime_loss = loss_vals["loss_alpha_prime"]

sota-implementations/redq/redq.py

Lines changed: 1 addition & 1 deletion
@@ -159,7 +159,7 @@ def main(cfg: "DictConfig"):  # noqa: F821
         use_env_creator=False,
     )()
     if isinstance(create_env_fn, ParallelEnv):
-        raise NotImplementedError("This behaviour is deprecated")
+        raise NotImplementedError("This behavior is deprecated")
     elif isinstance(create_env_fn, EnvCreator):
         recorder.transform[1:].load_state_dict(
             get_norm_state_dict(create_env_fn()), strict=False

test/_utils_internal.py

Lines changed: 2 additions & 2 deletions
@@ -56,7 +56,7 @@ def HALFCHEETAH_VERSIONED():

 def PONG_VERSIONED():
     # load gym
-    # Gymnasium says that the ale_py behaviour changes from 1.0
+    # Gymnasium says that the ale_py behavior changes from 1.0
     # but with python 3.12 it is already the case with 0.29.1
     try:
         import ale_py  # noqa
@@ -70,7 +70,7 @@ def PONG_VERSIONED():

 def BREAKOUT_VERSIONED():
     # load gym
-    # Gymnasium says that the ale_py behaviour changes from 1.0
+    # Gymnasium says that the ale_py behavior changes from 1.0
     # but with python 3.12 it is already the case with 0.29.1
     try:
         import ale_py  # noqa
