[BugFix] Add strict_shape parameter to QValueModule for action shape enforcement by Lidang-Jiang · Pull Request #3593 · pytorch/rl

Lidang-Jiang · 2026-04-04T08:17:59Z

Summary

Fixes #3059 — QValueActor now respects action_spec shape for singleton dimensions.

When using Categorical specs with singleton dimensions (e.g., shape=(1, 1)), argmax(dim=-1) drops the trailing dimension, causing the action shape to not match the spec. This adds a strict_shape parameter to QValueModule and QValueActor following the approach suggested by @vmoens in #3059.
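The shape loss described above can be reproduced with plain PyTorch, independent of TorchRL. The sketch below is illustrative only: the tensor sizes and the names `action_value` and `spec_shape` are assumptions chosen to mirror the example in this PR, and the final `reshape` is one possible way to restore the singleton dimension, not necessarily what `QValueModule` does internally.

```python
import torch

# Hypothetical Q-values: a batch of 12 samples, 4 discrete actions each.
action_value = torch.randn(12, 4)

# argmax over the last dimension reduces that dimension away entirely.
action = action_value.argmax(dim=-1)
print(action.shape)  # torch.Size([12])

# keepdim=True keeps the reduced dimension as a singleton instead.
action_keepdim = action_value.argmax(dim=-1, keepdim=True)
print(action_keepdim.shape)  # torch.Size([12, 1])

# Alternatively, reshape to the batch dims plus an assumed per-sample spec shape.
spec_shape = torch.Size((1,))
action_reshaped = action.reshape(*action.shape[:1], *spec_shape)
print(action_reshaped.shape)  # torch.Size([12, 1])
```

Both alternatives hold the same indices as the plain `argmax` result; only the trailing singleton dimension differs.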

`strict_shape` parameter

| Value | Behavior |
| --- | --- |
| `None` (default) | `FutureWarning` on shape mismatch (backward compatible) |
| `"auto"` | Automatically reshape action to match spec |
| `True` | `RuntimeError` on shape mismatch |
| `False` | Silently allow mismatch |
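The four modes can be read as a small dispatch on `strict_shape`. The following is a minimal sketch of that dispatch, not the code merged in this PR: `resolve_action_shape` is a hypothetical helper that works on plain shape tuples rather than tensors to stay dependency-free, and it does not check that an `"auto"` reshape is actually possible (that is, that the element counts agree).

```python
import warnings


def resolve_action_shape(actual, expected, strict_shape=None):
    """Return the shape an action would end up with under each mode.

    Hypothetical helper operating on shape tuples; not TorchRL code.
    """
    if tuple(actual) == tuple(expected):
        return tuple(actual)  # shapes agree: nothing to do in any mode
    if strict_shape == "auto":
        return tuple(expected)  # stands in for reshaping the action tensor
    if strict_shape is True:
        raise RuntimeError(f"Action shape {actual} does not match {expected}.")
    if strict_shape is None:
        warnings.warn(
            f"Action shape {actual} does not match {expected}.",
            FutureWarning,
            stacklevel=2,
        )
    # None (after warning) and False both leave the mismatch in place.
    return tuple(actual)


print(resolve_action_shape((12,), (12, 1), strict_shape="auto"))  # (12, 1)
print(resolve_action_shape((12,), (12, 1), strict_shape=False))   # (12,)
```

The check for matching shapes comes first, so no mode warns or raises when the action already matches the spec.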

Before

import torch
from torch import nn
from tensordict import TensorDict
from tensordict.nn import TensorDictModule
from torchrl.data import Categorical
from torchrl.modules import QValueActor
action_spec = Categorical(4, shape=torch.Size((1, 1)), dtype=torch.int64)
module = TensorDictModule(module=nn.Linear(3, 1), in_keys=("observation",), out_keys=("action_value",))
qvalue_actor = QValueActor(module=module, in_keys=["observation"], spec=action_spec)
td = TensorDict({"observation": torch.randn(12, 3)})
qvalue_actor(td)
print(td["action"].shape)  # torch.Size([12])  ← should be [12, 1]

After

qvalue_actor = QValueActor(module=module, in_keys=["observation"], spec=action_spec, strict_shape="auto")
td = TensorDict({"observation": torch.randn(12, 3)})
qvalue_actor(td)
print(td["action"].shape)  # torch.Size([12, 1])  ← correct!

Test results (55 passed)

test/test_actors.py::TestQValue (55 tests including 4 new)
test_qvalue_actor_strict_shape_auto PASSED
test_qvalue_actor_strict_shape_true_raises PASSED
test_qvalue_actor_strict_shape_none_warns PASSED
test_qvalue_actor_strict_shape_normal_no_warning PASSED
======================= 55 passed, 24 warnings in 1.45s ========================

Test plan

4 new tests: auto reshape, strict raise, default warning, no false positive
All 51 existing QValue tests pass (no regression)

…enforcement

When using Categorical specs with singleton dimensions, argmax drops the trailing dim, causing the action shape to not match the spec.

Add strict_shape parameter to QValueModule and QValueActor:
- None (default): FutureWarning on shape mismatch
- 'auto': automatically reshape action to match spec
- True: raise RuntimeError on mismatch
- False: silently allow mismatch

Fixes pytorch#3059

Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>

pytorch-bot · 2026-04-04T08:18:05Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3593

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 28 New Failures, 1 Cancelled Job, 2 Pending

As of commit e6fcc31 with merge base e2c8a8d:

NEW FAILURES - The following jobs have failed:

Build Aarch64 Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / build-wheel-py3_10-cpu-aarch64 (gh)
ModuleNotFoundError: No module named 'packaging.version'
Build Aarch64 Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / upload / upload-wheel-py3_10-cpu-aarch64 (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_cpu_aarch64
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / build-manywheel-py3_10-cpu (gh)
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / build-manywheel-py3_10-cuda12_6 (gh)
ModuleNotFoundError: No module named 'packaging.version'
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / build-manywheel-py3_10-cuda12_8 (gh)
ModuleNotFoundError: No module named 'packaging.version'
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / build-manywheel-py3_10-cuda13_0 (gh)
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / build-manywheel-py3_10-cuda13_2 (gh)
ModuleNotFoundError: No module named 'packaging.version'
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / build-manywheel-py3_10-rocm7_1 (gh)
Process completed with exit code 2.
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / build-manywheel-py3_10-rocm7_2 (gh)
ModuleNotFoundError: No module named 'packaging.version'
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / upload / upload-manywheel-py3_10-cpu (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_cpu_x86_64
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / upload / upload-manywheel-py3_10-cuda12_6 (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_cu126_x86_64
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / upload / upload-manywheel-py3_10-cuda12_8 (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_cu128_x86_64
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / upload / upload-manywheel-py3_10-cuda13_0 (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_cu130_x86_64
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / upload / upload-manywheel-py3_10-cuda13_2 (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_cu132_x86_64
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / upload / upload-manywheel-py3_10-rocm7_1 (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_rocm7.1_x86_64
Build Linux Wheels / pytorch/rl (pytorch/rl, test/smoke_test.py, torchrl, .github/scripts/pre-build-script.sh) / upload / upload-manywheel-py3_10-rocm7_2 (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_rocm7.2_x86_64
Build Windows Wheels / pytorch/rl / build-wheel-py3_10-cpu (gh)
ModuleNotFoundError: No module named 'packaging.version'
Build Windows Wheels / pytorch/rl / build-wheel-py3_10-cuda12_6 (gh)
ModuleNotFoundError: No module named 'packaging.version'
Build Windows Wheels / pytorch/rl / build-wheel-py3_10-cuda12_8 (gh)
ModuleNotFoundError: No module named 'packaging.version'
Build Windows Wheels / pytorch/rl / build-wheel-py3_10-cuda13_0 (gh)
ModuleNotFoundError: No module named 'packaging.version'
Build Windows Wheels / pytorch/rl / build-wheel-py3_10-cuda13_2 (gh)
ModuleNotFoundError: No module named 'packaging.version'
Build Windows Wheels / pytorch/rl / upload / upload-wheel-py3_10-cpu (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_cpu_x64
Build Windows Wheels / pytorch/rl / upload / upload-wheel-py3_10-cuda12_6 (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_cu126_x64
Build Windows Wheels / pytorch/rl / upload / upload-wheel-py3_10-cuda12_8 (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_cu128_x64
Build Windows Wheels / pytorch/rl / upload / upload-wheel-py3_10-cuda13_0 (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_cu130_x64
Build Windows Wheels / pytorch/rl / upload / upload-wheel-py3_10-cuda13_2 (gh)
Unable to download artifact(s): Artifact not found for name: pytorch_rl__3.10_cu132_x64
Continuous Benchmark (PR) / GPU Pytest benchmark (gh)
Workflow failed! Resource not accessible by integration
Unit-tests on Linux / tests-olddeps (3.10, 11.8) / linux-job (gh)
RuntimeError: Command docker exec -t 43e0e92da39fff59de752d1ff5eae40f7547507040ede232081a575ef3c0370a /exec failed with exit code 1

CANCELLED JOB - The following job was cancelled. Please retry:

Continuous Benchmark (PR) / CPU Pytest benchmark (gh)
Workflow failed! Resource not accessible by integration

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla · 2026-04-04T08:18:06Z

Hi @Lidang-Jiang!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

meta-cla · 2026-04-04T10:06:01Z

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

vmoens

Thanks for this!
The linter needs to be fixed, and I would mention the version where the change of behavior will occur (0.14)

vmoens · 2026-04-05T05:32:05Z

torchrl/modules/tensordict_module/actors.py

+                    warnings.warn(
+                        f"Action shape {action.shape} does not match expected shape {target_shape} "
+                        f"(per-sample spec shape: {per_sample_shape}). "
+                        f"In a future version, this will raise an error. "


2 versions from now, which is in v0.14

Fixed in e6fcc31. Updated the warning message to specify v0.14 as the deprecation version. Also fixed linter formatting issues.
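To illustrate the review point, here is a sketch of a `FutureWarning` that names the version in which the behavior changes. The first two sentences of the message follow the diff excerpt quoted in the review; the v0.14 sentence is an assumed wording, since the final text merged in e6fcc31 is not shown in this thread. `warn_shape_mismatch` is a hypothetical helper name.

```python
import warnings


def warn_shape_mismatch(action_shape, target_shape, per_sample_shape):
    # The v0.14 sentence below is an assumed wording, not the merged text.
    warnings.warn(
        f"Action shape {action_shape} does not match expected shape {target_shape} "
        f"(per-sample spec shape: {per_sample_shape}). "
        "In v0.14, this will raise an error.",
        FutureWarning,
        stacklevel=2,
    )


# Capture the warning to inspect its category and message.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_shape_mismatch((12,), (12, 1), (1,))

print(caught[0].category.__name__)  # FutureWarning
print("v0.14" in str(caught[0].message))  # True
```

Naming the version in the message lets users who see the warning know how long they have to adopt an explicit `strict_shape` setting.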

vmoens

LGTM thanks

github-actions bot added BugFix Modules and removed BugFix labels Apr 4, 2026

meta-cla bot added the CLA Signed label Apr 4, 2026

vmoens reviewed Apr 5, 2026

fix: address reviewer feedback

e6fcc31

- Fix linter formatting (line length) in actors.py and test_actors.py
- Specify deprecation version (v0.14) in FutureWarning message

Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>

github-actions bot added the BugFix label Apr 5, 2026

vmoens approved these changes Apr 5, 2026

vmoens merged commit d1917f2 into pytorch:main Apr 5, 2026
93 of 122 checks passed

vmoens merged 2 commits into pytorch:main from Lidang-Jiang:fix/qvalue-actor-action-spec-shape
