6 changes: 6 additions & 0 deletions environments/rlm_secrets/README.md
@@ -93,3 +93,9 @@ This environment is specifically designed to test RLM capabilities:
5. **Tests information flow**: Data must flow: file → sub-LLM → root-LLM → tool → answer

The puzzle is simple enough that models should be able to solve it, while being complex enough to exercise all RLM components.

## Changelog

- v0.1.1 (01 Feb 2026):
  - add a default "rlm-secrets" label to `sandbox_labels`, regardless of what the user passes in the kwargs
  - dedupe `sandbox_labels` if passed via the kwargs
2 changes: 1 addition & 1 deletion environments/rlm_secrets/pyproject.toml
@@ -2,7 +2,7 @@
name = "rlm-secrets"
description = "File puzzle environment for testing RLM capabilities: root tools, sub-LLM tools, file operations"
tags = ["multi-turn", "rlm", "tools", "eval"]
version = "0.1.0"
version = "0.1.1"
requires-python = ">=3.10"
dependencies = [
"verifiers>=0.1.8",
11 changes: 11 additions & 0 deletions environments/rlm_secrets/rlm_secrets.py
@@ -483,6 +483,16 @@ def load_environment(
        weights=[0.5, 0.5],
    )

    sandbox_labels = kwargs.pop("sandbox_labels", [])
    if not (
        isinstance(sandbox_labels, list)
        and all(isinstance(label, str) for label in sandbox_labels)
    ):
        raise ValueError(
            f"sandbox_labels must be of type list[str]; you provided {sandbox_labels}"
        )
    sandbox_labels = list(set(["rlm-secrets"] + sandbox_labels))

    return RLMSecretsEnv(
        dataset=train_dataset,
        num_files=num_files,
@@ -492,5 +502,6 @@ def load_environment(
        sub_tool_max_turns=sub_tool_max_turns,
        max_sub_llm_parallelism=max_sub_llm_parallelism,
        code_execution_timeout=code_execution_timeout,
        sandbox_labels=sandbox_labels,
        **kwargs,
    )
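
The label handling added to `load_environment` can be tried on its own. Below is a minimal sketch, assuming a hypothetical helper named `normalize_sandbox_labels` (it is not part of this PR or of `verifiers`). It keeps the validation from the diff, but replaces `list(set(...))` with `dict.fromkeys` so the resulting order is deterministic; the diff itself makes no ordering guarantee.

```python
def normalize_sandbox_labels(
    kwargs: dict, default_label: str = "rlm-secrets"
) -> list[str]:
    """Hypothetical helper mirroring the validation and dedupe step in the diff."""
    # Pop the labels so they are not passed a second time via **kwargs.
    sandbox_labels = kwargs.pop("sandbox_labels", [])
    if not (
        isinstance(sandbox_labels, list)
        and all(isinstance(label, str) for label in sandbox_labels)
    ):
        raise ValueError(
            f"sandbox_labels must be of type list[str]; you provided {sandbox_labels}"
        )
    # Assumption: dict.fromkeys is used here instead of set() to keep
    # first-seen order, with the default label always first.
    return list(dict.fromkeys([default_label] + sandbox_labels))


kwargs = {"sandbox_labels": ["team-a", "rlm-secrets", "team-a"], "seed": 0}
labels = normalize_sandbox_labels(kwargs)
print(labels)  # ['rlm-secrets', 'team-a']
print(kwargs)  # {'seed': 0}
```

Popping the key before forwarding `**kwargs` matters: leaving it in place would pass `sandbox_labels` twice to the environment constructor and raise a `TypeError`.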