RLM: Eager sandbox creation, conditional pip install #834

Merged
snimu merged 9 commits into main from sebastian/rlm-eager-sandbox-creation-2026-02-05
Feb 6, 2026

Conversation

snimu commented Feb 6, 2026

Description

Two changes:

  • sandboxes are created eagerly instead of lazily at the first call to the REPL
  • pip_install_packages are only pip-installed if they don't already exist

Both changes are meant to make mini-swe-agent-plus work with the RLM.

The eager sandbox startup is important because some models never call the REPL, so no sandbox is ever created, yet the tests rely on the sandbox existing. Eager sandbox creation solves this issue.

The pip_install_packages change is needed because the Docker images used for the sandboxes in mini-swe-agent-plus don't contain pip, but do already have the required requests package. The generalized fix is to check, one by one, whether requests and each entry in pip_install_packages already exists, and then install only the ones that don't. This way, if a Docker image already contains a package, we don't re-install it.
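The check-then-install logic described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the helper names `missing_packages` and `install_missing` are hypothetical, and the sketch assumes that each package's import name equals its pip name, which does not hold for every package.

```python
import importlib.util
import subprocess
import sys


def missing_packages(packages):
    """Return the packages that cannot be imported by this interpreter.

    Assumption: each entry is a plain top-level name whose import name
    equals its pip name (no version specifiers, no dotted names).
    """
    missing = []
    for name in packages:
        # find_spec returns None when no importable module with this name exists.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


def install_missing(packages):
    """Install only the packages that are not already importable."""
    to_install = missing_packages(packages)
    if to_install:
        # pip is invoked only when something is actually missing, so an
        # image without pip still works if it already ships every package.
        subprocess.run(
            [sys.executable, "-m", "pip", "install", *to_install],
            check=True,
        )
    return to_install
```

Because pip is only invoked when the list of missing packages is non-empty, an image that already has every requested package never needs pip at all, which matches the situation described for mini-swe-agent-plus.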

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Test improvement

Testing

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Note

Medium Risk
Touches sandbox lifecycle and package installation paths, which can affect rollout startup/cleanup behavior and reliability across different images/environments.

Overview
Makes RLMEnv start the execution backend eagerly during setup_state (including sandbox/worker preparation) instead of waiting for the first REPL call, and adds best-effort cleanup if setup fails to avoid leaking active rollouts/tunnels/sandboxes.
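The pattern of eager startup with best-effort cleanup on failure might look roughly like the sketch below. It is only an illustration under stated assumptions: `FakeBackend` is a hypothetical stand-in, and this `setup_state` function is not the actual RLMEnv method or its signature.

```python
import asyncio


class FakeBackend:
    """Hypothetical stand-in for a sandbox backend (not the real API)."""

    def __init__(self, fail_on_start=False):
        self.fail_on_start = fail_on_start
        self.started = False
        self.cleaned_up = False

    async def start(self):
        if self.fail_on_start:
            raise RuntimeError("sandbox failed to start")
        self.started = True

    async def cleanup(self):
        self.started = False
        self.cleaned_up = True


async def setup_state(state, backend):
    """Start the backend during setup instead of at the first REPL call."""
    try:
        await backend.start()
    except Exception:
        # Best-effort cleanup: a failure while cleaning up must not
        # mask the original startup error.
        try:
            await backend.cleanup()
        except Exception:
            pass
        raise
    state["backend"] = backend
    return state
```

With this shape, the sandbox exists after setup even if the model never calls the REPL, and a failed startup releases whatever was partially created before the error propagates.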

Updates sandbox package handling to only install requests/pip_install_packages when they are not already importable in the image (using per-package import checks and python -m pip), and adjusts rlm_secrets to pass puzzle files via a temporary context_dir rather than writing directly into rlm_fs_root. Docs are updated to reflect the eager startup and conditional installs.

Written by Cursor Bugbot for commit 5fcaaa1.

cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

snimu merged commit d5c2c38 into main on Feb 6, 2026
6 checks passed