abort_on_code_timeout; better error handling #659

Merged
willccbb merged 1 commit into main from sebastian/fix-rlm-timeout-errors
Dec 22, 2025
@snimu snimu commented Dec 22, 2025

Description

Small fixes to make the RLMEnv run with current verifiers.

  • fixes sandbox startup & python installation process
  • adds abort_on_code_timeout parameter
    • set to False to give the model an error message on sandbox execution timeout
    • set to True to stop the rollout with an error on sandbox execution timeout; this is useful because some models just keep writing inefficient code until their context window is full, which can take a very long time
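A rough sketch of the two modes, for readers who want to see the control flow. This is not the RLMEnv code: `run_code`, the `execute` callable, the returned dict shape, and the locally defined `CommandTimeoutError` / `SandboxError` classes are all hypothetical stand-ins for whatever the environment and sandbox client actually use.

```python
from typing import Callable


class CommandTimeoutError(Exception):
    """Hypothetical stand-in for the sandbox client's timeout exception."""


class SandboxError(Exception):
    """Hypothetical stand-in for vf.SandboxError."""


def run_code(
    execute: Callable[[str, int], str],
    code: str,
    abort_on_code_timeout: bool,
    code_execution_timeout: int = 120,
) -> dict:
    """Run code once (no retry) and apply the timeout policy."""
    try:
        output = execute(code, code_execution_timeout)
    except CommandTimeoutError as e:
        message = f"Code execution timed out after {code_execution_timeout}s"
        if abort_on_code_timeout:
            # True: stop the rollout, keeping the original exception chained.
            raise SandboxError(message) from e
        # False: give the model an error message it can react to.
        return {"status": "error", "error": message}
    return {"status": "ok", "output": output}
```

With `abort_on_code_timeout=False` the model sees the error and may try again; with `True` the caller gets an exception and can end the rollout instead of letting the model keep submitting slow code.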

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Test improvement

Testing

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes


Note

Introduce abort_on_code_timeout to control rollout on code timeouts; improve worker startup synchronization, adjust timeouts, and return structured errors on execution timeouts.

  • RLMEnv:
    • Timeout behavior:
      • Add abort_on_code_timeout flag; on code CommandTimeoutError, either abort rollout or return a structured error to the model.
      • Set documented code_execution_timeout default to 120 and remove retry in code execution.
    • Sandbox worker startup:
      • Wait for worker script after pip installs; simplify spawn logic and tune sleeps/iterations.
      • Use explicit timeouts for start and ready checks, with clearer failure messages and debug info.
    • Error handling:
      • Standardize exception chaining (raise vf.SandboxError() from e) across startup and execution paths.
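The startup and error-handling bullets above can be illustrated with a small polling helper. This is a sketch under assumptions, not the merged implementation: `wait_until_ready`, its parameters, and the local `SandboxError` class are hypothetical, and the real readiness check runs against the sandbox rather than a plain callable.

```python
import time
from typing import Callable, Optional


class SandboxError(Exception):
    """Hypothetical stand-in for vf.SandboxError."""


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_s: float,
    poll_interval_s: float = 0.01,
    what: str = "worker",
) -> int:
    """Poll is_ready until it returns True; return the number of checks made.

    Raises SandboxError once timeout_s has elapsed, chained to the last
    OSError seen (if any) so the underlying cause stays in the traceback.
    """
    deadline = time.monotonic() + timeout_s
    attempts = 0
    last_error: Optional[Exception] = None
    while True:
        attempts += 1
        try:
            if is_ready():
                return attempts
        except OSError as e:
            # Remember the failure for the final error, then keep polling.
            last_error = e
        if time.monotonic() >= deadline:
            raise SandboxError(
                f"{what} not ready after {timeout_s}s ({attempts} checks)"
            ) from last_error
        time.sleep(poll_interval_s)
```

The explicit deadline gives a bounded wait and a failure message with some debug information, and `raise ... from last_error` follows the same chaining pattern as `raise vf.SandboxError() from e`.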

Written by Cursor Bugbot for commit 93d4b88. This will update automatically on new commits.

@willccbb willccbb merged commit 37581a2 into main Dec 22, 2025
6 checks passed