Fix PythonEnv deadlock #652

Merged
mikasenghaas merged 7 commits into main from
fix-pythonenv-deadlock
Dec 21, 2025
@mikasenghaas mikasenghaas commented Dec 21, 2025

Description

Fixes for PythonEnv:

  1. Correctly handle command timeout errors in the python tool. Previously, the timeout was caught in bash, but because the resulting error string was not JSON-parseable, the Python side failed with a JSONDecodeError. Now this edge case is handled, and the model is shown that the previous command timed out.
  2. Explicitly handle the case where the Python worker is dead (e.g. the worker process is no longer running). Previously, this was treated as a command timeout because the read from the FIFO queue timed out, which propagated to the model as a regular command timeout. Now the rollout aborts with a sandbox error instead.
  3. Add instructions on which extra packages are available.
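
As a rough illustration of fix 1, the sketch below shows one way to tolerate non-JSON worker output instead of letting JSONDecodeError escape. It is not the code from this PR: the function name `parse_worker_output` and the `result` key are hypothetical, and only the `{"status": "error"}` shape is taken from the description.

```python
import json


def parse_worker_output(raw: str) -> dict:
    """Parse a worker reply, wrapping non-JSON output as an error payload.

    Hypothetical helper: the "result" key is an assumption for illustration.
    """
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # e.g. a plain-text timeout message emitted by the shell, or empty output
        return {"status": "error", "result": raw.strip() or "empty worker output"}
    if not isinstance(parsed, dict):
        # Valid JSON, but not the object shape the caller expects
        return {"status": "error", "result": raw.strip()}
    return parsed
```

With this shape, a plain-text timeout message reaches the caller as an ordinary error payload that can be shown to the model, rather than as an unhandled exception.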

Examples

Fix 1: Command timeouts

Patch the python tool by adding the following at L230-231 to simulate a model response that triggers a timeout:

        if random.random() < 0.5:
            code = """while True: pass"""

Now, the timeout is caught correctly and an informative error is shown to the model:

uv run vf-eval math-python -n1 -r1 -v -a '{"sandbox_timeout_per_command_seconds": 5}'
[Screenshot 2025-12-21 at 1:33:11 PM]

Fix 2: Python worker dead

Patch the worker script by adding the following at L115 to simulate the worker dying after the first tool call:

            sys.exit(1)

Now, the crash is detected and raised as a custom PythonWorkerDeadError:

uv run vf-eval math-python -n1 -r1 -v
[Screenshot 2025-12-21 at 2:12:58 PM]
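
The idea behind fix 2, checking that the worker is alive before treating a stalled read as a timeout, can be sketched locally as follows. This is a minimal POSIX-only sketch under stated assumptions, not the PR's implementation: the helpers `is_process_alive` and `ensure_worker_alive` are hypothetical, the exception here is a stand-in that only shares its name with the one in the PR, and the real check presumably runs inside the sandbox.

```python
import os


class PythonWorkerDeadError(RuntimeError):
    """Stand-in for the PR's exception; raised when the worker is gone."""


def is_process_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 performs an existence check without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def ensure_worker_alive(pid_file: str) -> int:
    """Read the worker PID from pid_file and raise if the worker is dead."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (FileNotFoundError, ValueError) as exc:
        raise PythonWorkerDeadError(f"no valid worker PID in {pid_file}") from exc
    if not is_process_alive(pid):
        raise PythonWorkerDeadError(f"worker process {pid} is not running")
    return pid
```

Calling such a check before each request lets a dead worker surface as its own error type instead of being mistaken for a slow command.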

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Test improvement

Testing

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes


Note

Detects dead Python worker and robustly parses timeouts/non-JSON outputs, adds pip package availability to the system prompt, and exposes sandbox client max workers.

  • PythonEnv (verifiers/envs/python_env.py):
    • Add PythonWorkerDeadError and PID tracking (_WORKER_PID_FILE), writing PID on start and checking liveness before requests.
    • Replace ready wait with _CHECK_WORKER_READY_SCRIPT; improve readiness logging.
    • Handle non-JSON worker outputs by wrapping as { "status": "error" } instead of raising JSONDecodeError.
    • Add debug log of executed code.
  • SandboxEnv (verifiers/envs/sandbox_env.py):
    • Reduce retry before_sleep log level from ERROR to WARNING.
  • Math Python Environment (environments/math_python/math_python.py):
    • Include pip package availability in the system prompt.
    • Expose sandbox_client_max_workers and pass through to PythonEnv.

Written by Cursor Bugbot for commit 5605493.

@mikasenghaas mikasenghaas marked this pull request as ready for review December 21, 2025 19:24
@mikasenghaas mikasenghaas requested a review from snimu December 21, 2025 19:36
@mikasenghaas mikasenghaas merged commit a9acdf7 into main Dec 21, 2025
6 checks passed
2 participants