fix: post_rollout IndexError and missing completion (fixes #726) by gspeter-max · Pull Request #937 · PrimeIntellect-ai/verifiers

gspeter-max · 2026-02-19T07:27:19Z

Summary

Fixes #726 - post_rollout can crash with IndexError when trajectory is empty and does not have access to completion field.

Root Cause

post_rollout runs as a cleanup handler (via @vf.cleanup) BEFORE state["completion"] is set
When trajectory is empty (e.g., environment errors before any turns), accessing state["trajectory"][-1] raises IndexError
Inconsistent API: rubrics can access completion but cleanup handlers cannot

Changes

1. Code Changes

verifiers/envs/sandbox_env.py: Added comprehensive docstring + error handling wrapper in destroy_sandbox
verifiers/envs/experimental/cli_agent_env.py: Added error handling wrapper in destroy_sandbox

2. Tests

tests/test_post_rollout_safety.py (NEW): 5 comprehensive test cases
- Empty trajectory handling
- Missing trajectory key handling
- Cleanup with empty trajectory
- Error handling in destroy_sandbox
- Custom post_rollout with trajectory access
tests/test_multiturn_env.py: Added test for early error scenarios

3. Documentation

docs/environments.md: Added "Cleanup Handlers and post_rollout" section with:
- Execution order explanation
- Available state data
- Safe trajectory access patterns
- Practical examples

Test Results

All 25 tests passed ✅
- test_post_rollout_safety.py: 5/5 passed
- test_multiturn_env.py: 18/18 passed  
- test_sandbox_env.py: 2/2 passed

Impact

✅ Rollouts no longer crash when environment errors occur before any trajectory steps
✅ Environment designers have clear guidance on safe post_rollout patterns
✅ Consistent error handling across all cleanup handlers
✅ Comprehensive test coverage prevents regression
✅ Backward compatible - no breaking changes

Checklist

Refs: #726

Note

Low Risk
Small, backward-compatible robustness change limited to cleanup paths plus tests/docs; main behavior change is swallowing post_rollout exceptions to ensure sandbox deletion continues.

Overview
Prevents rollout teardown from crashing when an environment errors before any turns by wrapping post_rollout execution in destroy_sandbox (both SandboxEnv and CliAgentEnv) and logging failures while still deleting the sandbox.

Adds regression coverage for empty/missing state["trajectory"] and early-setup failures, plus documentation clarifying that @vf.cleanup handlers (including post_rollout) run before state["completion"] is set and showing safe trajectory access patterns for environment authors.

^{Written by Cursor Bugbot for commit 6bb1480. This will update automatically on new commits. Configure here.}

Root Cause: - post_rollout runs as a cleanup handler (via @vf.cleanup) BEFORE state["completion"] is set - When trajectory is empty (e.g., environment errors before any turns), accessing state["trajectory"][-1] raises IndexError - Environment designers had to add perfect error handling or risk crashes - Inconsistent API: rubrics can access completion but cleanup handlers cannot Changes: 1. Updated sandbox_env.py base post_rollout docstring with comprehensive documentation about: - When post_rollout runs (cleanup handler, BEFORE completion is set) - What state data is available (trajectory may be empty, error, sandbox_id) - Safe access pattern examples 2. Added error handling wrapper in sandbox_env.py destroy_sandbox: - Wrapped post_rollout call in try-except - Logs warning if post_rollout fails (not error, since cleanup should continue) - Ensures sandbox cleanup continues even if post_rollout fails 3. Added same error handling pattern to cli_agent_env.py destroy_sandbox: - Follows same pattern as sandbox_env.py - Maintains consistency across cleanup handlers 4. Added comprehensive tests in tests/test_post_rollout_safety.py: - test_post_rollout_with_empty_trajectory - test_post_rollout_with_missing_trajectory - test_cleanup_with_empty_trajectory - test_destroy_sandbox_handles_post_rollout_error - test_custom_post_rollout_with_trajectory_access 5. Added test to tests/test_multiturn_env.py: - test_multiturn_rollout_with_early_error - Verifies early errors don't crash cleanup 6. Updated docs/environments.md with new section: - "Cleanup Handlers and post_rollout" explaining execution order - Available state data in cleanup handlers - Safe trajectory access patterns - Examples of proper post_rollout implementation Impact: - Rollouts no longer crash when environment errors occur before any trajectory steps - Environment designers have clear guidance on safe post_rollout patterns - Consistent error handling across all cleanup handlers - Comprehensive test coverage prevents regression Refs: PrimeIntellect-ai#726

Root Cause: - ErrorDuringSetupEnv test class was missing implementation of abstract method env_response - Runtime errors were not being caught because they weren't vf.Error types Changes: - Added env_response method implementation to ErrorDuringSetupEnv - Changed RuntimeError to vf.SandboxError so error is properly caught by framework - env_response returns empty list since it should never be called due to setup error Impact: - All tests now pass (25/25) - Test properly validates early error handling with empty trajectory Refs: PrimeIntellect-ai#726

CLAassistant · 2026-02-19T07:27:35Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.}

docs/environments.md

…uidance Root Cause: - PR added new documentation about cleanup handlers and post_rollout to docs/environments.md - Project rule requires skills files to be updated when user-facing documentation changes Changes: - Added "Resource Management and Cleanup Handlers" section to skills/create-environments/SKILL.md - Documented critical execution order: cleanup runs BEFORE completion is set - Added safe post_rollout implementation examples - Listed available state data in cleanup handlers - Provided safe trajectory access patterns - Referenced full documentation in docs/environments.md Impact: - Skills file now reflects the new cleanup handler guidance - Users following the skill will get proper cleanup handler patterns - Addresses Cursor Bugbot feedback about missing skills update Refs: PrimeIntellect-ai#726

gspeter-max · 2026-02-19T09:46:37Z

Thanks for the feedback @cursor[bot]! I've updated the file to include the new cleanup handler and post_rollout guidance.

The changes include:

Critical execution order documentation (cleanup runs BEFORE completion)
Safe post_rollout implementation examples
Available state data in cleanup handlers
Safe trajectory access patterns
Reference to full documentation in docs/environments.md

This ensures users following the skill will get proper cleanup handler patterns.

peter-luminova added 2 commits February 19, 2026 12:41

cursor bot reviewed Feb 19, 2026

View reviewed changes

docs/environments.md Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

fix: post_rollout IndexError and missing completion (fixes #726)#937

fix: post_rollout IndexError and missing completion (fixes #726)#937
gspeter-max wants to merge 3 commits intoPrimeIntellect-ai:mainfrom
gspeter-max:verifiersContribution/issue_726

gspeter-max commented Feb 19, 2026 •

edited by cursor bot

Loading

Uh oh!

CLAassistant commented Feb 19, 2026

Uh oh!

cursor bot left a comment

Uh oh!

Uh oh!

gspeter-max commented Feb 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments

Conversation

gspeter-max commented Feb 19, 2026 • edited by cursor bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root Cause

Changes

1. Code Changes

2. Tests

3. Documentation

Test Results

Impact

Checklist

Uh oh!

CLAassistant commented Feb 19, 2026

Uh oh!

cursor bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gspeter-max commented Feb 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

gspeter-max commented Feb 19, 2026 •

edited by cursor bot

Loading