Add robust token usage tracking #858

Merged: willccbb merged 3 commits into main from will/usage-refactor on Feb 8, 2026
willccbb commented Feb 8, 2026

Description

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Test improvement

Testing

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes


Note

Medium Risk
Touches core rollout generation (Environment.get_model_response) and output serialization, so incorrect usage parsing could affect all evaluations/saved results, but changes are additive with legacy fallbacks and test coverage.

Overview
Adds state-level token usage tracking via a new StateUsageTracker, accumulating usage directly in Environment.get_model_response and exposing a read-only state["usage"] plus Environment.get_state_usage().
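
The accumulation pattern described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `UsageTrackerSketch`, its `add` and `snapshot` methods, and the `input_tokens`/`output_tokens` keys are hypothetical names, and the real `StateUsageTracker` interface may differ. The sketch assumes the tracker is updated once per model response and that the read-only view is produced with `types.MappingProxyType`.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass
class UsageTrackerSketch:
    """Hypothetical stand-in for StateUsageTracker; not the verifiers API."""

    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, input_tokens: int, output_tokens: int) -> None:
        # Accumulate totals; assumed to be called once per model response.
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    def snapshot(self) -> Mapping[str, int]:
        # Wrap a fresh dict in MappingProxyType so callers get a read-only view.
        return MappingProxyType(
            {
                "input_tokens": self.input_tokens,
                "output_tokens": self.output_tokens,
            }
        )


tracker = UsageTrackerSketch()
tracker.add(input_tokens=120, output_tokens=30)   # first model call
tracker.add(input_tokens=150, output_tokens=45)   # second model call
usage = tracker.snapshot()                        # read-only totals
```

Keeping the mutable tracker separate from the read-only view means downstream code can inspect totals without being able to corrupt them by accident.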

Updates state_to_output/states_to_outputs to prefer tracked state usage (with legacy trajectory fallback) and hardens token extraction/serialization against invalid provider usage payloads. Evaluation output now prints average token usage, and new tests cover tracking behavior, output emission rules, and invalid usage values; verifiers.__init__ TYPE_CHECKING stubs are adjusted for optional verifiers-rl exports.
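
One way to harden token extraction against invalid provider payloads is sketched below. The function names and the choice to coerce invalid values to 0 are assumptions made for illustration, as are the OpenAI-style `prompt_tokens`/`completion_tokens` keys; the PR's actual parsing rules are not shown in this conversation.

```python
import math
from typing import Any


def safe_token_count(value: Any) -> int:
    """Coerce a provider-reported count to a non-negative int, else 0 (assumed policy)."""
    if isinstance(value, bool):
        # bool is a subclass of int; treat it as invalid rather than 0/1.
        return 0
    if isinstance(value, int):
        return max(value, 0)
    if isinstance(value, float) and math.isfinite(value):
        # Truncate finite floats; NaN and infinities fall through to 0.
        return max(int(value), 0)
    return 0


def extract_usage(payload: Any) -> dict[str, int]:
    """Read token counts from a usage payload that may be missing or malformed."""
    if not isinstance(payload, dict):
        payload = {}
    return {
        "input_tokens": safe_token_count(payload.get("prompt_tokens")),
        "output_tokens": safe_token_count(payload.get("completion_tokens")),
    }
```

For example, `extract_usage({"prompt_tokens": 10, "completion_tokens": "abc"})` yields 10 input tokens and 0 output tokens instead of raising, so one malformed response does not break serialization of the whole result set.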

Written by Cursor Bugbot for commit 455850e. This will update automatically on new commits.

cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

timing: RolloutTiming | None
error: Error | None
usage: TokenUsage | None
usage_tracker: object
Missing documentation for new State usage fields

Low Severity

This PR adds new usage and usage_tracker fields to the State class, which is documented in docs/reference.md. The documentation lists State fields in tables under "Fields set during initialization" and "Fields set after scoring," but these new fields are not included. Per the review rules, PRs that modify core user-facing functionality described in docs must update the relevant documentation.

@willccbb willccbb merged commit 9e0c09b into main Feb 8, 2026
6 checks passed