
Codex: fork/subagent replayed token_count history is double-counted, inflating per-day token totals #679

@RedesignedRobot


Codex fan-out double-counts tokens. Each fork/subagent child file replays the parent's token_count history and the parser counts every copy.

One real day: 179 files (22 root, 157 children from 9 parents, one parent with 137 children). The raw total is ~14B tokens, the CLI reports ~12.9B, and the deduplicated real total is ~1.7-1.9B. About 88% of the mass is the same cumulative vectors repeated across 16-17 sibling files (an identical 34-step sequence, 40002 → … → 5,238,474). That tripped the daily cap on submit: "Daily token total exceeds 10,000,000,000 ... 12,910,573,877".

Two guards miss it when the child doesn't re-embed the parent session_meta:

  • forked_child_turn_starts_own_session returns true on replay_session_id.is_none(), so the first turn_context closes the skip window and the replayed rows get counted.
  • codex_token_count_dedup_key keys on the child's own session id, so sibling replays don't collapse.
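To illustrate the second point, here is a minimal sketch of why a per-child key lets sibling replays through. The Row type and both helper functions are hypothetical and heavily simplified; they are not the actual tokscale-core types or the real codex_token_count_dedup_key logic.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified token_count row: the session that owns the file
/// and the cumulative total reported at that step.
#[derive(Clone, Debug)]
struct Row {
    session_id: String,
    cumulative_total: u64,
}

/// Build `children` sibling files that each replay the same parent history.
fn sibling_replays(parent_history: &[u64], children: usize) -> Vec<Row> {
    let mut rows = Vec::new();
    for i in 0..children {
        for &total in parent_history {
            rows.push(Row {
                session_id: format!("child-{}", i),
                cumulative_total: total,
            });
        }
    }
    rows
}

/// Count rows that survive a dedup keyed on (session id, cumulative total).
/// Assumption: this only loosely models keying on the child's own session id.
fn distinct_rows_keyed_by_session(rows: &[Row]) -> usize {
    let mut seen: HashSet<(String, u64)> = HashSet::new();
    rows.iter()
        .filter(|r| seen.insert((r.session_id.clone(), r.cumulative_total)))
        .count()
}

/// Count rows that survive a dedup keyed on the cumulative total alone.
fn distinct_rows_keyed_by_total(rows: &[Row]) -> usize {
    let mut seen: HashSet<u64> = HashSet::new();
    rows.iter().filter(|r| seen.insert(r.cumulative_total)).count()
}
```

With a 3-step parent history replayed into 16 siblings, the session-keyed count keeps all 48 rows, while keying on the total alone keeps 3. The second key is too aggressive for real data, since unrelated turns can land on the same total.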

A naive fix (dedup by cumulative vector, or key under the parent) breaks test_..._deduplicates_parent_replay_across_forks: two subagents doing distinct same-size turns share a vector and must stay separate. This probably needs a cross-file pass: look up the parent's total at fork time and skip child rows at or below it.
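A rough sketch of what that cross-file pass could look like, under two assumptions that would need confirming: the child's cumulative totals continue from the parent's total at fork time, and the parent's total at fork time can be looked up from the parent file. TokenRow, skip_replayed_rows and child_own_tokens are hypothetical names, not existing code.

```rust
/// Hypothetical row: the cumulative total reported by one token_count event.
#[derive(Clone, Copy, Debug, PartialEq)]
struct TokenRow {
    cumulative_total: u64,
}

/// Keep only child rows whose cumulative total is strictly above the
/// parent's total at fork time; rows at or below it are treated as replay.
fn skip_replayed_rows(child_rows: &[TokenRow], parent_total_at_fork: u64) -> Vec<TokenRow> {
    child_rows
        .iter()
        .copied()
        .filter(|r| r.cumulative_total > parent_total_at_fork)
        .collect()
}

/// Tokens attributable to the child itself: the last kept cumulative total
/// minus the fork baseline, or 0 if the file holds nothing but replay.
fn child_own_tokens(child_rows: &[TokenRow], parent_total_at_fork: u64) -> u64 {
    skip_replayed_rows(child_rows, parent_total_at_fork)
        .last()
        .map(|r| r.cumulative_total - parent_total_at_fork)
        .unwrap_or(0)
}
```

Because each child is measured against the fork baseline rather than deduplicated against its siblings, two subagents that do distinct same-size turns both keep their own tokens, which is the behavior the existing test expects.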

Relevant code: crates/tokscale-core/src/sessions/codex.rs, and should_keep_deduped_message in lib.rs. I closed #678 (I thought it was a strict cap). I can send a PR if you tell me the fork-log semantics you want.
