@pakrym-oai
Collaborator

Two fixes:

  1. Include trailing tool output in the total context size calculation. Otherwise, when checking whether compaction should run, we ignore newly added outputs.
  2. Trim trailing tool calls and tool call outputs until the request fits into the model context size. Otherwise, the compaction endpoint will fail to compact. We only trim items that the model can reproduce (tool calls, tool call outputs).
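
The two fixes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Item` enum, the 4-bytes-per-token estimate, and the names `estimated_total` and `trim_to_fit` are all assumptions made for the example.

```rust
/// Hypothetical, simplified stand-in for a history item.
/// The real `ResponseItem` has more variants and fields.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    UserMessage(String),
    ToolCall(String),
    ToolOutput(String),
}

impl Item {
    /// Rough estimate: about 4 bytes per token, rounded up.
    /// This ratio is an assumption for illustration only.
    fn estimated_tokens(&self) -> usize {
        let text = match self {
            Item::UserMessage(t) | Item::ToolCall(t) | Item::ToolOutput(t) => t,
        };
        (text.len() + 3) / 4
    }

    /// Only items the model can produce again are safe to drop.
    fn is_reproducible(&self) -> bool {
        matches!(self, Item::ToolCall(_) | Item::ToolOutput(_))
    }
}

/// Fix 1: sum over every item, including trailing tool outputs.
fn estimated_total(history: &[Item]) -> usize {
    history.iter().map(Item::estimated_tokens).sum()
}

/// Fix 2: pop trailing reproducible items until the history fits `limit`.
/// Stops early at the first trailing item that cannot be reproduced.
/// Returns the number of items removed.
fn trim_to_fit(history: &mut Vec<Item>, limit: usize) -> usize {
    let mut removed = 0;
    while estimated_total(history) > limit {
        let can_drop = history.last().map_or(false, Item::is_reproducible);
        if !can_drop {
            break;
        }
        history.pop();
        removed += 1;
    }
    removed
}
```

Note that the loop gives up once the last item is a user message, so the history may still exceed the limit afterwards; in this sketch the caller would have to handle that case separately.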

Move estimated token count logging to after sampling with token limit info and add Session::get_estimated_token_count helper. Simplify history test by removing unnecessary clone.
@jif-oai
Collaborator

jif-oai commented Feb 2, 2026

@codex review

Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 24a1dd9626

Comment on lines +356 to +363
pub(crate) fn is_codex_generated_item(item: &ResponseItem) -> bool {
    matches!(
        item,
        ResponseItem::FunctionCallOutput { .. }
            | ResponseItem::CustomToolCall { .. }
            | ResponseItem::CustomToolCallOutput { .. }
            | ResponseItem::LocalShellCall { .. }
    ) || matches!(item, ResponseItem::Message { role, .. } if role == "developer")
Contributor

[P2] Treat tool calls as codex-generated for trimming

The new trimming logic in compact_remote.rs relies on is_codex_generated_item to decide which trailing items can be removed. That helper now treats tool outputs as codex-generated but omits ResponseItem::FunctionCall. If a session ends with a tool call that never produced an output (e.g., aborted/failed tool invocation), trimming will stop at that last FunctionCall, even though normalize_history() later synthesizes an output for it. In that scenario, the compaction request can still exceed the context window and fail, which defeats the new trimming safeguard. Consider classifying ResponseItem::FunctionCall as codex-generated (or normalize before trimming) so trailing tool calls without outputs are also removable.
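
To illustrate the concern, here is a narrowed-down sketch, not the actual helper: the enum below is a hypothetical mirror of only three `ResponseItem` variants with placeholder fields, and the function names `is_trimmable_without_calls`, `is_trimmable_with_calls`, and `trailing_trimmable` are invented for the example.

```rust
/// Hypothetical, simplified mirror of a few `ResponseItem` variants.
/// Field names are placeholders, not the real definition.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ResponseItem {
    Message { role: String },
    FunctionCall { call_id: String },
    FunctionCallOutput { call_id: String },
}

/// Narrowed version of the reviewed predicate: `FunctionCall` is not matched.
fn is_trimmable_without_calls(item: &ResponseItem) -> bool {
    matches!(item, ResponseItem::FunctionCallOutput { .. })
}

/// Suggested variant: a `FunctionCall` without an output is also removable.
fn is_trimmable_with_calls(item: &ResponseItem) -> bool {
    matches!(
        item,
        ResponseItem::FunctionCall { .. } | ResponseItem::FunctionCallOutput { .. }
    )
}

/// Counts how many trailing items the predicate would allow trimming.
fn trailing_trimmable(history: &[ResponseItem], pred: fn(&ResponseItem) -> bool) -> usize {
    history.iter().rev().take_while(|&item| pred(item)).count()
}
```

With a history that ends in an unanswered `FunctionCall`, the first predicate allows no trailing items to be trimmed, while the second allows both the call and the output that precedes it.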

Collaborator

Actually, this makes sense, as we will never send this function call back in the context anyway.

Collaborator

@jif-oai jif-oai left a comment

Not a big fan, but I guess this is necessary.

4 participants