Look into timeout issues possibly caused by infinite loops#54
Merged
derekmisler merged 1 commit intodocker:mainfrom Feb 26, 2026
Conversation
Signed-off-by: Derek Misler <derek.misler@docker.com>
Contributor
Author
|
/describe |
|
✅ PR description has been generated and updated! |
There was a problem hiding this comment.
✅ Review Complete
Summary
No issues found. This PR looks good!
The changes correctly implement:
- Pre-fetching of PR diffs to avoid 403 errors on forked PRs
- Timeout extension from 20 to 40 minutes
- Batch verification mode to prevent infinite retry loops
- Proper structured output schema with
verdictsarray - Three eval fixtures to validate the new behavior
All changed code is correct — no logic errors, resource leaks, or security issues detected.
trungutt
approved these changes
Feb 26, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR addresses timeout failures in the
review-praction by eliminating two root causes: an infinite loop risk where the agent could repeatedly retry verifier delegations, and slow/failing diff fetches inside the agent container (especially on forked PRs). The verifier is also refactored from single-finding to batch mode, reducing the number of LLM delegation calls per review.Changes
review-pr/action.yml: Pre-fetches the PR diff viagh pr diffbefore the agent starts and writes it topr.diff, avoiding 403 errors on forked PRs inside the agent container. Also doubles the action timeout from 20 min (1200s) to 40 min (2400s) to accommodate large diffs and retries.review-pr/agents/pr-review.yaml(orchestrator): Adds an explicit ANTI-LOOP RULE — the agent must delegate to the verifier exactly once with all high/medium findings batched together. If the verifier returns an empty or malformed response, all findings are treated as DISMISSED rather than retried. Also updates diff-fetching instructions to check for the pre-fetchedpr.difffile first.review-pr/agents/pr-review.yaml(verifier agent): Refactored from single-finding to batch mode. The structured output schema changes from a single verdict object to averdictsarray, withadditionalProperties: falseenforced at both the array item and root levels.review-pr/agents/evals/batched-verifier-security-{1,2,3}.json: Three new eval fixtures (triplicate runs for reliability) that assert the agent uses a single batch delegation to the verifier and correctly identifies a missingredirect_urivalidation as a high-severity security finding.How to Test
review-praction on a PR from a fork and confirm the diff is pre-fetched successfully (look for the✅ Pre-fetched PR difflog line) and the agent reads frompr.diffinstead of callinggh pr diff.batched-verifier-security-1/2/3.json) and verify all relevance criteria pass — particularly that the verifier is called exactly once with a batched payload and returns averdictsarray.Closes: https://github.com/docker/gordon/issues/179