t1331: Supervisor circuit breaker — pause on consecutive failures by marcusquinn · Pull Request #2271 · marcusquinn/aidevops

marcusquinn · 2026-02-25T03:02:00Z

WIP - incremental commits

Ref #2264

Summary by CodeRabbit

New Features
- Implemented circuit-breaker system that halts task dispatch when consecutive failures exceed a threshold. Auto-recovers after a configurable cooldown period. Creates and updates GitHub issues to track circuit-breaker state and events.

coderabbitai · 2026-02-25T03:02:06Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5a5bc3 and a150e85.

📒 Files selected for processing (4)

.agents/scripts/supervisor-helper.sh
.agents/scripts/supervisor/circuit-breaker.sh
.agents/scripts/supervisor/pulse.sh
todo/tasks/t1331-brief.md

Walkthrough

Implements a comprehensive circuit-breaker pattern for supervisor task dispatch. The system tracks consecutive task failures, trips dispatch prevention at a configurable threshold, persists state atomically to JSON, integrates with GitHub issue creation/updates, and supports both manual and automatic reset with cooldown periods.

Changes

Cohort / File(s)	Summary
Circuit-Breaker Core Implementation `.agents/scripts/supervisor/circuit-breaker.sh`	New 511-line module implementing full circuit-breaker state machine with persistent JSON state, atomic read/write operations, GitHub issue integration, and CLI subcommands (status, reset, check, trip). Includes cb_read_state, cb_write_state, cb_record_failure, cb_record_success, cb_check, cb_reset, cb_status, and GitHub issue management functions.
Supervisor CLI Integration `.agents/scripts/supervisor-helper.sh`	Wires circuit-breaker into supervisor CLI by sourcing the new circuit-breaker module and exposing circuit-breaker as a top-level command with dedicated cmd_circuit_breaker handler and updated help text.
Pulse Workflow Integration `.agents/scripts/supervisor/pulse.sh`	Integrates circuit-breaker hooks into task execution flow: records failures on retry/blocked paths via cb_record_failure, records successes via cb_record_success, and gates Phase 2 dispatch with a cb_check guard to skip dispatch when breaker is open.
Acceptance Criteria Verification `todo/tasks/t1331-brief.md`	Updates acceptance criteria with verification blocks to scan supervisor scripts directory for circuit-breaker pattern implementation and consecutive_fail tracking evidence.

Sequence Diagrams

sequenceDiagram
    participant Pulse as Pulse Coordinator
    participant Circuit as Circuit Breaker
    participant State as State File
    participant GitHub as GitHub API
    participant Dispatch as Phase 2 Dispatch

    rect rgba(100, 150, 200, 0.5)
    Note over Pulse,Dispatch: Task Failure Path
    Pulse->>Circuit: cb_record_failure()
    Circuit->>State: Read current state
    State-->>Circuit: state JSON
    Circuit->>Circuit: Increment consecutive_failures
    Circuit->>Circuit: Check if threshold met
    Circuit->>State: Write updated state (tripped=true)
    Circuit->>GitHub: _cb_create_or_update_issue()
    GitHub-->>Circuit: Issue created/updated
    Circuit-->>Pulse: Breaker tripped
    end

    rect rgba(150, 200, 100, 0.5)
    Note over Pulse,Dispatch: Dispatch Gate Check
    Pulse->>Circuit: cb_check()
    Circuit->>State: Read state
    State-->>Circuit: state JSON
    Circuit->>Circuit: If tripped & cooldown elapsed: auto-reset
    Circuit-->>Pulse: Dispatch allowed or blocked
    alt Breaker Open
        Pulse->>Dispatch: Skip dispatch (log message)
    else Breaker Closed
        Pulse->>Dispatch: Proceed with batch/non-batch logic
    end
    end

    rect rgba(200, 150, 100, 0.5)
    Note over Pulse,GitHub: Task Success Path
    Pulse->>Circuit: cb_record_success()
    Circuit->>State: Reset consecutive_failures to 0
    Circuit->>State: Clear tripped flag
    Circuit->>GitHub: _cb_close_issue()
    GitHub-->>Circuit: Issue closed
    Circuit-->>Pulse: Breaker reset
    end
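
The three paths in the diagram can be approximated with a small in-memory sketch. This is not the code from circuit-breaker.sh: the `sketch_*` functions and `SKETCH_*` variables are hypothetical names, state lives in shell variables rather than the JSON file, and the GitHub calls and cooldown auto-reset are left out. The threshold of 3 is the default mentioned in this PR.

```shell
# Hypothetical in-memory model of the breaker; all names are illustrative.
SKETCH_THRESHOLD=3 # consecutive failures needed to trip
SKETCH_FAILURES=0
SKETCH_TRIPPED=false

# Failure path: bump the counter and trip once the threshold is reached.
sketch_record_failure() {
	SKETCH_FAILURES=$((SKETCH_FAILURES + 1))
	if [ "$SKETCH_FAILURES" -ge "$SKETCH_THRESHOLD" ]; then
		SKETCH_TRIPPED=true
	fi
	return 0
}

# Success path: any success clears the counter and the tripped flag.
sketch_record_success() {
	SKETCH_FAILURES=0
	SKETCH_TRIPPED=false
	return 0
}

# Dispatch gate: exit status 0 means dispatch may proceed, 1 means skip.
sketch_check() {
	[ "$SKETCH_TRIPPED" != "true" ]
}
```

A caller would gate dispatch with something like `sketch_check || echo "breaker open, skipping dispatch"`.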

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Possibly related PRs

feat: add supervisor-helper.sh with SQLite schema and state machine (t128.1) #376: Introduced the supervisor-helper.sh CLI framework that this PR extends by adding the circuit-breaker module import and CLI command handler directly into the same file.

Poem

🔄 When tasks stumble three times in a row,
The circuit breaks—dispatch must slow,
A GitHub issue marks the pain,
Till cooldown chills and success flows again,
A guardian against cascading woe. 🛡️

gemini-code-assist · 2026-02-25T03:02:12Z

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a robust circuit breaker mechanism for the supervisor, designed to enhance system resilience and operational efficiency. By automatically detecting and responding to repeated task failures, it prevents resource exhaustion and provides clear, automated notifications through GitHub issues, allowing for timely intervention and resolution of underlying problems.

Highlights

- Supervisor Circuit Breaker Implementation: Introduced a new Bash script, circuit-breaker.sh, to implement a supervisor-level circuit breaker pattern. This system tracks consecutive task failures to prevent infinite retry loops.
- Automated Dispatch Pausing: The circuit breaker automatically pauses task dispatch when a configurable threshold of consecutive failures (default: 3) is reached, stopping the supervisor from wasting resources on persistently failing tasks.
- GitHub Issue Integration: Upon tripping, the circuit breaker creates or updates a GitHub issue with a circuit-breaker label, providing immediate visibility and actionable information to maintainers. It also closes the issue upon reset.
- Reset Mechanisms: The circuit breaker can be reset manually via a supervisor-helper.sh command or automatically after a configurable cooldown period (default: 30 minutes). Any successful task completion also resets the failure counter.
- State Management: The circuit breaker's state (consecutive failures, tripped status, timestamps) is persisted in a JSON file, ensuring continuity across supervisor runs.
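
The auto-reset rule described above comes down to comparing elapsed time with the cooldown. The sketch below works on epoch seconds to keep the arithmetic simple; the module itself stores ISO-8601 timestamps, and converting those is left out here because `date` parsing flags differ between platforms. `sketch_cooldown_elapsed` is a hypothetical helper, and 1800 seconds corresponds to the 30 minute default.

```shell
# Hypothetical helper: succeeds (exit 0) once the cooldown has elapsed.
# Arguments are epoch seconds; the third argument defaults to 1800 (30 min).
sketch_cooldown_elapsed() {
	local tripped_at_epoch="$1"
	local now_epoch="$2"
	local cooldown_secs="${3:-1800}"
	local elapsed=$((now_epoch - tripped_at_epoch))
	[ "$elapsed" -ge "$cooldown_secs" ]
}
```

For example, `sketch_cooldown_elapsed "$tripped_epoch" "$(date +%s)" && echo "auto-reset"` would reset only after the full cooldown, where `tripped_epoch` is an assumed variable holding the trip time.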

Changelog

.agents/scripts/supervisor/circuit-breaker.sh
- Added a new Bash script to implement the supervisor circuit breaker logic, including failure tracking, state persistence, dispatch control, and GitHub issue management.
todo/tasks/t1331-brief.md
- Added a new task brief documenting the requirements, approach, and acceptance criteria for the supervisor circuit breaker feature.

Activity

The pull request is marked as 'WIP - incremental commits', indicating ongoing development.
It references issue t1331: Supervisor circuit breaker #2264, suggesting it addresses a known problem or feature request.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

github-actions · 2026-02-25T03:02:53Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 58 code smells

[INFO] Recent monitoring activity:
Wed Feb 25 03:02:49 UTC 2026: Code review monitoring started
Wed Feb 25 03:02:49 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 58

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 58
VULNERABILITIES: 0

Generated on: Wed Feb 25 03:02:52 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

gemini-code-assist

Code Review

This pull request introduces a robust circuit breaker mechanism for the supervisor, which is a valuable addition for preventing resource wastage on consecutive task failures. The implementation correctly handles failure tracking, auto-reset with a configurable cooldown, manual reset, and integrates with GitHub issues for notifications. The overall logic is sound and addresses the problem effectively. However, there are several instances where command error output is suppressed using 2>/dev/null. This practice can hide critical debugging information related to jq parsing errors, date command failures, or gh CLI issues (e.g., authentication, network problems). The comments provided suggest modifications to remove 2>/dev/null or replace it with || true or || echo "0" to ensure that actual syntax or system errors remain visible for debugging, aligning with repository guidelines.

gemini-code-assist · 2026-02-25T03:04:47Z

.agents/scripts/supervisor/circuit-breaker.sh

+
+_cb_state_file() {
+	local dir="${SUPERVISOR_DIR:-${HOME}/.aidevops/.agent-workspace/supervisor}"
+	mkdir -p "$dir" 2>/dev/null || true


Suppressing all errors from mkdir -p with 2>/dev/null can hide important issues, such as permission problems if the directory already exists but is not writable. It's better to allow stderr to be visible for debugging.

Suggested change

-	mkdir -p "$dir" 2>/dev/null || true
+	mkdir -p "$dir" || true

References

Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.
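
To see what this suggestion changes in practice, the sketch below forces `mkdir -p` to fail by placing a regular file where a parent directory is expected, then captures stderr for both forms. The temporary paths and the `sketch_*` variable names are made up for the demonstration.

```shell
# Hypothetical demonstration: compare what reaches stderr in each form.
sketch_dir=$(mktemp -d)
: >"$sketch_dir/blocker" # regular file where a directory is expected
sketch_target="$sketch_dir/blocker/supervisor"

# Form from the PR: the error message is discarded.
sketch_quiet=$( { mkdir -p "$sketch_target" 2>/dev/null || true; } 2>&1)

# Suggested form: still non-fatal, but the message survives.
sketch_loud=$( { mkdir -p "$sketch_target" || true; } 2>&1)

printf 'suppressed: [%s]\n' "$sketch_quiet"
printf 'visible:    [%s]\n' "$sketch_loud"
rm -rf "$sketch_dir"
```

Both forms keep the script running; only the second leaves a message that explains why the directory could not be created.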

gemini-code-assist · 2026-02-25T03:04:47Z

.agents/scripts/supervisor/circuit-breaker.sh

+	state=$(cb_read_state)
+
+	local current_count
+	current_count=$(printf '%s' "$state" | jq -r '.consecutive_failures // 0' 2>/dev/null || echo "0")


Suppressing jq errors with 2>/dev/null can hide syntax errors in the jq filter or issues with the jq executable itself. The // operator handles missing keys, so 2>/dev/null is not needed for that purpose. It's better to let jq errors be visible for debugging.

Suggested change

-	current_count=$(printf '%s' "$state" | jq -r '.consecutive_failures // 0' 2>/dev/null || echo "0")
+	current_count=$(printf '%s' "$state" | jq -r '.consecutive_failures // 0' || echo "0")

References

In shell scripts with 'set -e' enabled, use '|| true' to prevent the script from exiting when a command like 'jq' fails on an optional lookup. Do not suppress stderr with '2>/dev/null' so that actual syntax or system errors remain visible for debugging.

gemini-code-assist · 2026-02-25T03:04:47Z

.agents/scripts/supervisor/circuit-breaker.sh

+	now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
+
+	local tripped
+	tripped=$(printf '%s' "$state" | jq -r '.tripped // false' 2>/dev/null || echo "false")


Suppressing jq errors with 2>/dev/null can hide syntax errors in the jq filter or issues with the jq executable itself. The // operator handles missing keys, so 2>/dev/null is not needed for that purpose. It's better to let jq errors be visible for debugging.

Suggested change

-	tripped=$(printf '%s' "$state" | jq -r '.tripped // false' 2>/dev/null || echo "false")
+	tripped=$(printf '%s' "$state" | jq -r '.tripped // false' || echo "false")

References

In shell scripts with 'set -e' enabled, use '|| true' to prevent the script from exiting when a command like 'jq' fails on an optional lookup. Do not suppress stderr with '2>/dev/null' so that actual syntax or system errors remain visible for debugging.

gemini-code-assist · 2026-02-25T03:04:47Z

.agents/scripts/supervisor/circuit-breaker.sh

+		--arg task "$task_id" \
+		--arg reason "$failure_reason" \
+		'.consecutive_failures = $count | .last_failure_at = $now | .last_failure_task = $task | .last_failure_reason = $reason' \
+		2>/dev/null)


Suppressing jq errors with 2>/dev/null can hide syntax errors in the jq filter or issues with the jq executable itself. It's better to let jq errors be visible for debugging.

Suggested change

-		2>/dev/null)
+	new_state=$(printf '%s' "$state" | jq \
+		--argjson count "$new_count" \
+		--arg now "$now" \
+		--arg task "$task_id" \
+		--arg reason "$failure_reason" \
+		'.consecutive_failures = $count | .last_failure_at = $now | .last_failure_task = $task | .last_failure_reason = $reason')

References

In shell scripts with 'set -e' enabled, use '|| true' to prevent the script from exiting when a command like 'jq' fails on an optional lookup. Do not suppress stderr with '2>/dev/null' so that actual syntax or system errors remain visible for debugging.

gemini-code-assist · 2026-02-25T03:04:47Z

.agents/scripts/supervisor/circuit-breaker.sh

+		new_state=$(printf '%s' "$new_state" | jq \
+			--arg now "$now" \
+			'.tripped = true | .tripped_at = $now' \
+			2>/dev/null)


Suppressing jq errors with 2>/dev/null can hide syntax errors in the jq filter or issues with the jq executable itself. It's better to let jq errors be visible for debugging.

Suggested change

-			2>/dev/null)
+		new_state=$(printf '%s' "$new_state" | jq \
+			--arg now "$now" \
+			'.tripped = true | .tripped_at = $now')

References

In shell scripts with 'set -e' enabled, use '|| true' to prevent the script from exiting when a command like 'jq' fails on an optional lookup. Do not suppress stderr with '2>/dev/null' so that actual syntax or system errors remain visible for debugging.

gemini-code-assist · 2026-02-25T03:04:48Z

.agents/scripts/supervisor/circuit-breaker.sh

+			2>/dev/null || true
+
+		local issue_url
+		issue_url=$(gh issue create \
+			--title "Supervisor circuit breaker tripped — ${failure_count} consecutive failures" \


Suppressing gh CLI errors with 2>/dev/null can hide issues with GitHub API access, authentication, or network problems. While || true makes it non-blocking, it's better to see the errors for debugging.

Suggested change

-			2>/dev/null || true
-
-		local issue_url
-		issue_url=$(gh issue create \
-			--title "Supervisor circuit breaker tripped — ${failure_count} consecutive failures" \
+		gh label create "circuit-breaker" \
+			--description "Supervisor circuit breaker tripped — dispatch paused" \
+			--color "D93F0B" \
+			--force || true

References

Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.

gemini-code-assist · 2026-02-25T03:04:48Z

.agents/scripts/supervisor/circuit-breaker.sh

+			2>/dev/null) || {
+			log_warn "circuit-breaker: failed to create GitHub issue"
+			return 1
+		}
+		log_info "circuit-breaker: created GitHub issue: $issue_url"


Suppressing gh CLI errors with 2>/dev/null can hide issues with GitHub API access, authentication, or network problems. While the || { ... } block catches the failure, 2>/dev/null prevents seeing the actual error message from gh.

Suggested change

-			2>/dev/null) || {
-			log_warn "circuit-breaker: failed to create GitHub issue"
-			return 1
-		}
-		log_info "circuit-breaker: created GitHub issue: $issue_url"
+		issue_url=$(gh issue create \
+			--title "Supervisor circuit breaker tripped — ${failure_count} consecutive failures" \
+			--body "$body" \
+			--label "circuit-breaker") || {

References

Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.

gemini-code-assist · 2026-02-25T03:04:48Z

.agents/scripts/supervisor/circuit-breaker.sh

+		--state open \
+		--json number \
+		--jq '.[0].number // empty' \
+		2>/dev/null) || existing_issue=""
+


Suppressing gh CLI errors with 2>/dev/null can hide issues with GitHub API access, authentication, or network problems. It's better to let these errors be visible for debugging.

Suggested change

-		--state open \
-		--json number \
-		--jq '.[0].number // empty' \
-		2>/dev/null) || existing_issue=""
+	existing_issue=$(gh issue list \
+		--label "circuit-breaker" \
+		--state open \
+		--json number \
+		--jq '.[0].number // empty') || existing_issue=""

References

Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.

gemini-code-assist · 2026-02-25T03:04:48Z

.agents/scripts/supervisor/circuit-breaker.sh

+	fi
+
+	gh issue close "$existing_issue" \
+		--comment "Circuit breaker reset: ${reason}" \


Suppressing gh CLI errors with 2>/dev/null can hide issues with GitHub API access, authentication, or network problems. While the || { ... } block catches the failure, 2>/dev/null prevents seeing the actual error message from gh.

Suggested change

-		--comment "Circuit breaker reset: ${reason}" \
+	gh issue close "$existing_issue" \
+		--comment "Circuit breaker reset: ${reason}" || {

References

Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.

gemini-code-assist · 2026-02-25T03:04:48Z

.agents/scripts/supervisor/circuit-breaker.sh

+		state=$(jq -n \
+			--argjson count "$CIRCUIT_BREAKER_THRESHOLD" \
+			--arg now "$now" \
+			--arg task "$task_id" \
+			--arg reason "$reason" \
+			'{consecutive_failures: $count, tripped: true, tripped_at: $now, last_failure_at: $now, last_failure_task: $task, last_failure_reason: $reason}')


Suppressing jq errors with 2>/dev/null can hide syntax errors in the jq filter or issues with the jq executable itself. It's better to let jq errors be visible for debugging.

Suggested change

-		state=$(jq -n \
-			--argjson count "$CIRCUIT_BREAKER_THRESHOLD" \
-			--arg now "$now" \
-			--arg task "$task_id" \
-			--arg reason "$reason" \
-			'{consecutive_failures: $count, tripped: true, tripped_at: $now, last_failure_at: $now, last_failure_task: $task, last_failure_reason: $reason}')
+		state=$(jq -n \
+			--argjson count "$CIRCUIT_BREAKER_THRESHOLD" \
+			--arg now "$now" \
+			--arg task "$task_id" \
+			--arg reason "$reason" \
+			'{consecutive_failures: $count, tripped: true, tripped_at: $now, last_failure_at: $now, last_failure_task: $task, last_failure_reason: $reason}')

References

In shell scripts with 'set -e' enabled, use '|| true' to prevent the script from exiting when a command like 'jq' fails on an optional lookup. Do not suppress stderr with '2>/dev/null' so that actual syntax or system errors remain visible for debugging.

New file: .agents/scripts/supervisor/circuit-breaker.sh
- Tracks consecutive task failures globally via state file
- Trips after N failures (default: 3, configurable via SUPERVISOR_CIRCUIT_BREAKER_THRESHOLD)
- Pauses dispatch when tripped, creates/updates GitHub issue with circuit-breaker label
- Manual reset via supervisor-helper.sh circuit-breaker reset
- Auto-reset after configurable cooldown (SUPERVISOR_CIRCUIT_BREAKER_COOLDOWN_SECS)
- Counter resets on any task success
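
A sketch of how the two environment variables named in this commit message might be read with defaults. The defaults (3 failures, 30 minutes) are the ones stated in the PR; the validation step and the `sketch_int_or_default` helper are assumptions added for illustration and may not match what the module does.

```shell
# Hypothetical helper: print $1 if it is a non-negative integer, else $2.
sketch_int_or_default() {
	case "$1" in
		'' | *[!0-9]*) printf '%s\n' "$2" ;;
		*) printf '%s\n' "$1" ;;
	esac
}

# Defaults from the PR description: 3 failures, 30 minutes (1800 seconds).
sketch_threshold=$(sketch_int_or_default "${SUPERVISOR_CIRCUIT_BREAKER_THRESHOLD:-}" 3)
sketch_cooldown=$(sketch_int_or_default "${SUPERVISOR_CIRCUIT_BREAKER_COOLDOWN_SECS:-}" 1800)
```

Falling back to the default on a malformed value keeps a typo in the environment from disabling the breaker or breaking later arithmetic.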

- Remove unused 'cooldown_secs' variable declaration in cb_status()
- Remove unused 'repo_path' variable in _cb_create_or_update_issue()
- Add blank lines before fenced code blocks in t1331-brief.md (MD031)

… into pulse

Remove blanket 2>/dev/null suppression from jq, gh, date, and mkdir commands in circuit-breaker.sh per Codacy review. Errors now flow to stderr for debugging while || fallbacks prevent script abort.

Integrate cb_record_success/cb_record_failure/cb_check into pulse.sh:
- Phase 1: record success on task completion, failure on retry/blocked
- Phase 2: gate dispatch on cb_check (skip if circuit breaker tripped)

Wire circuit-breaker subcommand in supervisor-helper.sh main router.

Ref #2264

augmentcode · 2026-02-25T14:26:44Z

🤖 Augment PR Summary

Summary: This PR introduces a supervisor-wide circuit breaker to pause dispatch after consecutive task failures.

Changes:

- Added .agents/scripts/supervisor/circuit-breaker.sh to persist breaker state and provide record_failure, record_success, check, status, and reset operations.
- Wired the module into supervisor-helper.sh (new circuit-breaker subcommand and module sourcing).
- Integrated breaker tracking into pulse.sh: reset counter on task completion, record failures on retry/blocked outcomes, and skip Phase 2 dispatch when the breaker is open.
- Implemented optional GitHub issue creation/update (label circuit-breaker) on trip and issue close on reset via gh.

Technical Notes: Threshold and cooldown are configurable via env vars; dispatch can auto-resume after cooldown or via manual reset.


augmentcode

Review completed. 5 suggestions posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2026-02-25T14:26:45Z

.agents/scripts/supervisor/circuit-breaker.sh

+	state_file=$(_cb_state_file)
+
+	if [[ -f "$state_file" ]]; then
+		cat "$state_file"


Because supervisor-helper.sh runs with set -euo pipefail, a read failure here (e.g., state file exists but isn’t readable) can abort the entire supervisor run even though the circuit breaker is intended to be non-blocking. Consider making state reads resilient (treat read errors as “no state” and continue).

Severity: high
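
One way to make the read resilient, sketched under assumptions: the function below treats a missing, unreadable, or empty file as "no state" and always returns 0. `sketch_read_state` and `SKETCH_DEFAULT_STATE` are hypothetical names, and the default document only guesses at the real one based on the field names quoted in this review.

```shell
# Hypothetical default state; the field names follow the jq filters quoted
# in this review, but the exact default document is an assumption.
SKETCH_DEFAULT_STATE='{"consecutive_failures":0,"tripped":false,"tripped_at":""}'

# Print the state file if it can be read, otherwise the default state.
# Always returns 0 so a caller running under `set -e` is not aborted.
sketch_read_state() {
	local state_file="$1"
	local content=""
	if [ -f "$state_file" ] && content=$(cat "$state_file") && [ -n "$content" ]; then
		printf '%s\n' "$content"
	else
		printf '%s\n' "$SKETCH_DEFAULT_STATE"
	fi
	return 0
}
```

Because `cat` runs inside an `if` condition, its failure does not trigger `set -e`, and its error message still reaches stderr.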


augmentcode · 2026-02-25T14:26:45Z

.agents/scripts/supervisor/circuit-breaker.sh

+
+	# Atomic write via temp file + mv
+	local tmp_file="${state_file}.tmp.$$"
+	printf '%s\n' "$state_json" >"$tmp_file"


With set -euo pipefail enabled in the caller, a write failure (permissions/disk full) on this redirection can exit the supervisor before the circuit-breaker functions can return 0. Consider explicitly handling write/move failures so the breaker can fail-open without killing dispatch.

Severity: high
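
A fail-open variant of the temp file plus `mv` write could look like the sketch below. `sketch_write_state` is a hypothetical name and the warning text is invented; the temp file naming mirrors the `${state_file}.tmp.$$` pattern in the hunk above.

```shell
# Hypothetical fail-open writer: temp file + mv, with every step guarded.
# Returns 0 even on failure; the warning goes to stderr for debugging.
sketch_write_state() {
	local state_file="$1"
	local state_json="$2"
	local tmp_file="${state_file}.tmp.$$"
	if ! printf '%s\n' "$state_json" >"$tmp_file"; then
		echo "circuit-breaker sketch: cannot write $tmp_file" >&2
		rm -f "$tmp_file"
		return 0
	fi
	if ! mv -f "$tmp_file" "$state_file"; then
		echo "circuit-breaker sketch: cannot move $tmp_file into place" >&2
		rm -f "$tmp_file"
		return 0
	fi
	return 0
}
```

Keeping the temp file next to the state file means both sit on the same filesystem, which is what lets `mv` replace the file in a single rename.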


augmentcode · 2026-02-25T14:26:45Z

.agents/scripts/supervisor/circuit-breaker.sh

+
+	# Update state with new failure count
+	local new_state
+	new_state=$(printf '%s' "$state" | jq \


This jq call is unguarded, so if it fails (missing jq, corrupted JSON, etc.) it can terminate the supervisor under set -euo pipefail before the subsequent -z "$new_state" check runs. Consider ensuring JSON update failures are non-fatal in all cb_* paths.

Severity: high

Other Locations

.agents/scripts/supervisor/circuit-breaker.sh:161

.agents/scripts/supervisor/circuit-breaker.sh:246
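
The failure mode described here is that a bare `new_state=$(...)` assignment exits the script under `set -e` before the following `-z` check can run. A minimal sketch of a guarded form follows; `sketch_apply` is a hypothetical wrapper that accepts any transform command, standing in for the jq invocation so the example has no jq dependency.

```shell
# Hypothetical wrapper: pipe the state through a transform command and keep
# the original state if the transform fails or prints nothing.
# In the module under review the transform would be a jq invocation.
sketch_apply() {
	local state="$1"
	shift
	local new_state=""
	# The `|| new_state=""` guard keeps `set -e` from aborting here.
	new_state=$(printf '%s' "$state" | "$@") || new_state=""
	if [ -z "$new_state" ]; then
		printf '%s\n' "$state"
	else
		printf '%s\n' "$new_state"
	fi
	return 0
}
```

For example, `sketch_apply "$state" jq '.tripped = true'` would return the updated document when jq succeeds and the unchanged one when jq is missing or the JSON is corrupt.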


augmentcode · 2026-02-25T14:26:45Z

.agents/scripts/supervisor/circuit-breaker.sh

+	local tripped count tripped_at last_failure last_reset
+	tripped=$(printf '%s' "$state" | jq -r '.tripped // false' || echo "false")
+	count=$(printf '%s' "$state" | jq -r '.consecutive_failures // 0' || echo "0")
+	tripped_at=$(printf '%s' "$state" | jq -r '.tripped_at // "never"' || echo "never")


The state defaults use empty-string timestamps, but jq’s // won’t replace empty strings, so tripped_at/last_reset_at may print blank (and the cooldown math may try to parse an empty date). Consider normalizing empty strings to a sentinel like never before display/math.

Severity: medium
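
One way to follow this suggestion in shell, after the value has been extracted: jq's `//` only substitutes for `null` or `false`, so an empty string passes through unchanged, whereas a plain emptiness test in the shell catches it. `sketch_normalize_ts` is a hypothetical helper; it also maps the literal `null` that `jq -r` prints for a null value.

```shell
# Hypothetical normalizer: map empty or literal "null" timestamps to "never".
sketch_normalize_ts() {
	local ts="${1:-}"
	if [ -z "$ts" ] || [ "$ts" = "null" ]; then
		printf 'never\n'
	else
		printf '%s\n' "$ts"
	fi
}
```

Cooldown arithmetic would then be skipped whenever the normalized value is `never`, rather than attempting to parse an empty date.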


augmentcode · 2026-02-25T14:26:45Z

.agents/scripts/supervisor/pulse.sh

 			retry)
 				log_warn "  $tid: RETRY ($outcome_detail)"
+				# t1331: Record failure for circuit breaker (consecutive failure tracking)
+				cb_record_failure "$tid" "$outcome_detail" 2>>"$SUPERVISOR_LOG" || true


Right now the breaker increments on retry/blocked, but the `failed)` branch later in this same case statement doesn't record a failure, so repeated hard failures may never trip the breaker. Consider whether failed (or at least non-environment failed) should also feed cb_record_failure to match "consecutive task failures" semantics.

Severity: medium
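
The question raised here is which outcomes should count toward the breaker. A sketch of one possible mapping is below; it is not what the PR merged. `sketch_breaker_event` is hypothetical, the `retry`, `blocked`, and `failed` labels come from this comment, and `complete` is an assumed label for task completion.

```shell
# Hypothetical classifier for pulse outcomes. Whether `failed` should count
# is the open question raised in the comment above; in this sketch it does.
sketch_breaker_event() {
	case "$1" in
		complete) printf 'success\n' ;;
		retry | blocked | failed) printf 'failure\n' ;;
		*) printf 'ignore\n' ;;
	esac
}
```

Centralizing the mapping in one function would make it harder for a new outcome branch to bypass the breaker unnoticed.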


github-actions · 2026-02-25T14:27:22Z

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 71 code smells

[INFO] Recent monitoring activity:
Wed Feb 25 14:27:18 UTC 2026: Code review monitoring started
Wed Feb 25 14:27:18 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 71

📈 Current Quality Metrics

BUGS: 0
CODE SMELLS: 71
VULNERABILITIES: 0

Generated on: Wed Feb 25 14:27:21 UTC 2026

Generated by AI DevOps Framework Code Review Monitoring

sonarqubecloud · 2026-02-25T14:28:02Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

gemini-code-assist bot reviewed Feb 25, 2026


This was referenced Feb 25, 2026

t1331: Supervisor circuit breaker #2264

Closed

t1329: Cross-review judge pipeline and /cross-review slash command #2262

Closed

[Supervisor:marcusquinn] 0 queued, 0 working, 7 in review, 7 blocked at 09:17 UTC #2199

Closed

marcusquinn force-pushed the feature/t1331 branch from da2f1d5 to a309c59 Compare February 25, 2026 14:22

marcusquinn marked this pull request as ready for review February 25, 2026 14:22

marcusquinn added 3 commits February 25, 2026 14:25

fix: resolve Codacy issues — remove unused vars, fix MD031 blank lines

2cc1fe3

- Remove unused 'cooldown_secs' variable declaration in cb_status()
- Remove unused 'repo_path' variable in _cb_create_or_update_issue()
- Add blank lines before fenced code blocks in t1331-brief.md (MD031)

marcusquinn force-pushed the feature/t1331 branch from a309c59 to a150e85 Compare February 25, 2026 14:26

augmentcode bot reviewed Feb 25, 2026


marcusquinn merged commit 9bd2029 into main Feb 25, 2026
14 checks passed
