347 changes: 347 additions & 0 deletions .github/scripts/ci_failure_bot.py
@@ -0,0 +1,347 @@
#!/usr/bin/env python3
"""
CI Failure Bot - AI-powered analysis of build failures using Gemini
"""

import io
import json
Member

What happens if a contributor breaks this logic? If we don't add any automated tests for this, it could stop working at any time and we wouldn't even know until the change is merged to master and new PRs are opened. We need some tests for this that run only in openwisp-utils.

Author

@nemesifier, I've added comprehensive documentation following the structure you linked and a full test suite that covers all of the bot's functions. Now if someone breaks the logic, the tests will catch it before it reaches master.
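A minimal sketch of what such a regression test could look like (hypothetical module path and test names; it assumes the script's dependencies are importable in the test environment and may differ from the suite actually added in this PR):

# Hypothetical sketch, not the actual test suite added in this PR.
import unittest
from unittest import mock


class TestCIFailureBotInit(unittest.TestCase):
    @mock.patch.dict("os.environ", {}, clear=True)
    def test_missing_env_vars_exits(self):
        # With no environment variables set, __init__ should print the
        # missing names and exit with status 1 before touching any API.
        from ci_failure_bot import CIFailureBot

        with self.assertRaises(SystemExit) as ctx:
            CIFailureBot()
        self.assertEqual(ctx.exception.code, 1)


if __name__ == "__main__":
    unittest.main()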

import os
import sys
import zipfile

import requests
from github import Github, GithubException
from google import genai


class CIFailureBot:
    def __init__(self):
        self.github_token = os.environ.get("GITHUB_TOKEN")
        self.gemini_api_key = os.environ.get("GEMINI_API_KEY")
        self.workflow_run_id = os.environ.get("WORKFLOW_RUN_ID")
        self.repository_name = os.environ.get("REPOSITORY")
        self.pr_number = os.environ.get("PR_NUMBER")

        if not all(
            [
                self.github_token,
                self.gemini_api_key,
                self.workflow_run_id,
                self.repository_name,
            ]
        ):
            missing = []
            if not self.github_token:
                missing.append("GITHUB_TOKEN")
            if not self.gemini_api_key:
                missing.append("GEMINI_API_KEY")
            if not self.workflow_run_id:
                missing.append("WORKFLOW_RUN_ID")
            if not self.repository_name:
                missing.append("REPOSITORY")
            print(f"Missing required environment variables: {', '.join(missing)}")
            sys.exit(1)

        try:
            self.workflow_run_id = int(self.workflow_run_id)
        except ValueError:
            print("Invalid WORKFLOW_RUN_ID: must be numeric")
            sys.exit(1)

        self.github = Github(self.github_token)
        self.repo = self.github.get_repo(self.repository_name)

        # Initialize Gemini client with new API
        self.client = genai.Client(api_key=self.gemini_api_key)
        self.model_name = os.environ.get("GEMINI_MODEL", "gemini-2.5-flash")

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Google Gemini available models gemini-2.5-flash 2025 2026 API

💡 Result:

Short answer: gemini-2.5-flash is a GA model (released June 17, 2025) and is available via Google’s Gemini API and Vertex AI. Google lists its discontinuation/earliest retirement date as June 17, 2026, so you should plan to migrate before then and check the docs for any newer models (e.g., Gemini 3 family) or changes. [1][2][3]

Sources

  • Vertex AI / Gemini 2.5 Flash model page (release, GA status, regions, retire date). [1]
  • Gemini API changelog (release notes around June–Dec 2025). [2]
  • Gemini deprecations table (model release/earliest shutdown dates including gemini-2.5-flash). [3]

Plan migration of Gemini model before June 17, 2026 deprecation date.

The model "gemini-2.5-flash" is valid and currently GA, but has an earliest retirement date of June 17, 2026. Since the code will stop working after that date, consider migrating to a newer model (e.g., Gemini 3 family or later 2.x models) and update the fallback to ensure long-term compatibility.

🤖 Prompt for AI Agents
In @.github/scripts/ci_failure_bot.py at line 56, The default Gemini model
fallback self.model_name = os.environ.get("GEMINI_MODEL", "gemini-2.5-flash")
will retire on 2026-06-17; update the fallback to a currently supported
successor (e.g., a Gemini 3 family model or a later 2.x GA model) and ensure
GEMINI_MODEL remains overridable via env; change the default string used in the
os.environ.get call (symbol: self.model_name) to the new model name, and add a
short code comment referencing the migration/deprecation date so future
maintainers know why the default was updated.
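A sketch of what the suggested change could look like; the successor model ID below is a placeholder, not a confirmed model name:

# Hypothetical: replace the placeholder with whichever GA successor is
# current at migration time; gemini-2.5-flash retires on 2026-06-17.
self.model_name = os.environ.get("GEMINI_MODEL", "<successor-model-id>")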


    def get_build_logs(self):
        """Get actual build logs and error output from failed jobs"""
        try:
            workflow_run = self.repo.get_workflow_run(self.workflow_run_id)
            jobs = workflow_run.jobs()

            build_logs = []
            for job in jobs:
                if job.conclusion == "failure":
                    # Get job logs URL and fetch content
                    logs_url = job.logs_url
                    if logs_url:
                        headers = {
                            "Authorization": f"token {self.github_token}",
                            "Accept": "application/vnd.github.v3+json",
                        }
                        response = requests.get(logs_url, headers=headers, timeout=30)
                        response.raise_for_status()

                        # Handle ZIP archive response from GitHub Actions logs API
                        raw = response.content
                        if raw[:2] == b"PK":  # ZIP file signature
                            with zipfile.ZipFile(io.BytesIO(raw)) as zf:
                                parts = []
                                for name in zf.namelist():
                                    if name.endswith(".txt"):
                                        parts.append(
                                            zf.read(name).decode("utf-8", "replace")
                                        )
                                log_text = "\n".join(parts).strip()
                        else:
                            log_text = raw.decode("utf-8", "replace")

                        if len(log_text) > 5000:
                            # Take first 2000 and last 3000 chars for better context
                            log_text = (
                                log_text[:2000]
                                + "\n\n[...middle truncated...]\n\n"
                                + log_text[-3000:]
                            )
                        build_logs.append(
                            {
                                "job_name": job.name,
                                "logs": log_text,
                            }
                        )

                    # Also get step details
                    for step in job.steps:
                        if step.conclusion == "failure":
                            build_logs.append(
                                {
                                    "job_name": job.name,
                                    "step_name": step.name,
                                    "step_number": step.number,
                                }
                            )

            return build_logs
        except (GithubException, requests.RequestException, ValueError) as e:
            print(f"Error getting build logs: {e}")
            return []

    def get_pr_diff(self):
        """Get the PR diff/changes if PR exists"""
        if not self.pr_number or self.pr_number.strip() == "":
            return None

        try:
            pr_num = int(self.pr_number)
            pr = self.repo.get_pull(pr_num)

            # Get diff content
            diff_url = pr.diff_url
            headers = {
                "Authorization": f"token {self.github_token}",
                "Accept": "application/vnd.github.v3.diff",
            }
            response = requests.get(diff_url, headers=headers, timeout=30)
            if response.status_code == 200:
                diff_text = response.text
                if len(diff_text) > 8000:
                    # Take first 4000 and last 4000 chars for context
                    diff_text = (
                        diff_text[:4000]
                        + "\n\n[...middle truncated...]\n\n"
                        + diff_text[-4000:]
                    )

                return {
                    "title": pr.title,
                    "body": pr.body or "",
                    "diff": diff_text,
                }
        except (GithubException, requests.RequestException, ValueError) as e:
            print(f"Error getting PR diff: {e}")

        return None

    def get_workflow_yaml(self):
        """Get the workflow YAML configuration"""
        try:
            workflow_run = self.repo.get_workflow_run(self.workflow_run_id)
            workflow_path = workflow_run.path

            # Get workflow file content from the commit that ran
            workflow_file = self.repo.get_contents(
                workflow_path, ref=workflow_run.head_sha
            )
            return workflow_file.decoded_content.decode("utf-8")
        except GithubException as e:
            print(f"Error getting workflow YAML: {e}")
            return None

    def analyze_with_gemini(self, build_logs, pr_diff, workflow_yaml):
        """Send context to Gemini for intelligent analysis"""

        # Prepare context for Gemini
        project_name = self.repository_name.split("/")[-1]
        repo_url = f"https://github.com/{self.repository_name}"
        qa_checks_url = f"{repo_url}/blob/master/openwisp-qa-check"
        runtests_url = f"{repo_url}/blob/master/runtests"
Comment on lines +178 to +179

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

PyGithub Repository class default_branch property documentation

💡 Result:

The Repository.default_branch property is a read-only string property that returns the repository’s default branch name (e.g., "main"). To change it use repo.edit(default_branch='branch-name') rather than assigning to the attribute. [1][2]

Example:
from github import Github
repo = Github().get_repo('owner/repo')
print(repo.default_branch) # returns string
repo.edit(default_branch='release') # to change the default branch

Sources:
[1] PyGithub Repository docs — property default_branch.
[2] StackOverflow: default_branch is read-only; use repo.edit(...).


Replace hardcoded "master" branch with dynamic default branch resolution.

Lines 154-155 hardcode the "master" branch name in URLs. Repositories using "main" or other default branch names will have broken links. Use PyGithub's default_branch property to dynamically resolve the branch name.

🔧 Proposed fix
         repo_url = f"https://github.com/{self.repository_name}"
-        qa_checks_url = f"{repo_url}/blob/master/openwisp-qa-check"
-        runtests_url = f"{repo_url}/blob/master/runtests"
+        default_branch = self.repo.default_branch
+        qa_checks_url = f"{repo_url}/blob/{default_branch}/openwisp-qa-check"
+        runtests_url = f"{repo_url}/blob/{default_branch}/runtests"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
qa_checks_url = f"{repo_url}/blob/master/openwisp-qa-check"
runtests_url = f"{repo_url}/blob/master/runtests"
repo_url = f"https://github.com/{self.repository_name}"
default_branch = self.repo.default_branch
qa_checks_url = f"{repo_url}/blob/{default_branch}/openwisp-qa-check"
runtests_url = f"{repo_url}/blob/{default_branch}/runtests"
🤖 Prompt for AI Agents
In @.github/scripts/ci_failure_bot.py around lines 154 - 155, The hardcoded
"master" in qa_checks_url and runtests_url breaks links for repos with other
default branches; fetch the repository's default branch via the PyGithub Repo
object's default_branch property (e.g., repo.default_branch) and use that value
instead of the literal "master" when formatting repo_url-derived links (update
the expressions that build qa_checks_url and runtests_url to interpolate the
resolved default_branch); also handle missing/default_branch gracefully
(fallback to "master" or raise a clear error) to avoid runtime failures.


Comment on lines +175 to +180

⚠️ Potential issue | 🟡 Minor

Consider making branch name and repository paths dynamic or configurable.

While openwisp-qa-check and runtests currently exist at the repository root and the project uses master as the default branch, the hardcoded branch name and paths in lines 154-155 create brittleness:

  • If the project switches to main (increasingly common), these URLs become invalid
  • If file locations or names change, the links become stale without code updates
  • The method accepts self.repository_name as a parameter, suggesting potential reuse across projects

Consider detecting the default branch dynamically via the GitHub API or making these paths configurable.

🤖 Prompt for AI Agents
In @.github/scripts/ci_failure_bot.py around lines 151 - 156, The hardcoded
branch and file paths in repo URL construction (variables repo_url,
qa_checks_url, runtests_url using self.repository_name and "master") make links
brittle; update the class to accept configurable branch and path names (e.g.,
add optional parameters/defaults for branch, qa_path, runtests_path) or detect
the repository default branch via the GitHub API and use that value when
building repo_url/qa_checks_url/runtests_url, falling back to a sensible
default; ensure updates are applied where project_name, repo_url, qa_checks_url,
and runtests_url are computed so links remain valid across repos and branch name
changes.
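
One possible shape for that, sketched with purely illustrative environment variable names (the `default_branch` property is documented PyGithub API, per the query above):

# Hypothetical sketch: branch and paths made configurable, with the GitHub
# API's default branch as fallback (env var names are illustrative only).
default_branch = os.environ.get("DOCS_BRANCH") or self.repo.default_branch or "master"
qa_check_path = os.environ.get("QA_CHECK_PATH", "openwisp-qa-check")
runtests_path = os.environ.get("RUNTESTS_PATH", "runtests")
qa_checks_url = f"{repo_url}/blob/{default_branch}/{qa_check_path}"
runtests_url = f"{repo_url}/blob/{default_branch}/{runtests_path}"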

Comment on lines +176 to +180

⚠️ Potential issue | 🟠 Major

Hardcoded OpenWISP-specific paths limit reusability.

The URLs for openwisp-qa-check and runtests are hardcoded and assume a specific repository structure:

  • Line 154: /blob/master/openwisp-qa-check
  • Line 155: /blob/master/runtests

If this bot is intended exclusively for OpenWISP repositories, this is acceptable. However, if there's any intent to reuse this bot in other projects, consider:

  1. Making these paths configurable via environment variables
  2. Checking if files exist before including their URLs
  3. Making the prompt more generic

Based on the PR objectives mentioning "OpenWISP Utils" specifically, is this bot designed exclusively for OpenWISP repositories, or should it be more generic?

        # Build the context string with proper line breaks
        build_logs_json = json.dumps(build_logs, indent=2)
        if pr_diff:
            pr_diff_json = json.dumps(pr_diff, indent=2)
        else:
            pr_diff_json = "No PR associated"

        # Gemini prompt - ignore line length for readability
        context = f"""
### ROLE
You are the "Automated Maintainer Gatekeeper." Your goal is to analyze Pull Request (PR) build failures and provide direct, technically accurate, and no-nonsense feedback to contributors. # noqa: E501

### INPUT CONTEXT PROVIDED
1. **Build Output/Logs:** {build_logs_json}
2. **YAML Workflow:** {workflow_yaml or "Not available"}
3. **PR Diff:** {pr_diff_json}
4. **Project Name:** {project_name} # noqa: E501
5. **Repository:** {repo_url}
6. **run-qa-checks:** {qa_checks_url} # noqa: E501
7. **runtests:** {runtests_url} # noqa: E501

### TASK
Analyze the provided context to determine why the build failed. Categorize the failure and respond according to the "Tone Guidelines" below. # noqa: E501

### TONE GUIDELINES
- **Direct & Honest:** Do not use "fluff" or overly polite corporate language.
- **Firm Standards:** If a PR is low-effort, spammy, or fails to follow basic instructions, state that clearly.
- **Action-Oriented:** Provide the exact command or file change needed to fix the error, unless the PR is spammy, in which case we should just declare the PR as potential SPAM and ask maintainers to manually review it. # noqa: E501

### RESPONSE STRUCTURE
1. **Status Summary:** A one-sentence blunt assessment of the failure.
2. **Technical Diagnosis:**
- Identify the specific line/test that failed.
- Explain *why* it failed.
3. **Required Action:** Provide a code block or specific steps the contributor must take.
4. **Quality Warning (If Applicable):** If the PR appears to be "spam" (e.g., trivial README changes, AI-generated nonsense, or repeated basic errors), include a firm statement that such contributions are a drain on project resources and ping the maintainers asking them for manual review. # noqa: E501

### EXAMPLE RESPONSE STYLE
"The build failed because you neglected to update the test suite to match your logic changes. This project does not accept functional changes without corresponding test updates. Refer to the log at line 452. Update tests/logic_test.py before re-submitting. We prioritize high-quality, ready-to-merge code; please ensure you run local tests before pushing." # noqa: E501

Analyze the failure and provide your response:
"""
Comment on lines +189 to +222

⚠️ Potential issue | 🟡 Minor

Remove noqa comments from the prompt string.

The prompt contains # noqa: E501 comments (lines 191, 197, 199, 200, 203, 208, 216, 219) intended for linters, but these will be included in the actual text sent to Gemini. These comments serve no purpose in the runtime prompt and should be removed.

🧹 Proposed fix to remove noqa comments
         context = f"""
 ### ROLE
-You are the "Automated Maintainer Gatekeeper." Your goal is to analyze Pull Request (PR) build failures and provide direct, technically accurate, and no-nonsense feedback to contributors.  # noqa: E501
+You are the "Automated Maintainer Gatekeeper." Your goal is to analyze Pull Request (PR) build failures and provide direct, technically accurate, and no-nonsense feedback to contributors.

 ### INPUT CONTEXT PROVIDED
 1. **Build Output/Logs:** {build_logs_json}
 2. **YAML Workflow:** {workflow_yaml or "Not available"}
 3. **PR Diff:** {pr_diff_json}
-4. **Project Name:** {project_name}  # noqa: E501
+4. **Project Name:** {project_name}
 5. **Repository:** {repo_url}
-6. **run-qa-checks:** {qa_checks_url}  # noqa: E501
-7. **runtests:** {runtests_url}  # noqa: E501
+6. **run-qa-checks:** {qa_checks_url}
+7. **runtests:** {runtests_url}

 ### TASK
-Analyze the provided context to determine why the build failed. Categorize the failure and respond according to the "Tone Guidelines" below.  # noqa: E501
+Analyze the provided context to determine why the build failed. Categorize the failure and respond according to the "Tone Guidelines" below.

 ### TONE GUIDELINES
 - **Direct & Honest:** Do not use "fluff" or overly polite corporate language.
 - **Firm Standards:** If a PR is low-effort, spammy, or fails to follow basic instructions, state that clearly.
-- **Action-Oriented:** Provide the exact command or file change needed to fix the error, unless the PR is spammy, in which case we should just declare the PR as potential SPAM and ask maintainers to manually review it.  # noqa: E501
+- **Action-Oriented:** Provide the exact command or file change needed to fix the error, unless the PR is spammy, in which case we should just declare the PR as potential SPAM and ask maintainers to manually review it.

 ### RESPONSE STRUCTURE
 1. **Status Summary:** A one-sentence blunt assessment of the failure.
 2. **Technical Diagnosis:**
    - Identify the specific line/test that failed.
    - Explain *why* it failed.
 3. **Required Action:** Provide a code block or specific steps the contributor must take.
-4. **Quality Warning (If Applicable):** If the PR appears to be "spam" (e.g., trivial README changes, AI-generated nonsense, or repeated basic errors), include a firm statement that such contributions are a drain on project resources and ping the maintainers asking them for manual review.  # noqa: E501
+4. **Quality Warning (If Applicable):** If the PR appears to be "spam" (e.g., trivial README changes, AI-generated nonsense, or repeated basic errors), include a firm statement that such contributions are a drain on project resources and ping the maintainers asking them for manual review.

 ### EXAMPLE RESPONSE STYLE
-"The build failed because you neglected to update the test suite to match your logic changes. This project does not accept functional changes without corresponding test updates. Refer to the log at line 452. Update tests/logic_test.py before re-submitting. We prioritize high-quality, ready-to-merge code; please ensure you run local tests before pushing."  # noqa: E501
+"The build failed because you neglected to update the test suite to match your logic changes. This project does not accept functional changes without corresponding test updates. Refer to the log at line 452. Update tests/logic_test.py before re-submitting. We prioritize high-quality, ready-to-merge code; please ensure you run local tests before pushing."

 Analyze the failure and provide your response:
 """
🤖 Prompt for AI Agents
In @.github/scripts/ci_failure_bot.py around lines 189 - 222, The prompt string
assigned to the variable context in .github/scripts/ci_failure_bot.py includes
inline linter comments "# noqa: E501" which will be sent to the LLM; remove all
occurrences of these comments from the f-string so the runtime prompt is clean.
Edit the context f-string (the multi-line prompt) to delete every "# noqa: E501"
token (e.g., after the ROLE line and any lines referencing project_name,
repo_url, qa_checks_url, runtests_url, and RESPONSE STRUCTURE) so the content
sent to Gemini contains only the intended prompt text.


        try:
            # Use new Gemini client API
            from google.genai import types

            response = self.client.models.generate_content(
                model=self.model_name, contents=types.Part.from_text(context)
            )
            return response.text
        except (ValueError, ConnectionError, Exception) as e:
            print(f"Error calling Gemini API: {e}")
            return self.fallback_response()
Comment on lines +172 to +234

⚠️ Potential issue | 🟡 Minor

Fork PR security check is already implemented; privacy implications are documented.

Fork PRs are explicitly skipped (lines 287-293) with a comment confirming this is for security to avoid sending external code to AI. Privacy considerations are also already documented in docs/user/ci-failure-bot.rst, which warns that "PR diffs and build logs are sent to Google's Gemini AI service" and recommends organizations with sensitive codebases be aware of this.

Remove unnecessary noqa comments from the Gemini prompt.

Lines 189–222 include # noqa: E501 comments within the prompt string sent to Gemini. These should be removed since they're runtime data, not code being linted, and serve no purpose in the API request.

Fix hard-coded master branch reference and broad exception catch.

  • Lines 176–177 assume the repository uses a master branch for the URLs, which may not match if the default branch differs.
  • Line 232 catches the broad Exception class instead of specific exception types, which bypasses linter checks.
🧰 Tools
🪛 Ruff (0.14.10)

231-231: Consider moving this statement to an else block

(TRY300)


232-232: Do not catch blind exception: Exception

(BLE001)
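
A sketch of how that call could be restructured to satisfy both hints; it assumes the installed google-genai version exposes error classes under google.genai.errors, so the exact exception names should be verified against the SDK in use:

# Hypothetical sketch addressing TRY300 and BLE001.
from google.genai import errors, types

try:
    response = self.client.models.generate_content(
        model=self.model_name, contents=types.Part.from_text(context)
    )
except (ValueError, ConnectionError, errors.APIError) as e:
    # Catch only the failures expected from the SDK instead of bare Exception
    print(f"Error calling Gemini API: {e}")
    return self.fallback_response()
else:
    # TRY300: return the successful result from an else block
    return response.text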


    def fallback_response(self):
        """Fallback response if Gemini fails"""
        return """
## CI Build Failed

The automated analysis is temporarily unavailable. Please check the CI logs above for specific error details.

Common fixes:
- Run `openwisp-qa-format` for code style issues
- Run `./runtests` locally to debug test failures
- Check dependencies for setup issues

See: https://openwisp.io/docs/dev/developer/contributing.html
"""

    def post_comment(self, message):
        """Post or update comment on PR"""
        if not self.pr_number or self.pr_number.strip() == "":
            print("No PR number, skipping comment")
            return

        # Add consistent marker for deduplication
        marker = "<!-- ci-failure-bot-comment -->"
        message_with_marker = f"{marker}\n{message}"

        try:
            pr_num = int(self.pr_number)
            pr = self.repo.get_pull(pr_num)

            # Check for existing bot comments to avoid duplicates
            bot_login = self.github.get_user().login
            existing_comments = pr.get_issue_comments()

            for comment in existing_comments:
                if comment.user.login == bot_login and marker in comment.body:
                    print("Bot comment already exists, updating it")
                    comment.edit(message_with_marker)
                    return

            # No existing comment, create new one
            pr.create_issue_comment(message_with_marker)
            print(f"Posted comment to PR #{pr_num}")
        except (GithubException, ValueError) as e:
            print(f"Error posting comment: {e}")

    def run(self):
        """Main execution flow"""
        try:
            print("CI Failure Bot starting - AI-powered analysis")

            # Security checks: Skip if this is a dependabot PR or fork PR
            try:
                workflow_run = self.repo.get_workflow_run(self.workflow_run_id)
                if (
                    workflow_run.actor
                    and "dependabot" in workflow_run.actor.login.lower()
                ):
                    print(f"Skipping dependabot PR from {workflow_run.actor.login}")
                    return

                # Skip fork PRs for security (avoid sending external code to AI)
                if self.pr_number and self.pr_number.strip():
                    try:
                        pr_num = int(self.pr_number)
                        pr = self.repo.get_pull(pr_num)
                        if pr.head.repo.full_name != self.repository_name:
                            print(f"Skipping fork PR from {pr.head.repo.full_name}")
                            return
                    except (GithubException, ValueError) as e:
                        print(f"Warning: Could not check fork status: {e}")

Comment on lines +296 to +306

⚠️ Potential issue | 🟡 Minor

Handle deleted fork repositories gracefully.

If a PR's head repository has been deleted (common with forks), pr.head.repo will be None, causing pr.head.repo.full_name on line 301 to raise an AttributeError. While the exception is caught generically, it's better to check explicitly.

🛡️ Proposed fix to check for deleted repos
                 if self.pr_number and self.pr_number.strip():
                     try:
                         pr_num = int(self.pr_number)
                         pr = self.repo.get_pull(pr_num)
-                        if pr.head.repo.full_name != self.repository_name:
-                            print(f"Skipping fork PR from {pr.head.repo.full_name}")
+                        if pr.head.repo and pr.head.repo.full_name != self.repository_name:
+                            print(f"Skipping fork PR from {pr.head.repo.full_name}")
+                            return
+                        elif not pr.head.repo:
+                            print("Skipping PR with deleted head repository")
                             return
                     except (GithubException, ValueError) as e:
                         print(f"Warning: Could not check fork status: {e}")

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @.github/scripts/ci_failure_bot.py around lines 296 - 306, The current fork
check accesses pr.head.repo.full_name which will raise when pr.head.repo is None
for deleted fork repositories; update the block that handles self.pr_number/pr
to explicitly check if pr.head.repo is None before accessing full_name and treat
that case as a deleted fork (print a clear message and return/skip), otherwise
compare pr.head.repo.full_name to self.repository_name as before; keep the
existing except (GithubException, ValueError) handling for API errors but avoid
relying on it for the deleted-repo case by performing the explicit None check on
pr.head.repo.

            except (GithubException, AttributeError) as e:
                print(f"Warning: Could not check actor: {e}")

            # Get all context
            build_logs = self.get_build_logs()
            pr_diff = self.get_pr_diff()
            workflow_yaml = self.get_workflow_yaml()

            if not build_logs:
                print("No build logs found")
                return

            print("Analyzing failure with Gemini AI...")

            # Get AI analysis
            ai_response = self.analyze_with_gemini(build_logs, pr_diff, workflow_yaml)

            # Post intelligent comment
            self.post_comment(ai_response)

            print("CI Failure Bot completed successfully")

        except Exception as e:
            print(f"CRITICAL ERROR in CI Failure Bot: {e}")
            print(f"Error type: {type(e).__name__}")
            import traceback

            traceback.print_exc()
            sys.exit(1)


if __name__ == "__main__":
    try:
        bot = CIFailureBot()
        bot.run()
    except Exception as e:
        print(f"FATAL: CI Failure Bot crashed: {e}")
        import traceback

        traceback.print_exc()
        sys.exit(1)
51 changes: 51 additions & 0 deletions .github/workflows/ci-failure-bot.yml
@@ -0,0 +1,51 @@
---
name: CI Failure Bot

on:
  workflow_run:
    workflows: ["OpenWISP Utils CI Build"]
    types:
      - completed

permissions:
  issues: write
  pull-requests: write
  contents: read

jobs:
  ci-failure-bot:
    runs-on: ubuntu-latest
    if: ${{ github.event.workflow_run.conclusion == 'failure' && !contains(github.event.workflow_run.actor.login, 'dependabot') }}

    steps:
      - name: Checkout repository
        uses: actions/checkout@v6

      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: "3.11"

      - name: Install dependencies
        # Install minimum required versions
        run: |
          pip install requests>=2.32.5 PyGithub>=2.0.0 google-genai>=0.2.0
Comment on lines +29 to +32

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

latest version requests python pip package 2025

💡 Result:

As of January 9, 2026 the latest released version of the requests PyPI package is 2.32.5 (uploaded Aug 18, 2025). [1][2]

Sources:

  • PyPI requests project (release files and upload date). [1]
  • Safety DB listing (latest version 2.32.5). [2]

🌐 Web query:

latest version PyGithub python pip package 2025

💡 Result:

The latest PyPI release of PyGithub is 2.8.1 (uploaded Sep 2, 2025). [1][2]

References:
[1] PyPI project page — PyGithub (pygithub 2.8.1).
[2] PyGithub changelog / documentation (version 2.8.1).


🌐 Web query:

latest version google-genai python pip package 2025

💡 Result:

The latest google-genai PyPI release in 2025 was version 1.8.0 (uploaded March 26, 2025). [1]

Sources:

  • PyPI release page for google-genai (1.8.0). [1]

🌐 Web query:

security vulnerabilities requests 2.32.5 PyGithub 2.0.0 google-genai 0.2.0

💡 Result:

Summary — current public info (searched Jan 9, 2026):

  • requests 2.32.5 — No known unpatched CVEs in 2.32.5; vulnerabilities like CVE-2024-35195 (Session verify=False behavior) and CVE-2024-47081 were fixed in the 2.32.x line (upgrade to ≥2.32.0/2.32.4+ as advised). [1][2][3]

  • PyGithub 2.0.0 — I found no public CVE/official security advisory specifically affecting PyGithub 2.0.0; major vulnerability databases list no direct CVEs for current PyGithub releases (still recommend using a maintained/latest release). [4][5]

  • google-genai 0.2.0 — I found no public CVEs or security advisories tied to version 0.2.0 in major vulnerability databases (no direct vulnerabilities reported), but the package has had many subsequent releases — consider upgrading to a maintained release. [6][7]

Recommended action: if you run these versions in production, update to the latest stable releases (or at least the minimum patched releases noted above for requests) and rescans your dependencies with your SCA tool of choice.

Sources:

  1. Rapid7 / CVE-2024-35195 (requests fix notes).
  2. Snyk / requests package page (no direct vulnerabilities in 2.32.5).
  3. Safety DB / requests vulnerabilities & release notes.
  4. Snyk / PyGithub package page (no direct vulnerabilities listed).
  5. PyGithub changelog / releases.
  6. Snyk / @google/genai / google-genai entries (no direct vulnerabilities listed).
  7. PyPI google-genai project page.

Update minimum dependency versions to more recent releases.

While the specified versions are free from known unpatched vulnerabilities, the minimum versions for PyGithub and google-genai are significantly outdated:

  • requests>=2.32.5 is appropriate (latest stable as of January 2026)
  • PyGithub>=2.0.0 should be PyGithub>=2.8.1 (latest as of September 2025)
  • google-genai>=0.2.0 should be google-genai>=1.8.0 (latest as of March 2025)

Updating these minimums ensures the CI workflow uses current, well-maintained releases.

🤖 Prompt for AI Agents
In @.github/workflows/ci-failure-bot.yml around lines 29 - 32, Update the pip
install line in the "Install dependencies" step so the minimum package versions
are bumped: keep requests>=2.32.5 but change PyGithub>=2.0.0 to PyGithub>=2.8.1
and google-genai>=0.2.0 to google-genai>=1.8.0; edit the run block that contains
the pip install command to reflect these new version specifiers so the CI uses
the more recent minimum releases.


      - name: Run CI Failure Bot
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
          WORKFLOW_RUN_ID: ${{ github.event.workflow_run.id }}
          REPOSITORY: ${{ github.repository }}
          PR_NUMBER: ${{ github.event.workflow_run.pull_requests[0].number || '' }}
        run: |
          set -e  # Exit immediately if any command fails
          echo "Starting CI Failure Bot..."
          python .github/scripts/ci_failure_bot.py
          exit_code=$?
          if [ $exit_code -ne 0 ]; then
            echo "ERROR: CI Failure Bot failed with exit code $exit_code"
            echo "This indicates a critical issue with the bot logic that needs immediate attention"
            exit $exit_code
          fi
Comment on lines +42 to +50

⚠️ Potential issue | 🟡 Minor

Unreachable error handling code due to set -e.

With set -e on line 42, the shell will exit immediately if python .github/scripts/ci_failure_bot.py fails, so lines 45-50 will never execute. Either remove set -e to allow capturing the exit code, or remove the redundant error handling.

Suggested fix: Simplify by relying on set -e
       run: |
         set -e  # Exit immediately if any command fails
         echo "Starting CI Failure Bot..."
         python .github/scripts/ci_failure_bot.py
-        exit_code=$?
-        if [ $exit_code -ne 0 ]; then
-          echo "ERROR: CI Failure Bot failed with exit code $exit_code"
-          echo "This indicates a critical issue with the bot logic that needs immediate attention"
-          exit $exit_code
-        fi
         echo "CI Failure Bot completed successfully"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
set -e # Exit immediately if any command fails
echo "Starting CI Failure Bot..."
python .github/scripts/ci_failure_bot.py
exit_code=$?
if [ $exit_code -ne 0 ]; then
echo "ERROR: CI Failure Bot failed with exit code $exit_code"
echo "This indicates a critical issue with the bot logic that needs immediate attention"
exit $exit_code
fi
set -e # Exit immediately if any command fails
echo "Starting CI Failure Bot..."
python .github/scripts/ci_failure_bot.py
echo "CI Failure Bot completed successfully"
🤖 Prompt for AI Agents
In @.github/workflows/ci-failure-bot.yml around lines 42 - 50, The CI script
uses set -e which causes the shell to exit immediately if python
.github/scripts/ci_failure_bot.py fails, making the subsequent exit_code capture
and the if-block (the echo "ERROR: CI Failure Bot failed..." and exit
$exit_code) unreachable; fix by either removing set -e so you can run the python
command, capture its exit code in exit_code and keep the existing if-block, or
keep set -e and delete the redundant exit_code capture and if-block (or replace
it with a simple command chaining strategy to handle errors explicitly).

          echo "CI Failure Bot completed successfully"