Contributor

Copilot AI commented Jan 26, 2026

Fix PR Triage Agent Workflow Failure - Safe Outputs MCP Server Startup Timeout

Problem Summary

The PR Triage Agent workflow fails because the Safe Outputs MCP HTTP server does not become ready within the timeout period. The Node.js process starts but produces no log output and never responds on port 3001.

Failing workflow: https://github.com/githubnext/gh-aw/actions/runs/21368688445

Investigation Complete ✅

Key Findings:

  1. ✅ All required files are copied correctly to /opt/gh-aw/safeoutputs/
  2. ✅ Node.js process starts successfully (PID confirmed alive)
  3. ✅ Server works perfectly in local test environment
  4. ✅ Issue is environment-specific to GitHub Actions workflows
  5. ✅ Comprehensive debug logging now in place with proper logger

Changes Made

1. Debug Logging with Proper Logger

Implemented comprehensive debug logging using the createLogger() utility at critical initialization points:

  • Entry Point (safe-outputs-mcp-server.cjs): Uses createLogger("safe-outputs-entry") to track Node.js execution and module loading
  • HTTP Server (safe_outputs_mcp_server_http.cjs): Uses module logger and createLogger("safe-outputs-startup") to log each require() statement and function entry
  • Transport Layer (mcp_http_transport.cjs): Uses createLogger("MCPServer") and createLogger("MCPHTTPTransport") to track connect() and start() async operations
  • MCP Core (mcp_server_core.cjs): Uses module logger to confirm module initialization

All log messages include:

  • ISO timestamps: [2026-01-26T20:29:28.069Z]
  • Logger names: [safe-outputs-entry], [safeoutputs], [MCPServer], etc.
  • Consistent formatting across all modules
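
As a rough illustration of the log format described above, here is a minimal sketch of a createLogger()-style helper. It is not the repository's actual utility: the `formatLine` helper, the `debug` method name, and the injectable `write` parameter are assumptions made for this example, and the real createLogger() may differ in signature and output.

```javascript
// Hypothetical sketch of a createLogger()-style helper. The real utility in
// the repository may use a different signature and output format.

// Build one log line: "[ISO timestamp] [logger name] message".
function formatLine(name, message, now = new Date()) {
  return `[${now.toISOString()}] [${name}] ${message}`;
}

// Return a logger bound to a component name. The write function defaults to
// stderr and is injectable so the output can be captured in tests.
function createLogger(name, write = line => process.stderr.write(line + "\n")) {
  return {
    debug: message => write(formatLine(name, message)),
  };
}

// Usage: log before and after each require() so that the last line printed
// shows how far startup got before stalling.
const log = createLogger("safe-outputs-startup");
log.debug("Loading module: node:http");
const http = require("node:http");
log.debug("Loaded module: node:http");
```

Writing to stderr keeps diagnostic lines separate from anything the server writes to stdout, and bracketing each require() narrows a silent hang down to a single module load.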

2. Timeout Increased

  • Increased startup timeout from 10 seconds to 60 seconds in start_safe_outputs_server.sh
  • Updated all timeout messages to reflect 60-second limit
  • Updated progress messages to show attempt X/60 instead of attempt X/10
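
The actual change lives in the shell script start_safe_outputs_server.sh, which is not reproduced here. The sketch below only illustrates the same "poll until ready, give up after N attempts" idea in JavaScript. The one-second interval, the `waitForReady` and `httpHealthProbe` names, and the `/health` path are assumptions for illustration. Only port 3001 and the 60-attempt limit come from the description above.

```javascript
// Illustrative only: poll a readiness probe up to maxAttempts times, waiting
// delayMs between attempts. Resolves with the attempt number that succeeded.
async function waitForReady(probe, { maxAttempts = 60, delayMs = 1000, onAttempt = () => {} } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    onAttempt(attempt, maxAttempts); // e.g. print "attempt 3/60"
    if (await probe()) return attempt;
    if (attempt < maxAttempts) {
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Server not ready after ${maxAttempts} attempts`);
}

// Hypothetical HTTP probe. The /health path is an assumption; the global
// fetch() used here requires Node.js 18 or later.
function httpHealthProbe(url = "http://127.0.0.1:3001/health") {
  return async () => {
    try {
      const res = await fetch(url);
      return res.ok;
    } catch {
      return false; // connection refused: nothing is listening yet
    }
  };
}
```

A call such as `waitForReady(httpHealthProbe(), { onAttempt: (n, max) => console.error(`attempt ${n}/${max}`) })` would mirror the progress messages mentioned above. A longer limit only helps if the server is slow rather than hung, which is why the change pairs it with the added logging.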

3. Workflow Updates

  • Merged latest changes from main branch (twice - staying up to date)
  • Recompiled all workflow files with make recompile
  • Formatted code with make fmt
  • Linted code with make lint
  • All validations passing

Local Test Results ✅

Successfully tested with exact workflow file setup and new logger:

✅ Server starts in < 0.01 seconds
✅ All 33 modules load successfully
✅ HTTP server binds to port 3001
✅ Health endpoint responds correctly  
✅ Logging captured with >> redirection
✅ All async operations complete properly
✅ Logger provides consistent timestamps and formatting
✅ All log messages properly categorized by component

Next Steps

To Complete Fix:

  1. ⏳ Trigger test workflow run (PR Triage Agent or smoke-copilot)
  2. ⏳ Examine debug logs from workflow environment
  3. ⏳ Identify environment-specific difference
  4. ⏳ Apply targeted fix based on log comparison

Expected Debug Output:
Combined with the longer timeout, the logger-based debug output should show where execution stops or hangs in the workflow environment, which is the information a targeted fix needs.

Files Modified

  • actions/setup/js/safe-outputs-mcp-server.cjs - Entry point logging with createLogger
  • actions/setup/js/safe_outputs_mcp_server_http.cjs - Module loading and function call logging with createLogger
  • actions/setup/js/mcp_http_transport.cjs - Connect and start logging with createLogger
  • actions/setup/js/mcp_server_core.cjs - Module load logging with createLogger
  • actions/setup/sh/start_safe_outputs_server.sh - Increased timeout from 10s to 60s
  • Merged latest changes from main branch (multiple times)
  • Recompiled workflow lock files with updated action versions

Original prompt

This section details the original issue you should resolve

<issue_title>[agentics] PR Triage Agent failed</issue_title>
<issue_description>### Workflow Failure

Workflow: PR Triage Agent
Branch: main
Run URL: https://github.com/githubnext/gh-aw/actions/runs/21368688445

Action Required

Debug this workflow failure using the agentic-workflows agent:

/agent agentic-workflows

When prompted, instruct the agent to debug this workflow failure.

Generated from PR Triage Agent

  • expires on Feb 2, 2026, 6:13 PM UTC

Comments on the Issue (you are @copilot in this section)




Changeset

  • Type: patch
  • Description: Add diagnostic logging and expand the safe-outputs MCP server startup timeout to 60 seconds to reduce CI flakiness.

Ahoy! This treasure was crafted by 🏴‍☠️ Changeset Generator


Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@pelikhan
Contributor

@copilot increase timeout to 60 seconds

Copilot AI and others added 4 commits January 26, 2026 19:45
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Debug PR Triage Agent workflow failure Add diagnostic logging for safe-outputs MCP server startup failures Jan 26, 2026
Copilot AI requested a review from pelikhan January 26, 2026 20:00
@pelikhan
Contributor

@copilot revert all *.lock.yml files

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jan 26, 2026

@copilot increase timeout to 60 seconds

Increased timeout from 10 to 60 seconds in start_safe_outputs_server.sh. Commit: 34907fc

Copilot AI changed the title Add diagnostic logging for safe-outputs MCP server startup failures Add diagnostic logging and increase timeout for safe-outputs MCP server startup failures Jan 26, 2026
@pelikhan
Contributor

@copilot merge main, recompile, format, lint

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jan 26, 2026

@copilot revert all *.lock.yml files

Reverted all 139 *.lock.yml files to their original state. Commit: 73654be

@pelikhan pelikhan marked this pull request as ready for review January 26, 2026 20:14
@pelikhan pelikhan added the smoke label Jan 26, 2026
@github-actions
Contributor

github-actions bot commented Jan 26, 2026

🎬 THE END - Smoke Claude MISSION: ACCOMPLISHED! The hero saves the day! ✨

@github-actions
Contributor

github-actions bot commented Jan 26, 2026

✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟

@pelikhan
Contributor

@copilot create logger and use them instead of directly using process.stderr

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jan 26, 2026

@copilot create logger and use them instead of directly using process.stderr

Refactored all debug logging to use createLogger() instead of direct process.stderr.write() calls. All log messages now include ISO timestamps and consistent logger names. Commit: f71d27e

@github-actions
Contributor

github-actions bot commented Jan 26, 2026

✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟

@github-actions
Contributor

github-actions bot commented Jan 26, 2026

🎬 THE END - Smoke Claude MISSION: ACCOMPLISHED! The hero saves the day! ✨

@github-actions
Contributor

github-actions bot commented Jan 26, 2026

🎉 Yo ho ho! Changeset Generator found the treasure and completed successfully! ⚓💰

@github-actions
Contributor

📰 BREAKING: Smoke Copilot is now investigating this pull request. Sources say the story is developing...

@github-actions github-actions bot removed the smoke label Jan 26, 2026
@github-actions
Contributor

Tests: GitHub MCP ✅; gh CLI ✅; Serena ✅; Playwright ✅; Tavily ✅; File ✅; Bash ✅; Discussion ✅
Overall: PASS
Update workflows to actions/checkout v6
[instructions] Sync github-agentic-workflows.md with latest schema features
Add copilot-setup.md prompt for automated repository configuration
Update CLI versions: Copilot 0.0.395, GitHub MCP v0.30.1, Playwright 0.0.60, Gateway v0.0.80

AI generated by Smoke Codex

@github-actions
Contributor

Smoke Test Results

PRs Tested:

Status:

  • ✅ GitHub MCP
  • ✅ Safe Inputs GH CLI
  • ✅ Serena MCP
  • ✅ Playwright
  • ✅ File Writing
  • ✅ Bash Tool
  • ✅ Discussion Interaction

Overall: ✅ PASS

@pelikhan @Copilot

AI generated by Smoke Copilot

@github-actions
Contributor

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

@github-actions
Contributor

Smoke Test Results - Claude Engine

Status: ✅ PASS (9/9 tests)

Merged PRs Reviewed:

Tests:
✅ GitHub MCP
✅ Safe Inputs GH CLI
✅ Serena MCP
✅ Make Build
✅ Playwright
✅ Tavily Search
✅ File Writing
✅ Bash Tool
✅ Discussion Interaction

Run: §21373219714

AI generated by Smoke Claude

@pelikhan
Contributor

@copilot merge main and recompile

Copilot AI and others added 2 commits January 26, 2026 20:53
…ge-agent-failure-again

# Conflicts:
#	.github/workflows/go-pattern-detector.lock.yml
#	.github/workflows/super-linter.lock.yml
#	.github/workflows/unbloat-docs.lock.yml
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jan 26, 2026

@copilot merge main and recompile

Merged main and recompiled workflows. Resolved conflicts by accepting main branch changes. Commit: 32a004c


Projects

None yet

Development

Successfully merging this pull request may close these issues.

[agentics] PR Triage Agent failed

2 participants