Fix MCP Inspector workflow - "Start MCP gateway" failure (80% failure rate) #11433

@github-actions

Problem

The MCP Inspector workflow failed in 8 of its 10 most recent runs (80% failure rate). Its last successful run was on 2026-01-05, 18 days ago.

Error Details

Failed step: Step 24 - "Start MCP gateway"
Last successful run: 2026-01-05
Recent failures:

Example run: §21148514645

Impact

  • MCP tooling inspection capabilities are offline
  • Cannot validate MCP server configurations
  • Affects workflow development and debugging

Root Cause Analysis

The failure occurs at the "Start MCP gateway" step, which suggests:

  1. Tavily MCP server configuration issue - The workflow uses the Tavily MCP server (as does Daily News, which recently recovered)
  2. MCP Gateway connectivity problem - Server may be unreachable or misconfigured
  3. Missing or invalid secrets - May need TAVILY_API_KEY or other credentials
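
Hypothesis 3 can be confirmed or ruled out quickly with a preflight check that runs before the gateway starts. The following is a minimal sketch, assuming the secret reaches the step as an environment variable named TAVILY_API_KEY (this issue does not confirm how the workflow passes it). `missing_secrets` is a hypothetical helper, not part of the workflow.

```python
import os

def missing_secrets(required, env=None):
    """Return the names in `required` that are unset or blank in `env`.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching real secrets.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_secrets(["TAVILY_API_KEY"])` in a step placed before "Start MCP gateway" and failing when the result is non-empty would turn a missing secret into an explicit error message instead of a gateway startup failure. It only detects an absent or blank value, not an invalid key.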

Related Context

The Daily News workflow had a similar TAVILY_API_KEY issue and recovered after the secret was added (2026-01-22). The Research workflow is also failing, likely with a similar root cause.

Recommended Investigation Steps

  1. Check Tavily configuration:

    • Verify TAVILY_API_KEY secret is accessible
    • Review .github/workflows/shared/mcp/tavily.md configuration
    • Compare with Daily News configuration (now working)
  2. Review MCP Gateway logs:

    • Download the artifacts from the failed run
    • Check /tmp/gh-aw/mcp-logs/ for error messages
    • Look for connection timeouts or authentication errors
  3. Test MCP Gateway startup:

    • Run workflow manually with increased logging
    • Check if issue is specific to scheduled runs or all triggers
    • Verify Docker container images are accessible
  4. Compare with working workflows:

    • Smoke Claude and Smoke Codex are healthy (90% success rate)
    • They don't use Tavily - check what MCP servers they use
    • Identify differences in configuration
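
For step 2, the downloaded logs can be searched for timeout and authentication errors with a small script. This is a sketch under stated assumptions: the logs are plain-text files under the directory passed in (for example, a local copy of /tmp/gh-aw/mcp-logs/), and the patterns in `ERROR_PATTERNS` are guesses at typical wording, not messages the gateway is known to emit. `scan_logs` is a hypothetical helper.

```python
import re
from pathlib import Path

# Hypothetical patterns; the real gateway messages may be worded differently.
ERROR_PATTERNS = re.compile(
    r"timed?[\s-]*out|unauthorized|forbidden|\b401\b|\b403\b"
    r"|connection refused|invalid api key",
    re.IGNORECASE,
)

def scan_logs(log_dir):
    """Return (file, line number, text) for each log line matching ERROR_PATTERNS."""
    hits = []
    for path in sorted(Path(log_dir).rglob("*")):
        if not path.is_file():
            continue
        # Replace undecodable bytes so one bad file cannot abort the scan.
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if ERROR_PATTERNS.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Running `scan_logs` over the artifacts of a failed run and of a successful run of a healthy workflow, then comparing the hits, is one way to separate gateway errors from routine noise.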

Success Criteria

  • MCP Inspector workflow runs successfully
  • "Start MCP gateway" step completes without errors
  • Success rate returns to >80% over the next 5 runs
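
The success-rate criterion is plain arithmetic, and a short sketch makes the threshold concrete. It assumes run outcomes are available as a list of booleans (True for a successful run); `success_rate` and `meets_target` are hypothetical helpers, and the strict "greater than" comparison is one reading of ">80%".

```python
def success_rate(outcomes):
    """Fraction of successful runs; `outcomes` is a list of booleans."""
    if not outcomes:
        raise ValueError("no runs to evaluate")
    return sum(outcomes) / len(outcomes)

def meets_target(outcomes, threshold=0.8):
    """True when the success rate is strictly greater than `threshold`."""
    return success_rate(outcomes) > threshold
```

Under this strict reading, a window of 5 runs meets the target only if all 5 succeed, because 4 of 5 is exactly 80%.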

Priority: P1 (High)

This workflow provides critical tooling inspection capabilities. Fix within 24 hours to restore MCP debugging functionality.

AI generated by Workflow Health Manager - Meta-Orchestrator

  • expires on Jan 24, 2026, 2:59 AM UTC
