
Copilot AI commented Jan 26, 2026

The gateway would hang indefinitely during initialization if any backend MCP server failed to respond, which blocked all other backends from starting.

Changes

Configuration: Added startupTimeout field (default: 60s) to gateway config with package-level constants (DefaultStartupTimeout, DefaultToolTimeout, DefaultPort).

Timeout enforcement: Implemented goroutine + buffered channel pattern in GetOrLaunch() and GetOrLaunchForSession():

// Launch connection in goroutine with buffered channel to prevent leaks
resultChan := make(chan connectionResult, 1)
go func() {
    conn, err := mcp.NewConnection(l.ctx, cmd, args, env)
    resultChan <- connectionResult{conn, err}
}()

// Timeout via select instead of context.WithTimeout to preserve connection after startup
select {
case result := <-resultChan:
    return result.conn, result.err
case <-time.After(l.startupTimeout):
    return nil, fmt.Errorf("server startup timeout after %v", l.startupTimeout)
}

Graceful degradation: Failed backends log warnings, and the gateway continues initialization with the remaining backends.

Technical decisions

  • Goroutine + time.After() instead of context.WithTimeout to avoid canceling connection after successful startup
  • Buffered channel (size 1) allows goroutine to complete without blocking if timeout occurs first
  • Enhanced error messages with actionable suggestions to increase timeout if needed

Warning

I tried to connect to the following addresses, but was blocked by firewall rules:

  • nonexistent.local
    • Triggering command: /tmp/go-build3377049250/b269/launcher.test /tmp/go-build3377049250/b269/launcher.test -test.testlogfile=/tmp/go-build3377049250/b269/testlog.txt -test.paniconexit0 -test.timeout=10m0s -test.v=true go user.email t[bot] (dns block)
    • Triggering command: /tmp/go-build481144272/b273/launcher.test /tmp/go-build481144272/b273/launcher.test -test.testlogfile=/tmp/go-build481144272/b273/testlog.txt -test.paniconexit0 -test.timeout=10m0s -test.v=true dleware.test /tmp/go-build3677275879/b097/ ortcfg.link . --gdwarf2 --64 GL0vaMStIAuGbawbr5/sWo5QoaFO32xMT3YHtp6/k1XMsJiotFONAaFTgZJc (dns block)
    • Triggering command: /tmp/go-build3502506537/b273/launcher.test /tmp/go-build3502506537/b273/launcher.test -test.testlogfile=/tmp/go-build3502506537/b273/testlog.txt -test.paniconexit0 -test.timeout=10m0s -test.v=true -lang=go1.25 docker-buildx x_amd64/vet se 7049250/b028/vetls-files x_amd64/cgo x_amd64/vet 791 6791 791 aw-mcpg/awmg se 7049250/b141/vet-c x_amd64/vet aw-mcpg/awmg (dns block)
  • this-host-does-not-exist-12345.com
    • Triggering command: /tmp/go-build3377049250/b278/mcp.test /tmp/go-build3377049250/b278/mcp.test -test.testlogfile=/tmp/go-build3377049250/b278/testlog.txt -test.paniconexit0 -test.timeout=10m0s -test.v=true g_.a --global rgo/bin/bash credential.helpe/opt/hostedtoolcache/go/1.25.6/x64/pkg/tool/linux_amd64/vet (dns block)


Original prompt

This section details the original issue you should resolve.

<issue_title>gateway need to handle dangling mcp servers</issue_title>
<issue_description>See errors: githubnext/gh-aw#11814</issue_description>

Comments on the Issue (you are @copilot in this section)

@pelikhan https://github.com/githubnext/gh-aw/pull/11817


Copilot AI and others added 7 commits January 26, 2026 14:53
Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Use goroutine with timeout channel instead of context.WithTimeout to prevent
connection from being canceled after successful startup.

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Eliminates code duplication by defining connectionResult once at the package
level instead of duplicating it in both GetOrLaunch() and GetOrLaunchForSession().
The buffered channel with size 1 allows the goroutine to complete even if
timeout occurs first, preventing goroutine leaks.

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Replace magic numbers with package-level constants (DefaultStartupTimeout,
DefaultToolTimeout, DefaultPort) to maintain consistency and avoid duplication.

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix gateway to handle dangling mcp servers" to "Add startup timeout to prevent gateway hanging on unresponsive MCP servers" Jan 26, 2026
Copilot AI requested a review from lpcox January 26, 2026 15:13
@lpcox lpcox marked this pull request as ready for review January 26, 2026 15:39
@lpcox lpcox merged commit 25f602d into main Jan 26, 2026
3 checks passed
@lpcox lpcox deleted the copilot/fix-dangling-mcp-servers branch January 26, 2026 15:39