refactor(sshtunnel): ExecuteCommand to use options struct #452
base: main
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings and use the review commands to manage reviews.
📝 Walkthrough: Replace positional `sshtunnel.ExecuteCommand` parameters with a new `ExecuteCommandOptions` struct; refactor sshtunnel internals (pipes, goroutines, backoff, agent forwarding, log streaming); update call sites.
Changes
Sequence Diagram(s): mermaid diagram omitted.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing touches: 🧪 Generate unit tests (beta)
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@pkg/devcontainer/sshtunnel/sshtunnel.go`:
- Around line 89-92: waitForTunnelCompletion may return an error from
TunnelServerFunc while goroutines are still blocked, causing wg.Wait to deadlock
because tc.cleanup (which cancels context/closes pipes) is deferred until
ExecuteCommand returns; modify ExecuteCommand to cancel the tunnel context
and/or call tc.cleanup immediately after receiving an error from
waitForTunnelCompletion (before calling wg.Wait) and ensure you still drain
errChan if needed so the goroutines can return and wg.Wait can complete.
Specifically, update the error path around the call to waitForTunnelCompletion
(reference symbols: waitForTunnelCompletion, TunnelServerFunc, wg.Wait, errChan,
tc.cleanup, ExecuteCommand) to invoke tc.cancel() or tc.cleanup() (and drain
errChan) prior to waiting on the WaitGroup. A sketch follows this list.
- Around line 215-229: The goroutine uses p.opts.Ctx (the external context) when
calling AgentInject which ignores the internal cancellable context and can block
after siblings call cancel(); update executeSSHServerHelper to pass the
cancellable context instead. Add a cancellable context field to
sshServerHelperParams (e.g., cancelCtx or ctx), ensure the creator of
sshServerHelperParams sets it to the local cancelCtx produced alongside
cancel(), and call p.opts.AgentInject with that cancellable context (replace
p.opts.Ctx) so AgentInject sees internal cancellations; keep all other logic
(defer p.cancel(), error handling) unchanged.
- Around line 276-295: getSSHBackoff builds an exponential wait.Backoff using
ExecuteCommandOptions.SSHRetryInterval and SSHTimeout but omits a Cap, allowing
per-step sleeps to grow past the intended timeout and making total retry time
far larger; fix by setting wait.Backoff.Cap to a reasonable maximum (e.g., 5s or
configurable) so exponential growth is bounded, and adjust Steps calculation if
desired to reflect capped durations; update getSSHBackoff to populate the Cap
field on the returned wait.Backoff and document or make the cap configurable via
ExecuteCommandOptions if needed.
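For the first item above, a minimal sketch of the suggested shutdown ordering. The names (`cancel`, `errChan`, `wg`) mirror the symbols in the prompt, but the function itself is illustrative, not the actual `ExecuteCommand` code; imports are `context` and `sync`:

```go
// cancelAndWait shows one safe ordering: cancel first so goroutines can
// observe ctx.Done(), keep errChan drained so senders never block, and
// only then block on the WaitGroup.
func cancelAndWait(cancel context.CancelFunc, errChan chan error, wg *sync.WaitGroup) {
	cancel() // unblock goroutines selecting on ctx.Done()

	done := make(chan struct{})
	go func() {
		defer close(done)
		for range errChan { // drain so blocked senders can proceed
		}
	}()

	wg.Wait()      // all senders have returned
	close(errChan) // safe: no further sends possible
	<-done
}
```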
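For the third item, a sketch of `getSSHBackoff` with a populated `Cap`, using the real `k8s.io/apimachinery/pkg/util/wait` fields; the parameter sourcing from `ExecuteCommandOptions` is an assumption taken from the prompt. Note the caveat raised later in this review: the `wait` package zeroes `Steps` once a step hits `Cap`, so `Steps` should be sized with the cap in mind.

```go
import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// getSSHBackoff returns an exponential backoff whose individual sleeps
// are bounded by Cap. retryInterval and timeout would come from
// ExecuteCommandOptions (SSHRetryInterval / SSHTimeout).
func getSSHBackoff(retryInterval, timeout time.Duration) wait.Backoff {
	const maxStep = 5 * time.Second // could be made configurable
	return wait.Backoff{
		Duration: retryInterval,        // first sleep, e.g. 500ms
		Factor:   1.5,                  // exponential growth
		Jitter:   0.1,                  // de-synchronize retries
		Steps:    int(timeout/maxStep) + 1, // sized against the cap
		Cap:      maxStep,              // bound on any single sleep
	}
}
```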
🧹 Nitpick comments (1)
pkg/devcontainer/sshtunnel/sshtunnel.go (1)
153-192: `pipePair.Close` doesn't nil out fields after closing, allowing double-close on the same fd.
`cleanup()` is deferred in `ExecuteCommand`, but the same `pipePair` fds could also be closed by goroutines or error paths. Closing an `*os.File` twice is benign (returns an error, which is discarded) unless the fd number is reused by the OS between the two closes, which would silently close an unrelated fd. Setting fields to `nil` after close guards against this. Optional fix — nil-guard after close:
```diff
 func (p *pipePair) Close() {
 	if p.stdoutReader != nil {
 		_ = p.stdoutReader.Close()
+		p.stdoutReader = nil
 	}
 	if p.stdoutWriter != nil {
 		_ = p.stdoutWriter.Close()
+		p.stdoutWriter = nil
 	}
 	if p.stdinReader != nil {
 		_ = p.stdinReader.Close()
+		p.stdinReader = nil
 	}
 	if p.stdinWriter != nil {
 		_ = p.stdinWriter.Close()
+		p.stdinWriter = nil
 	}
 }
```
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@pkg/devcontainer/sshtunnel/sshtunnel.go`:
- Around line 246-275: The session returned from waitForSSHSession in
runSSHTunnel is configured for agent forwarding via setupSSHAgentForwarding but
then discarded because devssh.Run (used inside runCommandInSSHTunnel) creates a
new session; fix by ensuring the forwarding is requested on the actual session
used to run the command: modify runCommandInSSHTunnel to accept the prepared
sess (or alternatively change runCommandInSSHTunnel to call
setupSSHAgentForwarding on the session created inside it) and then call
devssh.Run with that same sess so the RequestAgentForwarding applied in
setupSSHAgentForwarding takes effect; reference functions: runSSHTunnel,
waitForSSHSession, setupSSHAgentForwarding, runCommandInSSHTunnel, and
devssh.Run.
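A sketch of the fix shape using `golang.org/x/crypto/ssh` and its `agent` package (which `devsshagent` wraps); the helper name and the idea of running the command on the pre-configured session are illustrative:

```go
import (
	"golang.org/x/crypto/ssh"
	"golang.org/x/crypto/ssh/agent"
)

// runWithAgentForwarding requests forwarding on the same session that
// executes the command, instead of configuring one session and running
// the command on another.
func runWithAgentForwarding(client *ssh.Client, identityAgent, command string) error {
	sess, err := client.NewSession()
	if err != nil {
		return err
	}
	defer sess.Close()

	// Register the agent channel handler on the client...
	if err := agent.ForwardToRemote(client, identityAgent); err != nil {
		return err
	}
	// ...and request forwarding on the session that will run the command.
	if err := agent.RequestAgentForwarding(sess); err != nil {
		return err
	}
	return sess.Run(command)
}
```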
🧹 Nitpick comments (2)
pkg/devcontainer/sshtunnel/sshtunnel.go (2)
162-179: `pipePair.Close` is not safe for concurrent calls. Setting fields to `nil` after close makes repeat calls safe within a single goroutine, but there's no synchronization. Currently this is fine because `cleanup()` is only called once after `wg.Wait()`, but it's fragile if the calling pattern changes. Optional: use sync.Once to make Close idempotent:
```diff
 type pipePair struct {
+	once         sync.Once
 	stdoutReader *os.File
 	stdoutWriter *os.File
 	stdinReader  *os.File
 	stdinWriter  *os.File
 }

 func (p *pipePair) Close() {
-	if p.stdoutReader != nil {
-		_ = p.stdoutReader.Close()
-		p.stdoutReader = nil
-	}
-	if p.stdoutWriter != nil {
-		_ = p.stdoutWriter.Close()
-		p.stdoutWriter = nil
-	}
-	if p.stdinReader != nil {
-		_ = p.stdinReader.Close()
-		p.stdinReader = nil
-	}
-	if p.stdinWriter != nil {
-		_ = p.stdinWriter.Close()
-		p.stdinWriter = nil
-	}
+	p.once.Do(func() {
+		_ = p.stdoutReader.Close()
+		_ = p.stdoutWriter.Close()
+		_ = p.stdinReader.Close()
+		_ = p.stdinWriter.Close()
+	})
 }
```
337-358: Agent forwarding failure is treated as fatal — is this intentional?
`setupSSHAgentForwarding` logs a warning (line 355) and returns the error (line 357), which causes `runSSHTunnel` to send to `errChan` and abort (lines 271-272). In many SSH workflows, agent forwarding is best-effort — it's useful when available but not critical for the tunnel itself. If this is intentional (e.g., required for downstream commands), a brief comment documenting the rationale would help. Otherwise, consider logging the warning and continuing. Optional: make agent forwarding best-effort:
```diff
 	if err != nil {
 		tc.opts.Log.Warnf("SSH agent forwarding failed: %v", err)
 	}
-	return err
+	return nil
 }
```
Force-pushed from dc9de0c to 110b248 (Compare)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@pkg/devcontainer/sshtunnel/sshtunnel.go`:
- Around line 246-274: The SSH agent forwarding is being requested on the
session returned by waitForSSHSession and setupSSHAgentForwarding, but
devssh.Run (used inside runCommandInSSHTunnel) creates its own session so the
forwarding is never applied; fix by ensuring the session that has
RequestAgentForwarding is the one used for command execution—either change
runCommandInSSHTunnel/devssh.Run to accept and use the pre-configured sess
returned by waitForSSHSession (and remove the unused deferred sess close there),
or modify devssh.Run to call RequestAgentForwarding on its internal session when
tc indicates agent forwarding is desired; update runSSHTunnel to pass the
pre-configured session (sess) into runCommandInSSHTunnel, or implement the
RequestAgentForwarding call inside devssh.Run accordingly.
🧹 Nitpick comments (3)
pkg/devcontainer/sshtunnel/sshtunnel.go (3)
155-198: `pipePair.Close` is not safe for concurrent use. The nil-check-then-close-then-nil-set pattern on lines 163–178 is not atomic. If `Close()` were called from two goroutines simultaneously, both could pass the nil check before either sets the pointer to nil, causing a double-close on the underlying `*os.File`. In the current code, `cleanup()` is only called once (deferred in `ExecuteCommand`), so this isn't actively exploitable. But given that `Close()` is exported on a public-ish type and explicitly tries to be idempotent, consider adding a `sync.Once` to make it truly safe.
337-358: Agent forwarding failure is treated as fatal — is this intentional? If `ForwardToRemote` or `RequestAgentForwarding` fails, the error propagates back to `runSSHTunnel` (lines 270–272), which sends it to `errChan` and returns — meaning the tunnel command never runs. Agent forwarding is typically a nice-to-have; killing the entire operation for it may be too aggressive, especially when the user has no SSH agent running. Currently, the function logs a warning (line 355) and returns the error. Consider returning `nil` after the warning to allow the command to proceed without agent forwarding. Proposed fix — downgrade to warning-only:
```diff
 	if err != nil {
 		tc.opts.Log.Warnf("SSH agent forwarding failed: %v", err)
 	}
-	return err
+	return nil
```
304-335: Context-canceled errors are sent to `errChan` AND returned — the caller must not double-send. Lines 325–326 send the error to `errChan` and return it. The caller `runSSHTunnel` (lines 262–263) checks `err != nil` and returns without sending to `errChan` again. This is correct, but the dual responsibility (callee sends to channel, caller checks return value) is a subtle contract. Consider documenting the invariant that `waitForSSHSession` sends its own error to `errChan` on failure, so future maintainers of `runSSHTunnel` don't accidentally double-send.
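A sketch of the suggested documentation, written as a doc comment on an assumed signature (the real parameter list may differ; the body is elided):

```go
// waitForSSHSession dials the tunnel with backoff until the context is
// cancelled or the retries are exhausted.
//
// Invariant: on failure, waitForSSHSession sends its own error to
// errChan. Callers (runSSHTunnel) must only inspect the returned error
// and must not forward it to errChan again, or it would be reported twice.
func waitForSSHSession(ctx context.Context, errChan chan<- error) (*ssh.Session, error) {
	// ... retry loop elided ...
	return nil, nil
}
```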
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@pkg/devcontainer/sshtunnel/sshtunnel.go`:
- Around line 334-355: In setupSSHAgentForwarding, always call
devsshagent.RequestAgentForwarding(sess) after attempting
devsshagent.ForwardToRemote(sshClient, identityAgent) instead of only when
ForwardToRemote fails; keep the existing Warnf logging for ForwardToRemote
errors and also log any error returned by RequestAgentForwarding, but do not
gate the call on the first call's success—this matches the pattern in
cmd/machine/ssh.go and ensures both the handler registration (ForwardToRemote)
and the session request (RequestAgentForwarding) are performed for tunnelContext
with ssh.Client and ssh.Session.
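A sketch of the suggested flow, shown with `golang.org/x/crypto/ssh/agent` directly (which `devsshagent` wraps); the helper name and logger type are illustrative:

```go
import (
	"github.com/sirupsen/logrus"
	"golang.org/x/crypto/ssh"
	"golang.org/x/crypto/ssh/agent"
)

// setupAgentForwarding attempts both steps unconditionally and logs
// failures independently, matching the cmd/machine/ssh.go pattern the
// prompt references.
func setupAgentForwarding(client *ssh.Client, sess *ssh.Session, identityAgent string, log *logrus.Logger) {
	// Channel handler registration on the client.
	if err := agent.ForwardToRemote(client, identityAgent); err != nil {
		log.Warnf("forward agent to remote (%s): %v", identityAgent, err)
	}
	// Session-level request, not gated on the call above.
	if err := agent.RequestAgentForwarding(sess); err != nil {
		log.Warnf("request agent forwarding: %v", err)
	}
}
```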
🧹 Nitpick comments (4)
pkg/ssh/helper.go (1)
179-193: Pre-existing goroutine leak in `setupContextCancellation`. The `exit` channel is created locally and only closed by the goroutine's own `defer`. The `case <-exit:` branch in the `select` can never fire because nothing external signals it. If the context is never cancelled (e.g., the command completes successfully before cancellation), this goroutine leaks until the process exits or `ctx` is eventually cancelled by a parent. The typical fix is to return `exit` (or a `done` channel) and close it after `sess.Run()` returns. Proposed fix:
```diff
-func setupContextCancellation(ctx context.Context, sess *ssh.Session) error {
+func setupContextCancellation(ctx context.Context, sess *ssh.Session) (func(), error) {
 	if err := ctx.Err(); err != nil {
-		return fmt.Errorf("context already cancelled: %w", err)
+		return nil, fmt.Errorf("context already cancelled: %w", err)
 	}
 	exit := make(chan struct{})
 	go func() {
 		defer close(exit)
 		select {
 		case <-ctx.Done():
 			_ = sess.Signal(ssh.SIGINT)
 		case <-exit:
 		}
 	}()
-	return nil
+	return func() { close(exit) }, nil
 }
```
Then in `Run`:
```diff
-	if err := setupContextCancellation(opts.Context, sess); err != nil {
+	cleanup, err := setupContextCancellation(opts.Context, sess)
+	if err != nil {
 		return err
 	}
+	defer cleanup()
```
Note: `close(exit)` would be called twice (once by the caller, once by the goroutine's defer). Use `sync.Once` or restructure to avoid the double-close panic. This is pre-existing, not introduced by this PR, so it can be deferred.
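A sketch of the restructured helper with the `sync.Once` guard mentioned above, so the caller's cleanup and the goroutine's own defer can both safely signal `exit`; imports are `context`, `fmt`, `sync`, and `golang.org/x/crypto/ssh`:

```go
func setupContextCancellation(ctx context.Context, sess *ssh.Session) (stop func(), err error) {
	if err := ctx.Err(); err != nil {
		return nil, fmt.Errorf("context already cancelled: %w", err)
	}

	exit := make(chan struct{})
	var once sync.Once
	closeExit := func() { once.Do(func() { close(exit) }) }

	go func() {
		defer closeExit()
		select {
		case <-ctx.Done():
			_ = sess.Signal(ssh.SIGINT) // interrupt the remote command
		case <-exit: // caller signalled normal completion
		}
	}()

	// The caller defers stop() after sess.Run() returns; calling it
	// twice is safe because of the sync.Once.
	return closeExit, nil
}
```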
pkg/devcontainer/sshtunnel/sshtunnel.go (3)
129-153: Goroutine errors are silently dropped when `TunnelServerFunc` fails. If `TunnelServerFunc` returns an error at line 131, `waitForTunnelCompletion` returns immediately without draining `errChan`. Goroutine errors are lost. While this doesn't cause a deadlock (`errChan` has capacity 2 and `tc.cancel()` at line 91 unblocks goroutines), it may complicate debugging since the root cause could be a goroutine failure that triggered the `TunnelServerFunc` failure. Consider draining `errChan` on the early-return path or logging whatever errors are available. Proposed improvement:
```diff
 result, err := tc.opts.TunnelServerFunc(tc.cancelCtx, tc.grpcPipes.stdinWriter, tc.grpcPipes.stdoutReader)
 if err != nil {
+	tc.opts.Log.Debugf("tunnel server failed, draining goroutine errors")
 	return nil, fmt.Errorf("start tunnel server: %w", err)
 }
```
Or more thoroughly, drain after cancel:
```go
// In ExecuteCommand, after tc.cancel() and wg.Wait():
close(tc.errChan) // safe after wg.Wait()
for e := range tc.errChan {
	if e != nil {
		opts.Log.Debugf("goroutine error (after tunnel server failure): %v", e)
	}
}
```
155-179: `pipePair.Close()` is not concurrency-safe. `Close()` reads and writes fields without synchronization. This is fine in the current code because `cleanup()` is only called after `wg.Wait()` ensures all goroutines have exited. However, since `tc.cancel()` is called before `wg.Wait()`, goroutines that are shutting down could theoretically race with a future refactor that calls `Close()` earlier. Consider adding a brief comment noting the thread-safety assumption, or adding a `sync.Once` guard.
279-299: Steps calculation is vastly over-provisioned, but the context timeout mitigates it. With defaults (`duration=500ms`, `timeout=60s`), `steps = max(10, 240) = 240`. With exponential growth capped at 5s, the worst-case theoretical total exceeds 20 minutes. In practice, the `context.WithTimeout` at line 306 bounds the actual retry window to 60s. The excess steps are harmless but could confuse future readers who expect `steps` to approximate the timeout. Consider computing steps based on the capped duration to better reflect the intended window:
```go
// After cap is reached, each step ≈ cap duration.
// Approximate: initial steps + (remaining time / cap)
steps := max(10, int(timeout/(5*time.Second))+10)
```
This is minor since the context provides the hard limit.
Force-pushed from c96ae48 to 5287377 (Compare)
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@pkg/devcontainer/sshtunnel/sshtunnel.go`:
- Around line 322-344: The setupSSHAgentForwarding function treats
devsshagent.ForwardToRemote errors as fatal and returns the error, which aborts
the tunnel; change this so ForwardToRemote failures are non-fatal: if
devsshagent.ForwardToRemote(sshClient, identityAgent) returns an error, call
tc.opts.Log.Warnf (include the error and socket) and skip agent-forward setup
(do not call devsshagent.RequestAgentForwarding), then continue and return nil
so the tunnel proceeds without agent forwarding; keep RequestAgentForwarding as
an error only when ForwardToRemote succeeded.
- Around line 252-260: After waitForSSHSession returns a sess, ensure the
session is always closed on all paths: either add a defer sess.Close()
immediately after waitForSSHSession(...) to guarantee cleanup and then remove
the duplicate close inside runCommandInSSHTunnel, or explicitly call
sess.Close() in the error path when setupSSHAgentForwarding(...) fails before
returning; reference the functions waitForSSHSession, setupSSHAgentForwarding,
runCommandInSSHTunnel and the sess.Close() call to locate and update the code
accordingly. A sketch of the defer-based option follows this list.
- Around line 160-171: The Close method's helper uses a value parameter so f =
nil is a no-op and the pipePair fields (stdoutReader, stdoutWriter, stdinReader,
stdinWriter) remain non-nil, making Close non-idempotent; change Close to close
each field only if non-nil and then set the struct field to nil directly (e.g.,
check p.stdoutReader != nil, call Close(), then assign p.stdoutReader = nil) for
all four fields so repeated Close() calls are safe and avoid double-closing file
descriptors.
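For the second item above, a sketch of the defer-based option; the surrounding function and the helper signatures are stand-ins for the PR's real symbols:

```go
func runTunnel(ctx context.Context, sshClient *ssh.Client, errChan chan<- error) error {
	sess, err := waitForSSHSession(ctx, errChan) // assumed signature
	if err != nil {
		return err
	}
	// One close, covering every path below, including the agent
	// forwarding error return. The duplicate close inside
	// runCommandInSSHTunnel would then be removed.
	defer sess.Close()

	if err := setupSSHAgentForwarding(sshClient, sess); err != nil {
		return err // sess is still closed by the defer above
	}
	return runCommandInSSHTunnel(ctx, sess)
}
```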
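For the third item, a sketch of the nil-guarded `Close` operating on the struct fields directly; a value parameter, as in the flagged helper, would make `f = nil` a no-op on a copy. The struct definition is assumed from the field names in the prompt:

```go
import "os"

type pipePair struct {
	stdoutReader, stdoutWriter *os.File
	stdinReader, stdinWriter   *os.File
}

// Close is idempotent: each field is closed at most once because it is
// set to nil through the receiver, not through a copied parameter.
func (p *pipePair) Close() {
	if p.stdoutReader != nil {
		_ = p.stdoutReader.Close()
		p.stdoutReader = nil
	}
	if p.stdoutWriter != nil {
		_ = p.stdoutWriter.Close()
		p.stdoutWriter = nil
	}
	if p.stdinReader != nil {
		_ = p.stdinReader.Close()
		p.stdinReader = nil
	}
	if p.stdinWriter != nil {
		_ = p.stdinWriter.Close()
		p.stdinWriter = nil
	}
}
```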
🧹 Nitpick comments (2)
pkg/devcontainer/sshtunnel/sshtunnel.go (2)
355-363: `devssh.Run` also receives the parent context instead of `cancelCtx`. Line 357 passes `tc.opts.Ctx`. If `tc.cancel()` fires (e.g., a sibling failure), the SSH session won't receive `SIGINT` via `setupContextCancellation` because it watches the wrong context. It will only terminate when pipes are closed, which may delay cleanup. For consistency with the tunnel's internal cancellation, use `tc.cancelCtx`. Proposed fix:
```diff
 	if err := devssh.Run(devssh.RunOptions{
-		Context: tc.opts.Ctx,
+		Context: tc.cancelCtx,
 		Client:  sshClient,
 		Command: tc.opts.Command,
```
270-290: Steps calculation still assumes constant-duration retries, leading to a ~20-minute effective timeout. With `duration=500ms`, `timeout=60s`, `Factor=1.5`, `Cap=5s`: steps = `max(10, 240) = 240`. Most steps hit the 5s cap, so total retry time ≈ 240 × 5s = 1200s (20 minutes), far exceeding the intended 60s timeout. The `cancelCtx` is created with `WithCancel` (not `WithTimeout`), so there's no hard deadline. Consider either adding a `context.WithTimeout` around the backoff call or capping `steps` to `max(10, int(timeout/cap))`. Option: fix steps calculation for capped backoff:
```diff
-	steps := max(10, int(timeout/duration*2))
+	steps := max(10, int(timeout/(5*time.Second))+1) // align with Cap
```
Or wrap with a timeout:
```go
timeoutCtx, timeoutCancel := context.WithTimeout(tc.cancelCtx, getSSHTimeout(tc.opts))
defer timeoutCancel()
// use timeoutCtx in ExponentialBackoffWithContext
```
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@pkg/devcontainer/sshtunnel/sshtunnel.go`:
- Around line 142-164: If TunnelServerFunc returns an error in ExecuteCommand,
we must still wait for the errgroup goroutines to exit to avoid leaking
resources and races on t.sshClient/t.sshSession. After calling
opts.TunnelServerFunc(...) and seeing a non-nil err, call g.Wait() (using gctx)
before returning the error so runSSHServerHelper and runSSHTunnel have exited;
keep the existing deferred t.Close() to cancel the context and close pipes, but
ensure g.Wait() is awaited even on TunnelServerFunc failure. Reference symbols:
ExecuteCommand, TunnelServerFunc, g (errgroup), gctx, runSSHServerHelper,
runSSHTunnel, and t.Close. A sketch follows this list.
- Around line 261-288: In waitForSSHSession, the Backoff config (Duration:500ms,
Factor:1.5, Cap:5s, Steps:20) triggers k8s wait behavior that zeroes Steps once
the increment hits Cap, causing the loop to stop after ~6 attempts; fix by
either removing Cap so the backoff can grow through all 20 Steps or by
recalculating Steps to reflect the capped growth (e.g., set Steps to the number
of retries you actually want given the 5s Cap), and update the backoff variable
used by ExponentialBackoffWithContext accordingly (adjust the backoff literal in
waitForSSHSession).
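For the first item (see the forward reference above), a sketch of the suggested body shape for `ExecuteCommand` using `golang.org/x/sync/errgroup`; the tunnel type `t`, its fields, and the helper signatures are stand-ins for the PR's symbols:

```go
g, gctx := errgroup.WithContext(ctx)
g.Go(func() error { return runSSHServerHelper(gctx, t) })
g.Go(func() error { return runSSHTunnel(gctx, t) })

result, err := opts.TunnelServerFunc(gctx, t.stdinWriter, t.stdoutReader)
if err != nil {
	t.Close()    // cancel the context and close pipes so goroutines unblock
	_ = g.Wait() // await them even on failure: no leaked goroutines,
	             // no races on t.sshClient / t.sshSession
	return nil, fmt.Errorf("start tunnel server: %w", err)
}
if err := g.Wait(); err != nil {
	return nil, err
}
return result, nil
```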
🧹 Nitpick comments (2)
pkg/ssh/helper.go (1)
124-134: `validate()` requires `Client` even when a pre-created `Session` is supplied. When `opts.Session != nil`, `opts.Client` is never used inside `Run` (it's only used at line 150 for `NewSession`). A caller that already has a session would still need to pass a `Client` to satisfy validation. This is not a bug since current callers provide both, but it tightens the API unnecessarily. Proposed fix:
```diff
 func (opts *RunOptions) validate() error {
 	if opts.Context == nil {
 		return fmt.Errorf("context is required")
 	}
-	if opts.Client == nil {
+	if opts.Client == nil && opts.Session == nil {
 		return fmt.Errorf("SSH client is required")
 	}
 	if opts.Command == "" {
 		return fmt.Errorf("command is required")
 	}
 	return nil
 }
```
Also applies to: 142-158
pkg/devcontainer/sshtunnel/sshtunnel.go (1)
419-428: `extractLogLevel` won't match `"warning"` — only `"warn"` is in the map. Some log frameworks emit `"warning"` instead of `"warn"`. If that applies to the remote output, those lines will silently fall back to debug level. Purely cosmetic since the fallback is safe. Optional: add a "warning" alias:
```diff
 var logLevelMap = map[string]logrus.Level{
 	"debug": logrus.DebugLevel,
 	"info":  logrus.InfoLevel,
 	"warn":  logrus.WarnLevel,
+	"warning": logrus.WarnLevel,
 	"error": logrus.ErrorLevel,
 	"fatal": logrus.FatalLevel,
 }
```
Force-pushed from ed8a62e to 4e59504 (Compare)
Summary by CodeRabbit
Chores
Refactor