Description
Nomad uses go-plugin to spin up various plugins and auxiliary processes, and we saw surprising (to us) behavior when the host process dies; see hashicorp/nomad#5598.
Nomad uses go-plugin to spin up long-running plugins whose lifecycle is independent of the host, to ease in-place upgrades and reconfiguration, and relies on the reattachment pattern (`ReattachConfig`) that go-plugin supports.
However, we observe the following problems after the host process is restarted:
- The plugin gets a `SIGPIPE` signal upon the next log/Stdout/Stderr write operation. When the host (e.g. go-plugin client) process dies, the Stdout/Stderr pipe closes, any write from the plugin fails with an `io.ErrClosedPipe` error, and the plugin receives `SIGPIPE`, typically killing it.
  - Note that if the plugin explicitly ignores `SIGPIPE`, hclog may panic on the log write failure in https://github.com/hashicorp/go-hclog/blob/6907afbebd2eef854f0be9194eb79b0ba75d7b29/intlogger.go#L370-L373
- On successful reattachment by a restarted host process, Stdout/Stderr syncing is not restored, and any log lines the plugin writes to Stdout/Stderr are lost.
Nomad works around this in hashicorp/nomad#5598 by giving the plugin a dedicated log file and not writing to the plugin's Stderr.
Ideally, go-plugin could handle host process restarts and reattachment better. One possibility might be using FIFO files so that the plugin can always write to them with some buffering, but this may require clever use of non-blocking flags (to ensure the plugin can proceed when the FIFO buffer is full).