Description
Version of Singularity:
3.7.x
Actual behavior
It has been reported that a netpoll error panic can occur when running containers that have a large environment on systems with a high load average. We've also had a report of what appears to be the same issue, triggered very infrequently (on the order of 1 in a million executions) with a slightly different backtrace (it likely varies with the version of Go used to build Singularity).
Go is a language with a runtime that includes a parallelized garbage collector. The netpoll failure on the fd is related to a GC memory profiling operation.
Late in the container setup process, Singularity must close any file descriptors that are not associated with the container or with the container setup process itself. The container environment is sourced by a shell interpreter embedded in the Go code. While the Go runtime could trigger a GC / memory profiling cycle at various times, a large environment makes a GC cycle / GC memory profiling operation more likely during this step.
There appears to be a conflict between Singularity identifying and closing file descriptors and Go runtime GC operations that may occur at the same time. The Go GC cycle isn't completely 'stop the world', which may explain why high load and a large environment are required to trigger the problem: both can delay completion of the starter execution path while parts of the GC work run in parallel.
The following change has been observed to work around this, by disabling Go runtime GC for the short-lived starter process that closes fds and performs other final container setup:
Allow the starter to keep variables that allow us to tweak the Go runtime GC behavior for the starter process:
diff --git a/cmd/starter/c/starter.c b/cmd/starter/c/starter.c
index a99de873b..678a32e7f 100644
--- a/cmd/starter/c/starter.c
+++ b/cmd/starter/c/starter.c
@@ -975,6 +975,7 @@ static void cleanup_fd(fdlist_t *master, struct starter *starter) {
if ( starter->fds[i] == fd ) {
found = true;
/* set force close on exec */
+ debugf("Setting FD_CLOEXEC on starter fd %d\n", starter->fds[i]);
if ( fcntl(starter->fds[i], F_SETFD, FD_CLOEXEC) < 0 ) {
debugf("Can't set FD_CLOEXEC on file descriptor %d: %s\n", starter->fds[i], strerror(errno));
}
@@ -1128,9 +1129,17 @@ static void cleanenv(void) {
/*
* keep only SINGULARITY_MESSAGELEVEL for GO runtime, set others to empty
* string and not NULL (see issue #3703 for why)
+ *
+ * DCT - also keep any GOGC and GODEBUG vars for go runtime
+ * debugging purposes.
*/
for (e = environ; *e != NULL; e++) {
- if ( strncmp(MSGLVL_ENV "=", *e, sizeof(MSGLVL_ENV)) != 0 ) {
+ if ( strncmp(MSGLVL_ENV "=", *e, sizeof(MSGLVL_ENV)) == 0 ||
+ strncmp("GOGC" "=", *e, sizeof("GOGC")) == 0 ||
+ strncmp("GODEBUG" "=", *e, sizeof("GODEBUG")) == 0 ) {
+ debugf("Keeping env var %s\n", *e);
+ } else {
+ debugf("Clearing env var %s\n", *e);
*e = "";
}
}
Disable garbage collection altogether for the starter by setting GOGC=off (this is the fix/workaround).
Note that the GODEBUG= line allows finer-grained debugging. In this example it turns off GC memory profiling (memprofilerate=0) and instructs Go to print a trace of any GC operations (gctrace=1). If GOGC=off is also present, the GODEBUG line shouldn't do anything.
diff --git a/internal/pkg/util/starter/starter.go b/internal/pkg/util/starter/starter.go
index e87abead3..92cca08cf 100644
--- a/internal/pkg/util/starter/starter.go
+++ b/internal/pkg/util/starter/starter.go
@@ -89,6 +89,10 @@ func Exec(name string, config *config.Common, ops ...CommandOp) error {
if err := c.init(config, ops...); err != nil {
return fmt.Errorf("while initializing starter command: %s", err)
}
+ sylog.Debugf("Setting GOGC=off for starter")
+ c.env = append(c.env, "GOGC=off")
+ sylog.Debugf("Setting GODEBUG=memprofilerate=0,gctrace=1 for starter")
+ c.env = append(c.env, "GODEBUG=memprofilerate=0,gctrace=1")
err := unix.Exec(c.path, []string{name}, c.env)
return fmt.Errorf("while executing %s: %s", c.path, err)
}