Build cancellation leaves orphaned QEMU processes: no timeout on state polling loop #6786

@kaovilai

Problem

When a podman build --platform <foreign-arch> is cancelled (Ctrl+C / SIGINT), the QEMU user-mode emulation processes (qemu-*-static) spawned during RUN steps are left running as orphans. The top of the surviving process tree is reparented to PID 1, and the processes persist until they exit on their own or are killed manually.

This is caused by buildah's signal handling in run_common.go, which has no timeout on its container state polling loop and no cgroup-level kill fallback.

Reproduction

Automated repro with GitHub Actions: https://github.com/kaovilai/qemu-build-hang-repro

The cleanup-gap.yml workflow demonstrates the issue:

  1. Starts podman build --platform linux/arm64 with a Containerfile that spawns long-running processes under QEMU emulation
  2. Sends SIGINT → SIGTERM → SIGKILL to the podman build process
  3. Checks for surviving processes afterward

Result: 5 surviving qemu-aarch64-static processes. The emulated shell has PPid=1 (reparented to init), and its four sleep children remain attached to it.

Example output from CI:

PID 2670: /usr/bin/qemu-aarch64-static /bin/sh -c ...   PPid: 1  wchan: sigsuspend
PID 2683: /usr/bin/qemu-aarch64-static /bin/sleep 300   PPid: 2670  wchan: hrtimer_nanosleep
PID 2685: /usr/bin/qemu-aarch64-static /bin/sleep 300   PPid: 2670  wchan: hrtimer_nanosleep
PID 2687: /usr/bin/qemu-aarch64-static /bin/sleep 300   PPid: 2670  wchan: hrtimer_nanosleep
PID 2689: /usr/bin/qemu-aarch64-static /bin/sleep 300   PPid: 2670  wchan: hrtimer_nanosleep
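
The survivor check in step 3 can be sketched roughly as follows. This is not the code from the cleanup-gap.yml workflow. It is a minimal illustration that assumes a Linux /proc filesystem, and the helper names parsePPid and isQEMUStatic are made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parsePPid extracts the parent PID from the text of /proc/<pid>/status.
func parsePPid(status string) (int, bool) {
	for _, line := range strings.Split(status, "\n") {
		if strings.HasPrefix(line, "PPid:") {
			ppid, err := strconv.Atoi(strings.TrimSpace(strings.TrimPrefix(line, "PPid:")))
			return ppid, err == nil
		}
	}
	return 0, false
}

// isQEMUStatic reports whether argv[0] of a NUL-separated /proc/<pid>/cmdline
// is a qemu-*-static binary.
func isQEMUStatic(cmdline string) bool {
	argv0 := strings.SplitN(cmdline, "\x00", 2)[0]
	base := filepath.Base(argv0)
	return strings.HasPrefix(base, "qemu-") && strings.HasSuffix(base, "-static")
}

func main() {
	dirs, _ := filepath.Glob("/proc/[0-9]*")
	for _, dir := range dirs {
		cmdline, err := os.ReadFile(filepath.Join(dir, "cmdline"))
		if err != nil || !isQEMUStatic(string(cmdline)) {
			continue // process exited, unreadable, or not QEMU
		}
		status, err := os.ReadFile(filepath.Join(dir, "status"))
		if err != nil {
			continue
		}
		if ppid, ok := parsePPid(string(status)); ok {
			fmt.Printf("PID %s: PPid %d\n", filepath.Base(dir), ppid)
		}
	}
}
```

A PPid of 1 in the output marks a process that has been reparented to init.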

Root Cause

Signal handler sends SIGKILL but never times out (run_common.go:656-705)

When SIGINT/SIGTERM is received, the signal handler sends SIGKILL to the container via the OCI runtime, then polls container state every 100ms with no deadline:

// line 659-664: send SIGKILL on any signal
go func() {
    for range interrupted {
        if err := kill("SIGKILL").Run(); err != nil {
            logrus.Errorf("%v sending SIGKILL", err)
        }
    }
}()

// line 665-705: poll state forever
for {
    select {
    case <-time.After(100 * time.Millisecond):
        stat := exec.Command(runtime, append(options.Args, "state", containerName)...)
        // checks if StateStopped — but never times out
    }
}

If the container's processes do not exit after SIGKILL (for example, QEMU stuck in uninterruptible sleep, D state, as in QEMU #2738), this loop runs forever.

Parent process has no timeout either (run_common.go:1236-1244)

The parent process forwards signals to the child subprocess but blocks indefinitely on cmd.Wait() (~line 1297):

go func() {
    for receivedSignal := range interrupted {
        if err := cmd.Process.Signal(receivedSignal); err != nil {
            logrus.Infof("%v while attempting to forward %v to child process", err, receivedSignal)
        }
    }
}()

No cgroup-level cleanup

Even when individual processes don't respond to SIGKILL, the container's cgroup could be used to force-kill all processes. This fallback doesn't exist.

Impact

  • On Linux: orphaned QEMU processes consume resources until manually killed
  • On macOS (podman machine): orphans persist inside the VM with no host-side visibility; the user must run podman machine ssh to discover and kill them
  • In CI/CD: orphans can accumulate across builds, consuming runner resources

Suggested Fix

Add a timeout to the state polling loop, with a cgroup kill as fallback:

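// Start the deadline only once SIGKILL has been sent. If this loop also waits
// for normal completion, an earlier deadline would cut off long RUN steps.
deadline := time.After(30 * time.Second)
for {
    select {
    case <-deadline:
        logrus.Warnf("container %s did not stop after SIGKILL, force-killing cgroup", containerName)
        // Force-kill via cgroup as last resort.
        // cgroupKill is a proposed helper; it does not exist yet.
        cgroupKill(containerName)
        return
    case <-time.After(100 * time.Millisecond):
        // existing state check...
    }
}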

Workaround

Manual cleanup function that SSHes into podman machine and kills stuck QEMU processes:
https://github.com/kaovilai/dotfiles/blob/main/zsh/functions/podman-utils.zsh#L63

Note

Responses generated with Claude
