state's status detection (based on PID lookups) is imprecise #871

@wking

A few days ago the runtime spec gained a status property in the state (opencontainers/runtime-spec#462). I was curious how that works with detached containers, since there is no host-side monitor around to notice the container die. Testing with 1d61abe (the current tip of #827, although I expect this applies to all recent runC code), it looks like the state call works around the lack of a monitor (which could waitid(2) or some such on the container process) by looking up the container PID in /proc dynamically on each call. That's racy though, since “has the same PID in /proc” doesn't mean “is the same process”. For example:

$ ./runc spec
$ JSON=$(jq 'del(.process.terminal)' config.json)
$ echo "${JSON}" >config.json
$ mkdir -p rootfs/bin
$ cp $(command -v busybox) rootfs/bin/sh
$ STATE=$(sudo ./runc state abc)
$ echo "${STATE}"
{
  "ociVersion": "0.6.0-dev",
  "id": "abc",
  "pid": 26373,
  "bundlePath": "/…/runc",
  "rootfsPath": "/…/runc/rootfs",
  "status": "created",
  "created": "2016-06-03T03:25:28.641226726Z"
}
$ PID=$(echo "${STATE}" | jq -r .pid)
$ sudo kill -9 "${PID}"
$ sudo ./runc state abc
{
  "ociVersion": "0.6.0-dev",
  "id": "abc",
  "pid": 26373,
  "bundlePath": "/…/runc",
  "rootfsPath": "/…/runc/rootfs",
  "status": "stopped",
  "created": "2016-06-03T03:25:28.641226726Z"
}

In a separate terminal, spawn processes until we match the old container PID:

$ while true; do sh -c 'if test $$ -eq 26373; then echo muahaha; sleep 60; fi'; done
muahaha
^C

And then runC thinks (wrongly) that the container is alive again:

$ sudo ./runc state abc
Password:
{
  "ociVersion": "0.6.0-dev",
  "id": "abc",
  "pid": 26373,
  "bundlePath": "/…/runc",
  "rootfsPath": "/…/runc/rootfs",
  "status": "running",
  "created": "2016-06-03T03:25:28.641226726Z"
}

One solution is to have the parent process (or an ancestor with PR_SET_CHILD_SUBREAPER after the parent dies) waitid(2) on the container process.
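
As a rough illustration of the parent-side approach, the sketch below uses the shell's wait builtin as a stand-in for waitid(2). The sleep process is a hypothetical stand-in for the container process, and the subreaper case is not covered. Because the kernel keeps an exited child's PID reserved until its parent reaps it, the parent cannot be fooled by PID reuse:

```shell
#!/bin/sh
# Hypothetical stand-in for a container process.
sleep 60 &
child=$!

# Kill it the same way the reproduction above does.
kill -9 "${child}"

# wait collects the exit status of this specific child. Until it does,
# the kernel holds the PID as a zombie, so no other process can take it.
wait "${child}"
status=$?

# For a child killed by a signal, the shell reports 128 + signal number.
echo "child ${child} exited with status ${status}"
```

A monitor built this way learns about the exit directly instead of inferring it from /proc, at the cost of having to stay alive for the container's lifetime.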

Another solution is to store more information about the container process (we probably already do) and compare that additional data when deciding whether the process currently running with the container-process PID is still the container process. For example, the PID and start time (field 22 in /proc/[pid]/stat) form a tuple that is almost certainly unique.
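
A minimal sketch of the PID plus start-time comparison, assuming Linux procfs. The function names proc_start_time and same_process are hypothetical and are not part of runC:

```shell
#!/bin/sh
# Print the start time (field 22 of /proc/[pid]/stat, in clock ticks since
# boot) for the given PID, or fail if no such process exists.
proc_start_time() {
  stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
  # Field 2 (comm) is parenthesized and may itself contain spaces or ')',
  # so drop everything through the last ') ' before splitting.
  rest=${stat##*\) }
  # $rest now begins at field 3, so field 22 is its 20th word.
  set -- $rest
  [ "$#" -ge 20 ] || return 1
  shift 19
  printf '%s\n' "$1"
}

# Succeed only if the PID exists and still has the recorded start time.
same_process() {
  current=$(proc_start_time "$1") || return 1
  [ "$current" = "$2" ]
}
```

The start time would be recorded once, when the container process is created, and checked on every later state call:

```shell
START=$(proc_start_time "${PID}")
same_process "${PID}" "${START}" && echo running || echo stopped
```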
