-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
subprocess.Popen.send_signal() should poll the process #82811
Comments
subprocess.Popen.send_signal() doesn't check if the process exited since the poll() method has been called for the last time. If the process exit just before os.kill() is called, the signal can be sent to the wrong process if the process identified is recycled. Attached PR simply calls poll() once to reduce the time window when this race condition can occur, but it doesn't fully kill the race condition. -- See also the new "pidfd" API which only landed very recently in the Linux kernel to prevent this issue:
Illumos, OpenBSD, NetBSD and FreeBSD have similar concepts. I don't propose to use pidfd here, but it's just to highlight that it's a real issue and that kernels are evolving to provide more reliable solutions against the kill(pid, sig) race condition ;-) |
I'm not sure under which conditions this case happens. I understood that the pid is released (and so can be reused) as soon as the child process completes *and* waitpid() has been called. Two cases:
I'm mostly concerned by the multithreading case. I noticed the race condition while working on regrtest which uses 2 threads. One thread which is responsible for the process (call .communicate(), call .wait(), etc.), and one "main" thread which sends a signal to every child processes to interrupt them. My first implementation called .wait() after calling .send_signal() in the main thread. But it seems like Popen is not "fully" thread safe: the "waitpid lock" only protects the os.waitpid() call, it doesn't protect the self.returncode attribute. I modified regrtest to only call .waitpid() in the thread responsible of the process, to avoid such race condition. It seems like the new design is more reliable. The main thread only calls .send_signal(): it doesn't call .wait() (which calls os.waitpid()) anymore. |
-1 about the PR solution to suppress ProcessLookupError in case the process is gone. In psutil I solved the “pid reused problem” by using process creation time to identify a process uniquely (on start). |
Did someone propose a pull request to fix this issue by ignoring ProcessLookupError?
Well, that requires to write platform specific code. I would prefer to only use platform specific code when the platform provides something like pidfd: I understand that Linux, FreeBSD, OpenBSD, NetBSD, and Illumos are concerned. I mean reuse an exisiting solution, rather than writing our own "workaround" (solution?). |
By the way, I suggest to stick to the initial problem in this issue. If someone is intersted to explore the pidfd idea, I recommend to open a different issue ;-) |
I misread your PR, sorry. I thought that was the effect. |
It sounds like there's actually nothing to do here? (Except maybe eventually switch to pidfd or similar, but Victor says he wants to use a different issue for that.) Can this be closed? |
I opened this issue to propose PR 16984. Did you notice the PR? In short, I propose to call poll() before calling os.kill() in send_signal(). |
But then deeper in the thread it sounded like you concluded that this didn't actually help anything, which is what I would expect :-). |
The PR contains a test which demonstrate one race condition. It fails without the fix. |
pidfd is not available on all platforms (ex: Windows). |
You can't solve a time-of-check-to-time-of-use race by adding another check. I guess your patch might narrow the race window slightly, but it was a tiny window before and it's a tiny window after, so I don't really get it. The test doesn't demonstrate a race condition. What it demonstrates is increased robustness against user code that corrupts Popen's internal state by reaping one of its processes behind its back. That kind of robustness might be a good motivation on its own! (I actually just did a bunch of work to make trio more robust against user code closing fds behind its back, which is a similar kind of thing.) But it's a very different motivation than what you say in this bug. |
Hmm, you know, on further thought, it is actually possible to close this race for real, without pidfd. What you do is split 'wait' into two parts: first it waits for me process to become reapable without actually reaping it. On Linux you can do this with waitid+WNOWAIT. On kqueue platforms you can do it with kqueue. Then, you take a lock, and use it to atomically call waitpid/waitid to reap the process + update self.returncode to record that you've done so. In send_signal, we use the same lock to atomically check whether the process has been reaped, and then send the signal if it hasn't. That doesn't fix your test, but it does fix the race condition: as long as users only interact with the process through the Popen API, it guarantees that send_signal will never rather the wrong process. |
To further elaborate on the creation time solution, the idea in pseudo-code is the following: class Popen:
def __init__(self, ...):
self._execute_child(...)
try:
self._ctime = get_create_time(self.pid)
except ProcessLookupError:
self._ctime = None
def send_signal(self, sig):
if self._ctime is not None:
if self._ctime != get_create_time(self.pid):
raise ProecssLookupError("PID has been reused")
os.kill(self.pid, sig) Technically there is still a race condition between _execute_child() and get_create_time(), but:
~$ for i in $(seq 20); do ps; done | grep ps As for Windows, the termination is based on the process handle instead of the PID, so I assume that means Windows is safe in this regard. There was a prior discussion about this actually: |
Even if the approach seems to be different, PR 16984 seems to have the same advantages and drawbacks. PR 16984 rely on the fact that the Linux kernel doesn't use a pid until its parent calls os.waitpid(). Even if the child process completed, the pid remains assigned (and the process status is "zombie"). PR 16984 has the advantage that it reuses existing code which has been battle tested: we are sure that the code is portable and works well. Whereas get_create_time() will be likely platform specific and will add a little maintenance burden. I know that PR 16984 is a partial solution (there is still a race condition if waitpid() is called directly, without using the Popen API), but I don't see a strong reason to prefer get_create_time(). |
It's a good idea, but it would break the scenario where two threads call wait() concurrently. It would create this race condition:
What is needed here is a reader-writer lock. subprocess.wait would work like this (pseudocode): with lock taken for reading:
os.waitid(..., WNOHANG)
with lock taken for writing:
os.waitid(...)
self.returncode = ... Whereas subprocess.send_signal would work like this: with lock taken for reading:
os.kill(...) Unfortunately, it doesn't seem like Python has reader-writer locks in the library... |
I'm pretty sure you mean WNOWAIT, right? The actual reaping step (which might use WNOHANG) is already protected by a lock, but there's a danger that a process might try to perform the blocking-until-exit step (which uses WNOWAIT) on a stale pid. Good catch! I think this can be fixed by using a second lock around all of .wait(). But this is a bit tricky because we also need to account for concurrent .poll() calls. So something like: def wait(self):
# Don't take the big lock unless the process is actually still running
# This isn't just an optimization; it's critical for poll() correctness
if self.poll() is not None:
return self.returncode
with self._blocking_wait_lock:
with self._returncode_lock:
revalidate pid licenses
block_until_child_exits()
with self._returncode_lock:
reap the child and mark it as reaped
def poll(self):
# if someone is already blocked waiting for the child, then it definitely
# hasn't exited yet, so we don't need to call waitpid ourselves.
# This isn't just an optimization; it's critical for correctness.
if not self._blocking_wait_lock.acquire(blocking=False):
return None
try:
with self._returncode_lock:
do the actual poll + returncode updating
finally:
self._blocking_wait_lock.release() Notes: If there's already a wait() running, and someone calls poll(), then we have to make sure the poll() can't reap the process out from under the wait(). To fix that, poll skips trying in case wait is already running. But, this could introduce its own race condition: if a process has exited but we haven't noticed yet, and then someone calls wait() and poll() simultaneously, then the wait() might take the lock, then poll() notices the lock is taken, and concludes that the process can't have exited yet. If course wait() will immediately reap the process and drop the lock, but by this point poll() has already returned the wrong information. The problem is that poll() needs to be able to treat "the blocking wait lock is taken" as implying "the process hasn't exited yet". So to make that implication true, we add a check at the top of wait(). Of course if a process exits while wait() is running, and someone calls poll() in that brief interval between it exiting and wait() noticing, then poll() could again fail to report it exiting. But I think this is fine: it's ok if poll() is a fraction of a second out of date; that race condition is inherent in its API. The problem would be if the fails to notice a process that exited a while ago and is just sitting around waiting to be reaped. ... Ugh but there is still one more race condition here. I think this fixes all the cases involving send_signal, wait, and poll interactions with each other, BUT we broke the case of two threads calling poll() at the same time. One thread will take the blocking_wait_lock, then the other will take this as evidence that someone is blocked in wait() and exit early, which isn't appropriate. I think we can patch up this last race condition by adding yet one more lock: a simple with self._poll_lock: around the whole body of poll(), so that only one thread can call it at a time. |
Right.
What does this mean?
Wouldn't it be sufficient to add if self.returncode is not None:
return self.returncode at the top of poll()? (And in fact, you don't even need to add it, since an equivalent of this check is already there.) I think this makes calling poll() from wait() unnecessary too. |
It means autocorrect mangled the text... I don't remember what word that's supposed to be, but basically I meant "revalidate the pid hasn't been reaped"
No, the race condition is:
So adding a check at the top (= step 3) doesn't help. The problem is in steps 4/5. |
While there was no strong agreement on my change, Nathaniel and Giampaolo said that it should not harm to merge it :-) So I merged my change. It has been said multiple times: my change doesn't fully close the race condition. It only reduces the risk that it happens. I rephrased my change to mention that there is still room for the race condition and that polling only *reduces* the risk, it doesn't fully fix the race condition. -- Changing locking can help. But I'm not sure that it covers *all* cases. For example, what if another function calls os.waitpid() directly for whatever reason and the process completed? The pid recycling issue can also happen in this case, no? Linux pidfd really fix the issue. If someone wants to enhance subprocess to use the new os.pidfd_open(), I suggest to open a separated issue. Another approach is to emulate pidfd in userspace. Giampaolo explains that he also compares the process creation time for that (msg356571). I'm not excited to workaround OS issue this way. It requires to write platform specific get_create_time() code. I would prefer to reuse what the OS provides when available: Linux pidfd. -- This issue title is quite explicit on purpose: "subprocess.Popen.send_signal() should poll the process". I now close this issue: send_signal() does now poll :-) Because there is no *strong* consensus on my change, I decided to not backport it to stable branches 3.7 and 3.8. Thanks everyone for this very interesting discussion! |
This change caused a crash in the Duct library in Python 3.9. Duct uses the waitid(NOWAIT) strategy that Nathaniel Smith has described in this thread. With this change, the child process is reaped by kill() in a spot we didn't expect, and a subsequent call to os.waitid() raises a ChildProcessError. This is a race condition that only triggers if the child happens to exit before kill() is called. I just pushed oconnor663/duct.py@5dfae70 to work around this, which I'll release shortly as Duct version 0.6.4. Broadly speaking, this change could break any program that uses Popen.kill() together with os.waitpid() or os.waitid(). Checking Popen.returncode before calling the os functions is a good workaround for the single-threaded case, but it doesn't fix the multithreaded case. Duct is going to avoid calling Popen.kill() entirely. There are some longer comments about race conditions in that commit I linked. I don't feel strongly one way the other about keeping this new behavior, but we should probably document it clearly in Popen.send_signal() and all of its callers that these functions might reap the child. |
On Linux, a pidfd file descriptor can be created with os.pidfd_open() (added to Python 3.9). It would avoid even more race conditions. The best would be to request a pidfd file descriptor directly when we spawn the process which is possible on recent Linux kernel versions.
Feel free to propose a PR to document the behavior. I'm not sure of what you mean by send_signal() which "reaps" the child process. See also bpo-40550. Anyway, this issue and bpo-40550 are closed. I suggest to open a new issue to enhance the documentation. |
Filed separately as https://bugs.python.org/issue42558. |
Filed here #86724 |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: