Description
Python allows multiple threads to concurrently access file descriptors through files and sockets. Race conditions at the Python level can lead to unexpected behavior, even with the global interpreter lock.
Thread sanitizer reports these races in some of our tests that exercise this behavior. We cannot fix these potential race conditions without introducing potential deadlocks.
For example, consider:
import threading

secrets = open("secrets.txt", "w")

def thread1():
    secrets.write("a secret")

def thread2():
    secrets.close()

def thread3():
    with open("log.txt", "w") as log:
        log.write("log message")

threading.Thread(target=thread1).start()
threading.Thread(target=thread2).start()
threading.Thread(target=thread3).start()
If you are particularly unlucky, then the file descriptor for secrets may be closed by thread2 and reused as the file descriptor for log just before thread1 writes to it. In other words, thread1 may write "a secret" to a completely different file or socket than it intended.
This can happen, even with the GIL, because the GIL is released around write() and close(). In other words, the following can happen:
- The secrets.write() call releases the GIL, but before it actually calls the C write() function on the file descriptor...
- The secrets.close() call closes the file descriptor
- The open("log.txt", "w") call re-uses the same file descriptor number
- The secrets.write() call now calls write() on the wrong file descriptor
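The interleaving above can be replayed deterministically in a single thread by working on raw descriptors with the os module. This is only a sketch: the helper name simulate_fd_reuse and the temporary file paths are hypothetical, and it assumes the operating system hands out the lowest free descriptor number on open, as POSIX systems do.

```python
import os
import tempfile


def simulate_fd_reuse():
    """Replay the interleaving in one thread (hypothetical helper, not CPython code)."""
    with tempfile.TemporaryDirectory() as tmp:
        secrets_path = os.path.join(tmp, "secrets.txt")
        log_path = os.path.join(tmp, "log.txt")

        # thread1 has looked up the descriptor number but has not yet called write().
        stale_fd = os.open(secrets_path, os.O_WRONLY | os.O_CREAT, 0o600)

        # thread2: secrets.close()
        os.close(stale_fd)

        # thread3: open("log.txt", "w") typically receives the lowest free number.
        log_fd = os.open(log_path, os.O_WRONLY | os.O_CREAT, 0o600)
        reused = log_fd == stale_fd

        try:
            if reused:
                # thread1's delayed write() now targets log.txt instead of secrets.txt.
                os.write(stale_fd, b"a secret")
        finally:
            os.close(log_fd)

        # Read back what ended up in the log file.
        with open(log_path, "rb") as f:
            return reused, f.read()
```

Calling simulate_fd_reuse() returns whether the descriptor number was reused and the bytes that landed in log.txt. When the number is reused, the secret ends up in the log file.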
Note that we must release the GIL before calling write() or close() because these functions potentially block.
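To illustrate why blocking calls have to run without the GIL, here is a minimal sketch using a pipe. The helper name blocking_read_demo is hypothetical, and the short sleep is only a best-effort way to let the reader thread reach its blocking read first. If os.read() held the GIL while waiting, the main thread could never run the os.write() that unblocks it.

```python
import os
import threading
import time


def blocking_read_demo():
    """Show a thread blocked in read() while another thread keeps running."""
    read_fd, write_fd = os.pipe()
    received = []

    def reader():
        # Blocks until data arrives; the GIL is released while waiting.
        received.append(os.read(read_fd, 16))

    t = threading.Thread(target=reader)
    t.start()

    # Give the reader a moment to (most likely) block inside os.read().
    time.sleep(0.1)

    # The main thread can still run Python code and unblock the reader.
    os.write(write_fd, b"ping")
    t.join(timeout=5)

    os.close(read_fd)
    os.close(write_fd)
    return received
```

The same release of the GIL that keeps the interpreter responsive here is what opens the window for the descriptor to be closed and reused.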