Using internal tokenize module's TokenizerIter in multiple threads crashes #120317

Closed
lysnikolaou opened this issue Jun 10, 2024 · 0 comments
Labels
3.13 (bugs and security fixes), 3.14 (new features, bugs and security fixes), topic-free-threading, type-crash (a hard crash of the interpreter, possibly with a core dump)

lysnikolaou commented Jun 10, 2024

Crash report

What happened?

Because the tokenizer is not thread-safe, using the same TokenizerIter from multiple threads under the free-threaded build leads to unpredictable behavior. It sometimes succeeds, sometimes raises a SyntaxError even though the source is valid, and sometimes crashes with the fatal error shown below.

Example error backtrace
Fatal Python error: tok_backup: tok_backup: wrong character
Python runtime state: initialized

Current thread 0x0000000172e1b000 (most recent call first):
  File "/Users/lysnikolaou/repos/python/cpython/tmp/t1.py", line 9 in next_token
  File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 58 in run
  File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 92 in _worker
  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 990 in run
  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1039 in _bootstrap_inner
  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1010 in _bootstrap

Thread 0x0000000171e0f000 (most recent call first):
  File "/Users/lysnikolaou/repos/python/cpython/tmp/t1.py", line 10 in next_token
  File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 58 in run
  File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 92 in _worker
  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 990 in run
  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1039 in _bootstrap_inner
  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1010 in _bootstrap

Thread 0x0000000170e03000 (most recent call first):
  File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/_base.py", line 550 in set_exception
  File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 60 in run
  File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 92 in _worker
  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 990 in run
  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1039 in _bootstrap_inner
  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1010 in _bootstrap

Thread 0x000000016fdf7000 (most recent call first):
  File "/Users/lysnikolaou/repos/python/cpython/tmp/t1.py", line 10 in next_token
  File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 58 in run
  File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.py", line 92 in Assertion failed: (tok->done != E_ERROR), function _syntaxerror__workerrange, file helpers.c, line 17.

  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 990 in run
  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1039 in _bootstrap_inner
  File "/Users/lysnikolaou/repos/python/cpython/Lib/threading.py", line 1010 in _bootstrap

Thread 0x000000016edeb000 (most recent call first):
  File "/Users/lysnikolaou/repos/python/cpython/tmp/t1.py", line 10 in next_token
  File "/Users/lysnikolaou/repos/python/cpython/Lib/concurrent/futures/thread.pyzsh: abort      ./python.exe tmp/t1.py

A minimal reproducer is the following:

import concurrent.futures
import io
import time
import tokenize

def next_token(it):
    while True:
        try:
            r = next(it)
            print(tokenize.TokenInfo._make(r))
            time.sleep(1)
        except StopIteration:
            return


for _ in range(20):
    with concurrent.futures.ThreadPoolExecutor() as executor:
        source = io.StringIO("a = 'abc'\nprint(b)\nfor _ in a:  do_something()")
        # A single TokenizerIter is deliberately shared by all five workers to trigger the race.
        it = tokenize._tokenize.TokenizerIter(source.readline, extra_tokens=False)
        futures = [executor.submit(next_token, it) for _ in range(5)]
        for future in concurrent.futures.as_completed(futures):
            # result() re-raises any exception (e.g. a spurious SyntaxError) from the worker.
            future.result()
        print("######################################################")
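For contrast with the reproducer above, here is a minimal sketch of a user-level workaround: serializing every `next()` call on the shared iterator behind a lock. This is not the fix applied in CPython, only an illustration of why unsynchronized access is the problem. `LockedIterator`, `drain`, and `SOURCE` are hypothetical names introduced for this example, and the sketch uses the public `tokenize.generate_tokens` instead of the internal `TokenizerIter`.

```python
import concurrent.futures
import io
import threading
import tokenize

SOURCE = "a = 'abc'\nprint(b)\nfor _ in a:  do_something()"


class LockedIterator:
    """Hypothetical helper: allow only one thread at a time to advance the iterator."""

    def __init__(self, iterator):
        self._iterator = iterator
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # The lock makes each next() call atomic with respect to other threads.
        with self._lock:
            return next(self._iterator)


def drain(it):
    # Each worker pulls tokens until the shared iterator is exhausted.
    return [tok.string for tok in it]


shared = LockedIterator(tokenize.generate_tokens(io.StringIO(SOURCE).readline))

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(drain, shared) for _ in range(5)]
    chunks = [future.result() for future in futures]

# Every token is consumed exactly once, but which worker receives it is not deterministic.
collected = [string for chunk in chunks for string in chunk]
```

When the workers do not actually need to share state, giving each thread its own tokenizer over its own `io.StringIO` avoids the need for a lock altogether.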

CPython versions tested on:

CPython main branch

Operating systems tested on:

macOS

Output from running 'python -VV' on the command line:

Python 3.14.0a0 experimental free-threading build (heads/main:c3b6dbff2c8, Jun 10 2024, 14:33:07) [Clang 15.0.0 (clang-1500.3.9.4)]
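Since the crash was reported on a free-threaded build, it can help to confirm what kind of interpreter is running before trying the reproducer. The following is a small sketch, assuming CPython 3.13 or later for `sys._is_gil_enabled()` (older versions fall back to "GIL enabled"); `free_threading_status` is a hypothetical helper name.

```python
import sys
import sysconfig


def free_threading_status():
    """Return (build_supports_free_threading, gil_currently_enabled)."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    build_supports = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in 3.13; assume the GIL is on if it is absent.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(gil_check()) if gil_check is not None else True
    return build_supports, gil_enabled


supports, gil_enabled = free_threading_status()
print(f"free-threaded build: {supports}, GIL enabled at runtime: {gil_enabled}")
```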

Linked PRs

@lysnikolaou lysnikolaou added type-crash A hard crash of the interpreter, possibly with a core dump 3.13 bugs and security fixes topic-free-threading 3.14 new features, bugs and security fixes labels Jun 10, 2024
@lysnikolaou lysnikolaou self-assigned this Jun 10, 2024
lysnikolaou added a commit that referenced this issue Jul 16, 2024
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jul 16, 2024
…honGH-120318)

(cherry picked from commit 8549559)

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
lysnikolaou added a commit that referenced this issue Jul 16, 2024
…-120318) (#121841)

(cherry picked from commit 8549559)

Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024