Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion Failure: Deepfrozen code object gets re-quickened #92031

Closed
sweeneyde opened this issue Apr 28, 2022 · 13 comments
Closed

Assertion Failure: Deepfrozen code object gets re-quickened #92031

sweeneyde opened this issue Apr 28, 2022 · 13 comments
Labels
type-bug An unexpected behavior, bug, or error

Comments

@sweeneyde
Copy link
Member

sweeneyde commented Apr 28, 2022

A third bug found while working on #91713 🙃

Bug report

I can reproduce this locally on my Windows machine:

.\python.bat -m test test_embed -m test_finalize_structseq -R1:50

This is the result:

PS C:\Users\sween\Source\Repos\cpython2\cpython> .\python.bat -m test test_embed -m test_finalize_structseq -R1:50
Running Debug|x64 interpreter...
WARNING: Running tests with --huntrleaks/-R and less than 3 warmup repetitions can give false positives!
0:00:00 Run tests sequentially
0:00:00 [1/1] test_embed
beginning 51 repetitions
123456789012345678901234567890123456789012345678901
...................test test_embed failed -- Traceback (most recent call last):
  File "C:\Users\sween\Source\Repos\cpython2\cpython\Lib\test\test_embed.py", line 343, in test_finalize_structseq
    out, err = self.run_embedded_interpreter("test_repeated_init_exec", code)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\sween\Source\Repos\cpython2\cpython\Lib\test\test_embed.py", line 117, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 3 != 0 : bad returncode 3, stderr is '--- Loop #1 ---\n..\n----------------------------------------------------------------------\nRan 2 tests in 0.001s\n\nOK\n--- Loop #2 ---\nA\x00s\x00s\x00e\x00r\x00t\x00i\x00o\x00n\x00 \x00f\x00a\x00i\x00l\x00e\x00d\x00:\x00 \x00i\x00n\x00s\x00t\x00r\x00u\x00c\x00t\x00i\x00o\x00n\x00s\x00[\x00i\x00 \x00+\x00 \x001\x00]\x00 \x00=\x00=\x00 \x000\x00,\x00 \x00f\x00i\x00l\x00e\x00 \x00C\x00:\x00\\\x00U\x00s\x00e\x00r\x00s\x00\\\x00s\x00w\x00e\x00e\x00n\x00\\\x00S\x00o\x00u\x00r\x00c\x00e\x00\\\x00R\x00e\x00p\x00o\x00s\x00\\\x00c\x00p\x00y\x00t\x00h\x00o\x00n\x002\x00\\\x00c\x00p\x00y\x00t\x00h\x00o\x00n\x00\\\x00P\x00y\x00t\x00h\x00o\x00n\x00\\\x00s\x00p\x00e\x00c\x00i\x00a\x00l\x00i\x00z\x00e\x00.\x00c\x00,\x00 \x00l\x00i\x00n\x00e\x00 \x002\x006\x006\x00\n\x00'

test_embed failed (1 failure)

== Tests result: FAILURE ==

1 test failed:
    test_embed

Total duration: 17.7 sec
Tests result: FAILURE

A more self-contained simplified (less unittest) reproducer is this:

from test import support

import os
import os.path
import subprocess
import sys

MS_WINDOWS = (os.name == 'nt')
MACOS = (sys.platform == 'darwin')

def debug_build(program):
    program = os.path.basename(program)
    name = os.path.splitext(program)[0]
    return name.casefold().endswith("_d".casefold())

exename = "_testembed"
builddir = os.path.dirname(sys.executable)
if MS_WINDOWS:
    ext = ("_d" if debug_build(sys.executable) else "") + ".exe"
    exename += ext
    exepath = builddir
    expecteddir = os.path.join(support.REPO_ROOT, builddir)
else:
    exepath = os.path.join(builddir, 'Programs')
    expecteddir = os.path.join(support.REPO_ROOT, 'Programs')

test_exe = os.path.join(exepath, exename)

###########################################################

os.chdir(support.REPO_ROOT)

code = """if 1:
    import unittest
    print("Tests passed")
"""

cmd = [test_exe, "test_repeated_init_exec", code]

for i in range(200):
    print(i)

    p = subprocess.Popen(cmd,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE,
                         universal_newlines=True,
                         env=None,
                         cwd=None)
    (out, err) = p.communicate(input=input, timeout=None)

    print(out)
    if p.returncode != 0:
        print("bad return code", p.returncode)
        print(err)
        break

This fails with the following message:

A s s e r t i o n   f a i l e d :   i n s t r u c t i o n s [ i   +   1 ]   = =   0 ,   f i l e   C : \ U s e r s \ s w e e n \ S o u r c e \ R e p o s \ c p y t h o n 2 \ c p y t h o n \ P y t h o n \ s p e c i a l i z e . c ,   l i n e   2 6 6
@sweeneyde sweeneyde added the type-bug An unexpected behavior, bug, or error label Apr 28, 2022
@sweeneyde sweeneyde changed the title Assertion Failure in _PyCode_Quicken on Windows during test_embed.test_finalize_structseq Assertion Failure in _PyCode_Quicken during test_embed.test_finalize_structseq Apr 28, 2022
@sweeneyde
Copy link
Member Author

Replicated on linux (WSL).

@sweeneyde
Copy link
Member Author

test_repeated_init_exec is here: https://github.com/python/cpython/blob/main/Programs/_testembed.c#L167-L177

Replacing the assertion with if (instructions[i + 1]) Py_FatalError(_PyOpcode_OpName[opcode]); and running the above reproducer, I get this

Fatal Python error: _PyCode_Quicken: BINARY_OP
Python runtime state: initialized

Current thread 0x00006028 (most recent call first):
  File "<frozen importlib._bootstrap>", line 1231 in _handle_fromlist
  File "C:\Users\sween\Source\Repos\cpython2\cpython\Lib\inspect.py", line 153 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 939 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "C:\Users\sween\Source\Repos\cpython2\cpython\Lib\dataclasses.py", line 5 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 939 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "C:\Users\sween\Source\Repos\cpython2\cpython\Lib\pprint.py", line 38 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 939 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "C:\Users\sween\Source\Repos\cpython2\cpython\Lib\unittest\case.py", line 6 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 939 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "C:\Users\sween\Source\Repos\cpython2\cpython\Lib\unittest\__init__.py", line 61 in <module>
  File "<frozen importlib._bootstrap>", line 241 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 939 in exec_module
  File "<frozen importlib._bootstrap>", line 690 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 1149 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 1178 in _find_and_load
  File "<string>", line 1 in <module>

Sometimes the opcode in question changes, but it seems to always be in importlib._bootstrap._handle_fromlist.

I wonder if this has something to do with frozen modules keeping old code objects around between interpreter setup/teardown?

@ericsnowcurrently or @vstinner or @markshannon any ideas?

@sweeneyde
Copy link
Member Author

Solutions I could think of:

  1. Just remove the assertion.
  2. Enforce the assertion: set instructions[i + 1] = 0.
  3. Figure out how to make frozen modules get reloaded at each _testembed_Py_Initialize or somehow restore them to their unquickened state at each Py_Finalize();

@sweeneyde
Copy link
Member Author

This also makes me wonder if we should have some sort weekly buildbot run the tests with -R3:100 to check for this sort of thing.

@sweeneyde sweeneyde changed the title Assertion Failure in _PyCode_Quicken during test_embed.test_finalize_structseq Assertion Failure: Deepfrozen code object gets re-quickened Apr 29, 2022
@sweeneyde
Copy link
Member Author

I think this is the reason for some intermittent Windows buildbot failures.

@neonene
Copy link
Contributor

neonene commented May 1, 2022

It might be worth checking before the fix whether the reproducers go well with the buildbots.

@sweeneyde
Copy link
Member Author

@neonene Azure pipelines Ubuntu PR tests failed at #92130 on test_quickened_static_code_gets_unquickened_at_Py_FINALIZE, though the other CI checks mostly passed. It still seems rather flaky and probabilistic, though I believe that test makes the unfortunate scenario more likely.

It would be better to pin down a more deterministic (and faster-running) reproducer.

I'm not super familiar with the importlib implementation, but the key should just be to exercise _handle_fromlist (or other deepfrozen code) many times in a row until it gets quickened. It seemed like import unittest did that because unittest has a lot of dependencies, but there's probably a better way.

@neonene
Copy link
Contributor

neonene commented May 2, 2022

I saw the test at 45fbe93 also got a timeout error on Windows (x64).

@vstinner
Copy link
Member

vstinner commented May 2, 2022

Recent failure on s390x RHEL8 Refleaks 3.x (reformatted manually): https://buildbot.python.org/all/#/builders/75/builds/385

0:00:38 load avg: 4.48 [ 28/435/1] test_embed failed (1 failure) -- running: test_gdb (38.2 sec), test_tools (38.4 sec), test_zipfile (34.7 sec)
beginning 6 repetitions
123456
test test_embed failed -- Traceback (most recent call last):
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel8-z.refleak/build/Lib/test/test_embed.py", line 343, in test_finalize_structseq
    out, err = self.run_embedded_interpreter("test_repeated_init_exec", code)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel8-z.refleak/build/Lib/test/test_embed.py", line 117, in run_embedded_interpreter
    self.assertEqual(p.returncode, returncode,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: -6 != 0 : bad returncode -6, stderr is "--- Loop #1 ---\n
..\n
----------------------------------------------------------------------\n
Ran 2 tests in 0.000s\n
\n
OK\n
--- Loop #2 ---\n
_testembed: Python/specialize.c:266: _PyCode_Quicken: Assertion `instructions[i + 1] == 0' failed.\n
"

@markshannon
Copy link
Member

./python -m test test_embed -m test_finalize_structseq -R1:50 fails for me on linux as well.

I think that restoring the code to its initial state during Py_Finalize() is probably the best solution.

@gvanrossum any thoughts?

@kumaraditya303
Copy link
Contributor

kumaraditya303 commented May 2, 2022

I think this should be handled in _Py_Deepfreeze_Fini as it currently finalizes other code object's fields.

@gvanrossum
Copy link
Member

@sweeneyde Can this be closed now that GH-92031 is fixed?

@vstinner
Copy link
Member

vstinner commented May 4, 2022

Fixed by b156578

sweeneyde added a commit to sweeneyde/cpython that referenced this issue May 8, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 11, 2022
…pythonGH-92440)

(cherry picked from commit 27185f9)

Co-authored-by: Dennis Sweeney <36520290+sweeneyde@users.noreply.github.com>
miss-islington added a commit that referenced this issue May 11, 2022
…2440)

(cherry picked from commit 27185f9)

Co-authored-by: Dennis Sweeney <36520290+sweeneyde@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

6 participants