Skip to content

Conversation

sweeneyde
Copy link
Member

@sweeneyde sweeneyde commented Oct 27, 2021

@sweeneyde
Copy link
Member Author

Microbenchmarks are good, pyperformance has a couple of the expected speedups, especially nobody (1.21x faster!), some slowdowns that I'm guessing are just my unstable benchmarking. https://gist.github.com/sweeneyde/a3bef64bf07560c90dd4bdad820b4a07

@sweeneyde sweeneyde marked this pull request as ready for review October 27, 2021 21:11
@sweeneyde sweeneyde added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 27, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @sweeneyde for commit 3245f21 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 27, 2021
@sweeneyde
Copy link
Member Author

@markshannon
Copy link
Member

markshannon commented Oct 28, 2021

Nice speedups for nbody and the microbenchmark results look promising.

I note that you have added a lot of code just for specializing STORE_SUBSCR for dicts.
How much difference does it make in terms of performance compared to a simple specialization for dict that uses PyDict_SetItem?

@markshannon markshannon self-assigned this Oct 28, 2021
@sweeneyde
Copy link
Member Author

I just got these microbenchmarks for the difference between dict[object] versus dict[str]+dict[general]:

Benchmark one_dict_opcode two_dict_opcodes
dict[str]=... 15.8 ms 14.0 ms: 1.12x faster
dict[tuple]=... 21.7 ms 21.0 ms: 1.03x faster
dict[int]=... 18.0 ms 17.4 ms: 1.03x faster
Geometric mean (ref) 1.04x faster

Benchmark hidden because not significant (2): bytearray[int]=..., list[int]=...

@markshannon
Copy link
Member

What about for the standard benchmark set?
Microbenchmarks like those can't be trusted because they have such unrealistic icache effects. A 4% improvement for dict[...] = ... only needs to cause a fraction of a percent slowdown on other code to result in an overall slow down.

@sweeneyde
Copy link
Member Author

Most of pyperformance doesn't really make use of dicts, other than django_template, where 99% of dict lookups there are looking up strings that already have their hash computed (see faster-cpython/ideas#105). Running just that one benchmark, I got

Benchmark django_one_dict_opcode django_two_dict_opcodes
django_template 46.0 ms 45.5 ms: 1.01x faster

@sweeneyde
Copy link
Member Author

Pyperformance under-utilizes dicts IMO, but I think dict[str] is common enough in real code that it might make sense to specialize for that extra 12% microbenchmark.

But if your suggestion is to keep the complexity at bay and only have one dict[object] specialization opcode, I can do that as well.

@markshannon
Copy link
Member

But if your suggestion is to keep the complexity at bay and only have one dict[object] specialization opcode, I can do that as well.

I would rather have just the one specialization for dict.
Keeping the code small is good, although it has less to do with abstract concepts like complexity, and more to do with concrete things like icache usage.

I realize this is just generalizing and I don't have data to back it up. Hopefully we'll start gathering hardware metrics along with timings in the future which will help us make better informed decisions on such things.

@markshannon
Copy link
Member

Thanks, looks good. I'll benchmark it later.

One thing from your earlier version that is probably worth doing in a future PR is the reference stealing. Something like 5ccb398#diff-b08a47ddc5bc20b2e99ac2e5aa199ca24a56b994e7bc64e918513356088c20aeR1633
Let's call it _PyDict_SetItem

PyDict_SetItem can then wrap that function, to avoid code duplication:

int
PyDict_SetItem(PyObject *op, PyObject *key, PyObject *value)
{
    if (!PyDict_Check(op)) {
        PyErr_BadInternalCall();
        return -1;
    }
    Py_INCREF(key);
    Py_INCREF(value);
    return _PyDict_SetItem((PyDictObject *)op, key, value);
}

and STORE_SUBSCR_DICT can call _PyDict_SetItem directly.

It shouldn't be that much more code if we make insertdict consume the references as well, as it will just be a case of moving the INCREF/DECREFs about.

@sweeneyde
Copy link
Member Author

Ah I missed "in a future PR", but I don't think this is a huge change.

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you've added _PyDict_SetItem[_Take[s_2]] you might as well use it elsewhere.
MAP_ADD is the obvious candidate.

int
PyDict_SetItem(PyObject *op, PyObject *key, PyObject *value)
_PyDict_SetItem(PyObject *op, PyObject *key, PyObject *value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make the first argument PyDictObject *mp. It has to be a dict, so we might as well tell the C compiler that.

@@ -22,6 +22,8 @@ typedef struct {
*/
Py_ssize_t _Py_dict_lookup(PyDictObject *mp, PyObject *key, Py_hash_t hash, PyObject **value_addr);

/* Steals key and value */
int _PyDict_SetItem(PyObject *op, PyObject *key, PyObject *value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that I suggested _PyDict_SetItem but can we make it clearer that it consumes references to key and values.

BTW I don't like the term "steal". That suggests that the caller is unaware that the references are taken, but it has to know that. I prefer "consume", or "take". Rust uses "take", IIRC.

So how about _PyDict_SetItem_Take, or even _PyDict_SetItem_Takes2 to show that two references are taken.

@markshannon markshannon added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 10, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @markshannon for commit 9c6860d 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 10, 2021
@sweeneyde
Copy link
Member Author

The one failing "buildbot/AMD64 Arch Linux Asan Debug PR" is on test_gdb:

Click to exapnd

0:53:59 load avg: 1.76 Re-running failed tests in verbose mode
0:53:59 load avg: 1.76 Re-running test_gdb in verbose mode (matching: test_bt, test_bt_full, test_threads, test_pyup_command, test_up_then_down)
test_bt (test.test_gdb.PyBtTests)
Stderr:
warning: Error disabling address space randomization: Operation not permitted
Python Exception <class 'AttributeError'>: 'NoneType' object has no attribute 'previous'
Error occurred in Python: 'NoneType' object has no attribute 'previous'
Verify that the "py-bt" command works ... FAIL
test_bt_full (test.test_gdb.PyBtTests)
Stderr:
warning: Error disabling address space randomization: Operation not permitted
Python Exception <class 'AttributeError'>: 'NoneType' object has no attribute 'previous'
Error occurred in Python: 'NoneType' object has no attribute 'previous'
Verify that the "py-bt-full" command works ... FAIL
test_threads (test.test_gdb.PyBtTests)
Stderr:
warning: Error disabling address space randomization: Operation not permitted
Error occurred in Python: 'NoneType' object has no attribute 'previous'
Verify that "py-bt" indicates threads that are waiting for the GIL ... FAIL
test_pyup_command (test.test_gdb.StackNavigationTests)
Stderr:
warning: Error disabling address space randomization: Operation not permitted
Python Exception <class 'AttributeError'>: 'NoneType' object has no attribute 'previous'
Error occurred in Python: 'NoneType' object has no attribute 'previous'
Verify that the "py-up" command works ... FAIL
test_up_then_down (test.test_gdb.StackNavigationTests)
Stderr:
warning: Error disabling address space randomization: Operation not permitted
Python Exception <class 'AttributeError'>: 'NoneType' object has no attribute 'previous'
Error occurred in Python: 'NoneType' object has no attribute 'previous'
Python Exception <class 'AttributeError'>: 'NoneType' object has no attribute 'previous'
Error occurred in Python: 'NoneType' object has no attribute 'previous'
Verify "py-up" followed by "py-down" ... FAIL
======================================================================
FAIL: test_bt (test.test_gdb.PyBtTests)
Verify that the "py-bt" command works
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/buildbot/buildarea/pull_request.pablogsal-arch-x86_64.asan_debug/build/Lib/test/test_gdb.py", line 796, in test_bt
    self.assertMultilineMatches(bt,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/buildbot/buildarea/pull_request.pablogsal-arch-x86_64.asan_debug/build/Lib/test/test_gdb.py", line 297, in assertMultilineMatches
    self.fail(msg='%r did not match %r' % (actual, pattern))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'Breakpoint 1 at 0x813400: file Python/bltinmodule.c, line 1196.\n[Thread debugging using libthread_db enabled]\nUsing host libthread_db library "/usr/lib/libthread_db.so.1".\n\nBreakpoint 1, builtin_id (self=, v=42) at Python/bltinmodule.c:1196\n1196\t{\nTraceback (most recent call first):\n  <built-in method id of module object at remote 0x608000071840>\n  (unable to read python frame information)\n' did not match '^.*\nTraceback \\(most recent call first\\):\n  <built-in method id of module object .*>\n  File ".*gdb_sample.py", line 10, in baz\n    id\\(42\\)\n  File ".*gdb_sample.py", line 7, in bar\n    baz\\(a, b, c\\)\n  File ".*gdb_sample.py", line 4, in foo\n    bar\\(a=a, b=b, c=c\\)\n  File ".*gdb_sample.py", line 12, in <module>\n    foo\\(1, 2, 3\\)\n'
Stderr:
warning: Error disabling address space randomization: Operation not permitted
Python Exception <class 'AttributeError'>: 'NoneType' object has no attribute 'previous'
Error occurred in Python: 'NoneType' object has no attribute 'previous'
======================================================================
FAIL: test_bt_full (test.test_gdb.PyBtTests)
Verify that the "py-bt-full" command works
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/buildbot/buildarea/pull_request.pablogsal-arch-x86_64.asan_debug/build/Lib/test/test_gdb.py", line 816, in test_bt_full
    self.assertMultilineMatches(bt,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/buildbot/buildarea/pull_request.pablogsal-arch-x86_64.asan_debug/build/Lib/test/test_gdb.py", line 297, in assertMultilineMatches
    self.fail(msg='%r did not match %r' % (actual, pattern))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'Breakpoint 1 at 0x813400: file Python/bltinmodule.c, line 1196.\n[Thread debugging using libthread_db enabled]\nUsing host libthread_db library "/usr/lib/libthread_db.so.1".\n\nBreakpoint 1, builtin_id (self=, v=42) at Python/bltinmodule.c:1196\n1196\t{\n#1 <built-in method id of module object at remote 0x608000071840>\n#4 (unable to read python frame information)\n' did not match '^.*\n#[0-9]+ Frame 0x-?[0-9a-f]+, for file .*gdb_sample.py, line 7, in bar \\(a=1, b=2, c=3\\)\n    baz\\(a, b, c\\)\n#[0-9]+ Frame 0x-?[0-9a-f]+, for file .*gdb_sample.py, line 4, in foo \\(a=1, b=2, c=3\\)\n    bar\\(a=a, b=b, c=c\\)\n#[0-9]+ Frame 0x-?[0-9a-f]+, for file .*gdb_sample.py, line 12, in <module> \\(\\)\n    foo\\(1, 2, 3\\)\n'
Stderr:
warning: Error disabling address space randomization: Operation not permitted
Python Exception <class 'AttributeError'>: 'NoneType' object has no attribute 'previous'
Error occurred in Python: 'NoneType' object has no attribute 'previous'
======================================================================
FAIL: test_threads (test.test_gdb.PyBtTests)
Verify that "py-bt" indicates threads that are waiting for the GIL
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/buildbot/buildarea/pull_request.pablogsal-arch-x86_64.asan_debug/build/Lib/test/test_gdb.py", line 853, in test_threads
    self.assertIn('Waiting for the GIL', gdb_output)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'Waiting for the GIL' not found in 'Breakpoint 1 at 0x813400: file Python/bltinmodule.c, line 1196.\n[Thread debugging using libthread_db enabled]\nUsing host libthread_db library "/usr/lib/libthread_db.so.1".\n[New Thread 0x7fa4c6189640 (LWP 2729264)]\n[New Thread 0x7fa4c5972640 (LWP 2729265)]\n[New Thread 0x7fa4c515b640 (LWP 2729266)]\n[New Thread 0x7fa4c4944640 (LWP 2729267)]\n\nThread 1 "python" hit Breakpoint 1, builtin_id (self=, v=42) at Python/bltinmodule.c:1196\n1196\t{\n\nThread 5 (Thread 0x7fa4c4944640 (LWP 2729267) "python"):\n'
Stderr:
warning: Error disabling address space randomization: Operation not permitted
Error occurred in Python: 'NoneType' object has no attribute 'previous'
======================================================================
FAIL: test_pyup_command (test.test_gdb.StackNavigationTests)
Verify that the "py-up" command works
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/buildbot/buildarea/pull_request.pablogsal-arch-x86_64.asan_debug/build/Lib/test/test_gdb.py", line 753, in test_pyup_command
    self.assertMultilineMatches(bt,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/buildbot/buildarea/pull_request.pablogsal-arch-x86_64.asan_debug/build/Lib/test/test_gdb.py", line 297, in assertMultilineMatches
    self.fail(msg='%r did not match %r' % (actual, pattern))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'Breakpoint 1 at 0x813400: file Python/bltinmodule.c, line 1196.\n[Thread debugging using libthread_db enabled]\nUsing host libthread_db library "/usr/lib/libthread_db.so.1".\n\nBreakpoint 1, builtin_id (self=, v=42) at Python/bltinmodule.c:1196\n1196\t{\n#4 (unable to read python frame information)\n#12 <built-in method pyobject_fastcall of module object at remote 0x6080001184c0>\n' did not match '^.*\n#[0-9]+ Frame 0x-?[0-9a-f]+, for file <string>, line 12, in baz \\(args=\\(1, 2, 3\\)\\)\n#[0-9]+ <built-in method pyobject_fastcall of module object at remote 0x[0-9a-f]+>\n$'
Stderr:
warning: Error disabling address space randomization: Operation not permitted
Python Exception <class 'AttributeError'>: 'NoneType' object has no attribute 'previous'
Error occurred in Python: 'NoneType' object has no attribute 'previous'
======================================================================
FAIL: test_up_then_down (test.test_gdb.StackNavigationTests)
Verify "py-up" followed by "py-down"
----------------------------------------------------------------------
test test_gdb failed
Traceback (most recent call last):
  File "/buildbot/buildarea/pull_request.pablogsal-arch-x86_64.asan_debug/build/Lib/test/test_gdb.py", line 782, in test_up_then_down
    self.assertMultilineMatches(bt,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/buildbot/buildarea/pull_request.pablogsal-arch-x86_64.asan_debug/build/Lib/test/test_gdb.py", line 297, in assertMultilineMatches
    self.fail(msg='%r did not match %r' % (actual, pattern))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'Breakpoint 1 at 0x813400: file Python/bltinmodule.c, line 1196.\n[Thread debugging using libthread_db enabled]\nUsing host libthread_db library "/usr/lib/libthread_db.so.1".\n\nBreakpoint 1, builtin_id (self=, v=42) at Python/bltinmodule.c:1196\n1196\t{\n#4 (unable to read python frame information)\n#12 <built-in method pyobject_fastcall of module object at remote 0x6080001184c0>\n#4 (unable to read python frame information)\n' did not match '^.*\n#[0-9]+ Frame 0x-?[0-9a-f]+, for file <string>, line 12, in baz \\(args=\\(1, 2, 3\\)\\)\n#[0-9]+ <built-in method pyobject_fastcall of module object at remote 0x[0-9a-f]+>\n#[0-9]+ Frame 0x-?[0-9a-f]+, for file <string>, line 12, in baz \\(args=\\(1, 2, 3\\)\\)\n$'
Stderr:
warning: Error disabling address space randomization: Operation not permitted
Python Exception <class 'AttributeError'>: 'NoneType' object has no attribute 'previous'
Error occurred in Python: 'NoneType' object has no attribute 'previous'
Python Exception <class 'AttributeError'>: 'NoneType' object has no attribute 'previous'
Error occurred in Python: 'NoneType' object has no attribute 'previous'

@markshannon
Copy link
Member

markshannon commented Nov 12, 2021

The buildbot failure should have been fixed by #29515

@sweeneyde sweeneyde added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 16, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @sweeneyde for commit 9f9ef35 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 16, 2021
@sweeneyde
Copy link
Member Author

sweeneyde commented Nov 17, 2021

These were the latest failures... they seem unrelated to me.

(1) AMD64 Arch Linux Asan Debug PR:
    in test_gdb:
        Error disabling address space randomization: Operation not permitted
        Verify that the "py-bt" command works ... FAIL
        Verify that the "py-bt-full" command works ... FAIL
        Verify that "py-bt" indicates threads that are waiting for the GIL ... FAIL
        Verify that the "py-up" command works ... FAIL
        Verify "py-up" followed by "py-down" ... FAIL

(2) AMD64 Windows10 PR:
(3) AMD64 Windows10 Pro PR:
(4) ARM64 Windows Non-Debug PR:
(5) ARM64 Windows PR:
    (See https://bugs.python.org/issue45078)
    FAIL: test_read_bytes (test.test_importlib.test_files.OpenNamespaceTests)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "D:\buildarea\pull_request.bolen-windows10\build\lib\test    \test_importlib\test_files.py", line 14, in test_read_bytes
        assert actual == b'Hello, UTF-8 world!\n'
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    AssertionError

(6) ARM64 macOS PR:
    (See https://bugs.python.org/issue44359)
    test_ftplib failed (env changed)
    ssl.SSLZeroReturnError: TLS/SSL connection has been closed (EOF) (_ssl.c:998)

(7) PPC64LE RHEL8 LTO PR:
    test_multiprocessing_spawn failed (env changed)
    Warning -- multiprocessing.Manager still has [<SpawnProcess name='SyncManager-140' pid=4073709 parent=4066984 started>] active children after 5.112984410952777 seconds

    Traceback (most recent call last):
      File "/home/buildbot/buildarea/pull_request.cstratak-RHEL8-ppc64le.lto/build/Lib/multiprocessing/resource_tracker.py", line 209, in main
        cache[rtype].remove(name)
        ^^^^^^^^^^^^^^^^^^^^^^^^^
    KeyError: '/psm_6aa8d9ef'

@markshannon
Copy link
Member

markshannon commented Nov 18, 2021

Results are good

I'll be happy to merge this, once the conflicts are fixed.

@markshannon markshannon merged commit 036fead into python:main Nov 19, 2021
remykarem pushed a commit to remykarem/cpython that referenced this pull request Dec 7, 2021
* Specialize STORE_SUBSCR for list[int], and dict[object]

* Adds _PyDict_SetItem_Take2 which consumes references to the key and values.
@sweeneyde sweeneyde deleted the store_subscr2 branch December 19, 2021 05:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants