-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-33234 Improve list() pre-sizing for inputs with known lengths (no __length_hint__) #9846
Conversation
The list() constructor isn't taking full advantage of known input lengths or length hints. This commit makes the constructor pre-size and not over-allocate when the input size is known or can be reasonably estimated. A new test is added to test_list.py and a test needed to be modified in test_descr.py as it assumed that there is only one call to __length_hint__() in the list constructor.
…ce for small iterables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but the NEWS entry should be fixed.
@@ -0,0 +1,2 @@ | |||
The list constructor will pre-size and not over-allocate when the input size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
length, not size.
@@ -0,0 +1,2 @@ | |||
The list constructor will pre-size and not over-allocate when the input size | |||
is known or can be reasonably estimated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"or can be reasonably estimated" I don't understand this part.
Objects/listobject.c
Outdated
@@ -2649,6 +2675,13 @@ list___init___impl(PyListObject *self, PyObject *iterable) | |||
(void)_list_clear(self); | |||
} | |||
if (iterable != NULL) { | |||
Py_ssize_t iter_len = PyObject_Length(iterable); | |||
if (iter_len == -1){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: ...) { (add a space)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I suggest to wait the approval of a second core dev before merging this change. |
I would like @serhiy-storchaka comment and advice on this if possible. |
@pablogsal: If you want me to merge your PR, I would be me confident if you write a benchmark for the worst case: a small container (ex: 10 items or less) without a known length. I expect that the additional function call (obj.len()) would slow down this case. I would be interested to see the overhead. |
Experiment: import perf
runner = perf.Runner()
runner.timeit("list(x)",
stmt="list(x)",
setup="x = iter(list(range(10)))") result:
With the experiment: import perf
runner = perf.Runner()
runner.timeit("list(x)",
stmt="list(x)",
setup="x = [0]*10000") the result is
I would say that this result points to abandon this version of the change, given the bad performance for small iterables. |
This is the first experiment using
|
This overhead likely comes from the raisin an exception + PyErr_Clear(). hasattr() has been optimized by adding a _PyObject_LookupAttr() which is getattr() without raising an exception if the attribute doesn't exist. Probing an object size is a common need for optimization to preallocate a container. Python builtin types would benefit of such function. What do you think of adding something like _PyObject_ProbeLength() which is like PyObject_Length() without raising an exception? |
Objects/listobject.c
Outdated
@@ -2649,6 +2675,13 @@ list___init___impl(PyListObject *self, PyObject *iterable) | |||
(void)_list_clear(self); | |||
} | |||
if (iterable != NULL) { | |||
Py_ssize_t iter_len = PyObject_Length(iterable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reduce or avoid the overhead of the worst case, you can avoid this call if _PyObject_HasLen() is false.
This change doesn't require to implement a new _PyObject_ProbeLength().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated run of first experiment using _PyObject_HasLen()
:
Mean +- std dev: [old] 795 ns +- 25 ns -> [new] 776 ns +- 25 ns: 1.02x faster (-6%)
Updated run of first experiment using
|
If it's the "x = iter(list(range(10)))" worst case: it confirms what I expected, _PyObject_HasLen() makes the overhead on the worst case not significant. Good. I guess that you kept the same speedup for the best case? |
Objects/listobject.c
Outdated
@@ -2649,6 +2675,15 @@ list___init___impl(PyListObject *self, PyObject *iterable) | |||
(void)_list_clear(self); | |||
} | |||
if (iterable != NULL) { | |||
if(_PyObject_HasLen(iterable)){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a space before {
.
Objects/listobject.c
Outdated
@@ -2649,6 +2675,15 @@ list___init___impl(PyListObject *self, PyObject *iterable) | |||
(void)_list_clear(self); | |||
} | |||
if (iterable != NULL) { | |||
if(_PyObject_HasLen(iterable)){ | |||
Py_ssize_t iter_len = PyObject_Length(iterable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyObject_Length
is an alias of PyObject_Size
(for backward compatibility). It is better to use PyObject_Size
.
Objects/listobject.c
Outdated
allocated = 0; | ||
} | ||
num_allocated_bytes = allocated * sizeof(PyObject *); | ||
items = (PyObject **)PyMem_Malloc(num_allocated_bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would not be easier to use PyMem_New
?
Performance metrics with commit b678442:
and perf information:
Also, let's not forget the original objective of this patch: PATCH>>> import sys
>>> x = list([0]*1000000)
>>>
KeyboardInterrupt
>>> sys.getsizeof(x)
8000080 MASTER>>> import sys
>>> x = list([0]*1000000)
>>>
KeyboardInterrupt
>>> sys.getsizeof(x)
9000120 This is a 12 % improvement in memory claimed by the list in these cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, it's now even better with _PyObject_HasLen()!
Objects/listobject.c
Outdated
if (iter_len == -1) { | ||
if (PyErr_ExceptionMatches(PyExc_Exception)) { | ||
PyErr_Clear(); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: PEP 7 requires
}
else {
Objects/listobject.c
Outdated
if (_PyObject_HasLen(iterable) && self->ob_item == NULL) { | ||
Py_ssize_t iter_len = PyObject_Size(iterable); | ||
if (iter_len == -1) { | ||
if (PyErr_ExceptionMatches(PyExc_Exception)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What tests will fail if not silence any exceptions at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least the following:
test_generators failed
**********************************************************************
File "/home/pablogsal/github/cpython/Lib/test/test_generators.py", line ?, in test.test_generators.__test__.weakref
Failed example:
list(p)
Exception raised:
Traceback (most recent call last):
File "/home/pablogsal/github/cpython/Lib/doctest.py", line 1329, in __run
compileflags, 1), test.globs)
File "<doctest test.test_generators.__test__.weakref[9]>", line 1, in <module>
list(p)
TypeError: object of type 'generator' has no len()
**********************************************************************
1 items had failures:
1 of 10 in test.test_generators.__test__.weakref
***Test Failed*** 1 failures.
test_genexps failed
**********************************************************************
File "/home/pablogsal/github/cpython/Lib/test/test_genexps.py", line ?, in test.test_genexps.__test__.doctests
Failed example:
list(p)
Exception raised:
Traceback (most recent call last):
File "/home/pablogsal/github/cpython/Lib/doctest.py", line 1329, in __run
compileflags, 1), test.globs)
File "<doctest test.test_genexps.__test__.doctests[75]>", line 1, in <module>
list(p)
TypeError: object of type 'generator' has no len()
**********************************************************************
1 items had failures:
1 of 76 in test.test_genexps.__test__.doctests
***Test Failed*** 1 failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can _PyObject_HasLen()
return true for generators?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it is not true generator, it is a weakref. What if silence just TypeError?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit afraid if someone has similar tests/usage like these two cases and then they will receive new errors as raising all exceptions would be technically a backwards incompatible change. But if you think we should raise (maybe ignoring TypeError
) I am happy to change the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See PyObject_LengthHint()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 0a6c8ba
Objects/listobject.c
Outdated
if (PyErr_ExceptionMatches(PyExc_Exception)) { | ||
PyErr_Clear(); | ||
} | ||
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: for readability, I suggest to invert the if condition and exchange the true/false blocks, and move PyErr_Clear() after the if:
if (!not exception) { return -1; }
PyErr_Clear();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
if (!PyErr_ExceptionMatches(PyExc_TypeError)) { | ||
return -1; | ||
} | ||
PyErr_Clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pablogsal, @serhiy-storchaka: In a previous comment, I proposed to add an helper function to "probe" an object size: so move this code into a private helper function. Since the same code is used by PyObject_LengthHint(), it would now make sense, no?
See also iter_len() ... which is different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code will be used just in two places and is not too complex. Adding yet one intermediate function will add a performance penalty for calling a function and checking its result, and will complicate the code. If this code will be used in more places, it can be refactored.
This is a version of #6493 that only preallocates the list when
__len__
is available to reduce memory overallocation and possible performance hit on big iterables.This are the L1 and L2 cache misses and related information:
THIS PR
MASTER
https://bugs.python.org/issue33234