GH-98363: Presize the list for batched() #98419

Merged (6 commits) on Oct 20, 2022

Conversation

@rhettinger (Contributor) commented Oct 18, 2022

Presizing the result lists saves the space lost to list over-allocation and is faster than the PyList_Append() approach.

This is a short PR, but it took a while to figure out that PyList_GetSlice() would make short work of the problem. Ideally, we would have something like _PyTuple_Resize() to resize the list in place with a realloc() and an update to the allocated field.
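For context, here is a minimal pure-Python sketch of the semantics the C code implements, modeled on the itertools batched() recipe (at the time of this PR, batched() yielded lists; `batched_py` is a stand-in name, not the actual implementation):

```python
from itertools import islice

def batched_py(iterable, n):
    """Pure-Python sketch: yield successive lists of up to n items."""
    if n < 1:
        raise ValueError('n must be at least one')
    it = iter(iterable)
    # islice pulls at most n items; the final batch may be shorter,
    # which is the case the C code handles with PyList_GetSlice().
    while batch := list(islice(it, n)):
        yield batch

print(list(batched_py('ABCDEFG', 3)))
# [['A', 'B', 'C'], ['D', 'E', 'F'], ['G']]
```

The C version presizes each result list to n up front instead of growing it incrementally, then slices the final short batch down to its actual length.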

@rhettinger added the 🔨 test-with-buildbots label on Oct 19, 2022.

@bedevere-bot: 🤖 New build scheduled with the buildbot fleet by @rhettinger for commit fd5b508 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot removed the 🔨 test-with-buildbots label on Oct 19, 2022.
 if (result == NULL) {
     return NULL;
 }
-for (i=0 ; i < bo->batch_size ; i++) {
+for (i=0 ; i < n ; i++) {
     item = PyIter_Next(it);
     if (item == NULL) {
         break;
@sweeneyde (Member) commented on this code:
Do we care how this handles strange iterators that keep going after having raised StopIteration?

>>> from itertools import batched
>>> class A:
...     def __init__(self):
...         self.i = 0
...     def __iter__(self):
...         return self
...     def __next__(self):
...         i = self.i
...         self.i += 1
...         if i == 7:
...             raise StopIteration()
...         return i
...
>>> x = batched(A(), 5)
>>> next(x)
[0, 1, 2, 3, 4]
>>> next(x)
[5, 6]
>>> next(x)
[8, 9, 10, 11, 12]

Maybe the `Py_CLEAR(bo->it)` could be moved to before this `break;`.

@rhettinger (Contributor, Author) commented Oct 20, 2022:
I'm thinking of leaving this as is. The current PR matches the behavior of the pure Python version, and the strange iterator works the same way with other tools:

>>> it = zip(range(10), A())
>>> list(it)
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)]
>>> list(it)
[(8, 8), (9, 9)]

>>> it = enumerate(A())
>>> list(it)
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)]
>>> next(it)
(7, 8)
>>> next(it)
(8, 9)

That said, I'm not really sure about this one and could easily be persuaded to add a `Py_CLEAR(bo->it)` to the code path for the short list.

@sweeneyde (Member) commented Oct 20, 2022:
This does seem like an edge case that could go either way.

I'm noticing that some iterators do Py_CLEAR(obj->it) as soon as the next() call fails, while other objects have internal iterators that are never NULL. My concern is that batched() clears the iterator sometimes (when the first next() of a batch fails), but not other times (when a subsequent next() fails).

So I would think to do one or the other, but not both:

  1. Clear the iterator immediately after any internal next() call fails, similar to pairwise, islice, chain, and cycle (sort of).

  2. Never clear the iterator and remove the if (it == NULL) check altogether, similar to dropwhile, takewhile, accumulate, compress, zip and enumerate. Unfortunately, it would mean allocating and throwing away PyList_New(n), but that's in the rare case that someone is abusing the iterator.

Maybe removing the Py_CLEAR and the NULL check is most natural?

@sweeneyde (Member) left a review comment:
Regardless of the decision about Py_CLEAR, I think this is a good improvement.

Out of curiosity, before this PR, I measured consume(batched(repeat(None, 1_000_000), 100)) to have a bimodal distribution of timings, with about half of runs at 41ms and half at 8ms (cache effects?). After this PR, I measured about 3ms consistently. This was on Windows, without PGO.
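A sketch of that measurement, using a pure-Python stand-in (`batched_py`) so it runs on any Python 3.8+ without the C batched() from this PR (consume is the itertools-recipes idiom for draining an iterator at C speed; the absolute numbers will of course differ from the Windows figures quoted above):

```python
from collections import deque
from itertools import islice, repeat
from timeit import timeit

def batched_py(iterable, n):
    # Pure-Python stand-in for the C batched() being benchmarked.
    it = iter(iterable)
    while batch := list(islice(it, n)):
        yield batch

def consume(iterator):
    # Drain an iterator, discarding every item (itertools recipes idiom).
    deque(iterator, maxlen=0)

t = timeit(lambda: consume(batched_py(repeat(None, 1_000_000), 100)),
           number=5)
print(f"{t / 5 * 1000:.1f} ms per run")
```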

@rhettinger (Contributor, Author) commented Oct 20, 2022
Thanks for the interesting line of thought. I like having this match the pure Python equivalent. Also, I don't think it would ever matter. Code like enumerate() and zip() has been deployed for 20 years with no ill effects. More-itertools has iterators that also succumb to your "strange iterator" and no one seems to have had an issue.

The likely reason you're seeing a bimodal timing distribution is that PyList_Append() calls realloc(), which runs in one of two modes: fast when the block can be extended in place, and slow when the data has to be copied because there isn't room to grow in place. The new code doesn't have that issue because PyList_New() makes a single call to malloc(), which never moves data. The new code also has better memory utilization because it never returns an over-allocated list.
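The over-allocation point is visible from Python with sys.getsizeof (a sketch; exact byte counts are CPython implementation details and vary by version):

```python
import sys

n = 100

# Built by repeated append: the list over-allocates spare capacity as it grows.
appended = []
for i in range(n):
    appended.append(i)

# Presized in one step: exactly n slots are allocated, no spare capacity.
presized = [None] * n

print(sys.getsizeof(appended))  # typically larger: includes growth headroom
print(sys.getsizeof(presized))  # smaller: sized exactly for n items
```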

carljm added a commit to carljm/cpython that referenced this pull request Oct 20, 2022
@rhettinger rhettinger deleted the better_batched branch October 20, 2022 21:38
Labels: performance (Performance or resource usage), skip news
3 participants