
Description
The PR in question is #38192.
@steveklabnik had some objections to new guarantees of the sort algorithm.
Old guarantees: "This sort is stable and O(n log n) worst-case but allocates approximately 2 * n where n is the length of self."
New guarantees: "This sort is stable and O(n log n) worst-case, but allocates temporary storage half the size of self."
The new version is ATM in nightly only, so we still have time to adjust it. I agree that saying "approximately half the size" would be slightly better.
However... should we perhaps leave the guarantees as they were (`2 * n`), in order not to make alternative implementations non-compliant? What exactly should the documentation say?
Here's my stance on the issue:
First, sort algorithms should never ever allocate `2 * n` additional memory. That's just way too much and I'm strongly against `2 * n`. Even the old merge sort can be easily tweaked to allocate just `n` without compromising performance.
Second, stable sorts can often get away with allocating just `n / 2`. In fact, I successfully tweaked the old merge sort to allocate just `n / 2` before I jumped on the TimSort ship.
Any stable sort that requires `n` memory can be made to work with just `n / 2`. Here's how: split the input array in half. Sort each half using the allocated auxiliary buffer (which is the same length as a half). Move one of the halves into the buffer. Merge the two halves back into the original array. Done.
Guarantees in some other programming languages:
- Python (source): TimSort allocates `n / 2`.
- Java (source): TimSort allocates `n / 2`.
- C++ (docs): `std::stable_sort` allocates `n`.
I believe it's safe to say that the standard sort allocates approximately `n / 2`, but I would be okay with guaranteeing `n`, too.
Libs team, what do you think? @bluss?