This benchmark consists of the submissions to Track A of the Powersort Competition.
The goal of this repository is to provide a benchmark for run-adaptive sorting methods. It consists of a dataset of input orderings (permutations) that demonstrate the differences between the Powersort merge policy and the original Timsort merge policy.
Timsort and Powersort are adaptive sorting algorithms: both are faster if the input contains long presorted segments (“runs”). However, they differ in their merge policy, i.e., the order in which they combine these naturally occurring runs into longer runs.
Powersort solves the task of finding good merge trees by implicitly solving an optimization problem, namely computing a nearly optimal binary search tree over the run boundaries; Timsort instead uses local rules based on the lengths of the runs on its stack. A much more detailed description of the differences between Timsort and Powersort can be found in listsort.txt in the CPython source code. A short proof of correctness and an analysis of the merge cost of Powersort appear in the Multiway Powersort paper.
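To give a rough idea of how Powersort decides where a run boundary should sit in the merge tree, the sketch below shows a floating-point version of its “power” computation, following the description in listsort.txt and the Powersort paper. The function name and interface are chosen here for illustration only; the CPython implementation uses integer arithmetic instead of floats.

```python
def node_power(s1, n1, n2, n):
    """Power of the boundary between two adjacent runs.

    s1: start index of the left run, n1: its length,
    n2: length of the right run, n: total length of the array.
    Powersort merges runs so that boundaries with higher power
    end up deeper in the merge tree.
    """
    # Midpoints of the two runs, normalized to [0, 1).
    a = (s1 + n1 / 2) / n
    b = (s1 + n1 + n2 / 2) / n
    # Smallest p such that floor(a * 2**p) != floor(b * 2**p).
    p = 0
    while int(a) == int(b):
        a *= 2
        b *= 2
        p += 1
    return p
```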
The task in Track A of the competition was to find orderings of input lists on which the merge policies of Timsort and Powersort have (largely) different costs. Each submission is a text file with a single list of elements, separated by commas (a Python list expression), e.g., “[11, 12, 13, 14, 1, 2, 3]”.
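A minimal sketch of how such a submission file could be loaded and its run structure inspected is shown below; the file name `submission.txt` is hypothetical, and the run-counting helper is only meant to illustrate the kind of runs the two algorithms detect.

```python
import ast

def load_input(path):
    """Parse a submission file containing a single Python list expression."""
    with open(path) as f:
        return ast.literal_eval(f.read().strip())

def count_runs(xs):
    """Count maximal runs as Timsort/Powersort would detect them:
    non-decreasing runs, or strictly decreasing runs (which get reversed)."""
    n = len(xs)
    runs = 0
    i = 0
    while i < n:
        runs += 1
        j = i + 1
        if j < n and xs[j] < xs[i]:
            # strictly decreasing run
            while j < n and xs[j] < xs[j - 1]:
                j += 1
        else:
            # non-decreasing run
            while j < n and xs[j] >= xs[j - 1]:
                j += 1
        i = j
    return runs

xs = load_input("submission.txt")  # hypothetical file name
print(len(xs), "elements,", count_runs(xs), "runs")
```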
For more information and the ranking of submissions, see the competition website.
The inputs in this repository are provided as is, without any warranty. By using them, you agree to the license terms.
The inputs were originally submitted to the Powersort Competition by various authors, in particular
- Daniel Chadwick
- Ziad Ismaili Alaoui
- Christopher Jefferson
- Thomas Johnson
- Vincent Jugé