Skip to content

Chapter 5: Sorting Without a Hat

George Heineman edited this page Jul 24, 2021 · 10 revisions

The following are the code listings for Chapter 5. [book.py]


Listing 5-1. Selection Sort

def selection_sort(A):
  N = len(A)
  for i in range(N-1):                     ❶
    min_index = ifor j in range(i+1, N):
      if A[j] < A[min_index]:              ❸
        min_index = j

    A[i],A[min_index] = A[min_index],A[i]  ❹

❶ Before each pass through the i for loop, you know A[0 .. i-1] is sorted.
min_index is the index location of the smallest value in A[i .. N-1].
❸ If any A[j] < A[min_index], then update min_index to remember index location for this newly discovered smallest value.
❹ Swap A[i] with A[min_index] to ensure that A[0 .. i] is sorted.

Listing 5-2. Insertion Sort

def insertion_sort(A):
  N = len(A)
  for i in range(1,N):           ❶
    for j in range(i,0,-1):      ❷
      if A[j-1] <= A[j]:         ❸
        break
      A[j],A[j-1] = A[j-1],A[j]  ❹

❶ Before each pass through the i for loop, you know A[0 .. i-1] is sorted.
❷ Decrement j from index location i back to 0 but not including 0.
❸ If A[j-1]A[j], then A[j] has found its proper spot, so stop.
❹ Otherwise, swap these out-of-order values.

Listing 5-3. Providing a comparator function to a sorting algorithm

def insertion_sort_cmp(A, less=lambda one,two: one <= two):
  N = len(A)
  for i in range(1,N):
    for j in range(i,0,-1):
      if less(A[j-1], A[j]):      ❶
        break
      A[j],A[j-1] = A[j-1],A[j]

❶ Determine sorting order using a provided comparator function, less. If less(A[x],A[y]) is True, then A[x] should appear before A[y].

Listing 5-4. Recursive implementation of factorial

def fact(N):
  if N <= 1:             ❶
    return 1
  return N * fact(N-1)   ❷

❶ Base case: return 1 for fact(1) or any N ≤ 1.
❷ Recursive case: recursively compute fact(N–1) and multiply its result by N.

Listing 5-5. Recursive algorithm to find largest value in unordered list

def find_max(A):

  def rmax(lo, hi):
    if lo == hi: return A[lo]   ❷

    mid = (lo+hi) // 2L = rmax(lo, mid)           ❹
    R = rmax(mid+1, hi)         ❺
    return max(L, R)            ❻

  return rmax(0, len(A)-1)      ❶

❶ Invoke the initial recursive call with proper arguments for lo and hi.
❷ Base case: when lo == hi, the range A[lo .. hi] contains a single value; return it as the largest value.
❸ Find midpoint index location in the range A[lo .. hi]. Use integer division // in case range has odd number of values.
L is the largest value in the range A[lo .. mid].
R is the largest value in the range A[mid+1 .. hi].
❻ The largest value in A[lo .. hi] is the maximum of L and R.

Listing 5-6. Idea for sorting recursively

def sort(A):

  def rsort(lo, hi):       ❶
    if hi <= lo: returnmid = (lo+hi) // 2
    rsort(lo, mid)         ❸
    rsort(mid+1, hi)
    merge(lo, mid, hi)     ❹

  rsort(0, len(A)-1)

❶ Recursive helper method to sort A[lo .. hi].
❷ Base case: a range with one or fewer values is already in sorted order.
❸ Recursive case: sort the left half of A and the right half of A.
❹ Merge both sorted halves of the array in place.

Listing 5-7. Recursive Merge Sort implementation

def merge_sort(A):
  aux = [None] * len(A)            ❶

  def rsort(lo, hi):
    if hi <= lo: returnmid = (lo+hi) // 2
    rsort(lo, mid)                 ❸
    rsort(mid+1, hi)
    merge(lo, mid, hi)

  def merge(lo, mid, hi):
    aux[lo:hi+1] = A[lo:hi+1]      ❹

    left = loright = mid+1

    for i in range(lo, hi+1):
      if left > mid:               ❻
        A[i] = aux[right]
        right += 1
      elif right > hi:             ❼
        A[i] = aux[left]
        left += 1
      elif aux[right] < aux[left]: ❽
        A[i] = aux[right]
        right += 1
      else:
        A[i] = aux[left]           ❾
        left += 1

  rsort(0, len(A)-1)               ❿

❶ Allocate auxiliary storage equal in size to original array.
❷ Base case: with 1 or fewer values, there is nothing to sort.
❸ Recursive case: sort left and right sub-arrays and then merge.
❹ Copy sorted sub-arrays from A into aux to prepare for merge.
❺ Set left and right to be the starting index positions of the corresponding sub-arrays.
❻ When left sub-array is exhausted, take value from right sub-array.
❼ When right sub-array is exhausted, take value from left sub-array.
❽ When right value is smaller than left value, take value from right sub-array.
❾ When left value is smaller than or equal to right value, take value from left sub-array.
❿ Invoke the initial recursive call.

Listing 5-8. Recursive Quicksort implementation

def quick_sort(A):

  def qsort(lo, hi):
    if hi <= lo:                                ❶
      return

    pivot_idx = lolocation = partition(A, lo, hi, pivot_idx)  ❸

    qsort(lo, location-1)                       ❹
    qsort(location+1, hi)

  qsort(0, len(A)-1)                            ❺

❶ Base case: with 1 or fewer values, there is nothing to sort.
❷ Choose A[lo] as the pivot value, p.
❸ Return location in A such that:

  • A[location] = p
  • All values in left sub-array A[lo .. location–1] are all ≤ p
  • All values in right sub-array A[location+1 .. hi] are all ≥ p

❹ Recursive case: sort in place left and right sub-arrays, since p is already in its proper sorted location, A[location].
❺ Invoke the initial recursive call.

Listing 5-9. Heap Sort implementation

class HeapSort:
  def __init__(self, A):
    self.A = A
    self.N = len(A)

    for k in range(self.N//2, 0, -1):      ❷
      self.sink(k)

  def sort(self):
    while self.N > 1:                      ❸
      self.swap(1, self.N)                 ❹
      self.N -= 1self.sink(1)                         ❻

  def less(self, i, j):
    return self.A[i-1] < self.A[j-1]       ❶

  def swap(self, i, j):
    self.A[i-1],self.A[j-1] = self.A[j-1],self.A[i-1]

❶ To ensure that i // 2 computes the parent index location for i, both less() and swap() subtract 1 from i and j, as if they were using 1-based indexing.
❷ Convert array to be sorted, A, into a max binary heap in bottom-up fashion, starting at N//2, the highest index position that has at least one child.
❸ The while loop continues as long as there are values to sort.
❹ Dequeue maximum value by swapping with last value in heap.
❺ Reduce size of heap by one for upcoming sink() to work.
❻ Sink the newly swapped value into its proper location, which reestablishes the heap-ordered property.

Listing 5-10. Basic Tim Sort implementation

def tim_sort(A):
  N = len(A)                                 ❶
  if N < 64:
    insertion_sort(A,0,N-1)
    return

  size = compute_min_run(N)                  ❷
  for lo in range(0, N, size):               ❸
    insertion_sort(A, lo, min(lo+size-1, N-1))

  aux = [None]*Nwhile size < N:
    for lo in range(0, N, 2*size):
      mid = min(lo + size - 1, N-1)          ❺
      hi  = min(lo + 2*size - 1, N-1)
      merge(A, lo, mid, hi, aux)             ❻

    size = 2 * size

❶ Small arrays are sorted instead using Insertion Sort.
❷ Compute size—a value typically between 32 and 64—to use for the length of the sub-arrays to be sorted.
❸ Use Insertion Sort to sort each sub-array A[lo .. lo+size–1], handling special case when final sub-array is smaller.
merge() uses extra storage equal in size to the original array.
❺ Compute index positions for two sub-arrays to be merged, A[lo .. mid] and A[mid+1 .. hi]. Take special care with partial sub-arrays.
❻ Merge sub-arrays together to sort A[lo .. hi] using aux for auxiliary storage.
❼ Once all sub-arrays of length size are merged with another, prepare for next iteration through while loop to merge sub-arrays twice as large.