learning algorithm from Robert Sedgewick
- dynamic connectivity
- quick find
- quick union
- improvements
- applications
-
Steps to developing a usable algorithm.
- Model the problem.
- Find an algorithm to solve it.
- Fast enough? Fits in memory?
- If not, figure out why.
- Find a way to address the problem.
- Iterate until satisfied.
-
The scientific method.
-
Mathematical analysis.
- stacks
- resizing arrays
- queues
- generics
- iterators
- applications
- stack
- void push(String item)
- String pop()
- boolean isEmpty()
- Tradeoffs. Can implement a stack with either resizing array or linked list; client can use interchangeably. Which one is better?
Linked-list implementation.
- Every operation takes constant time in the worst case.
- Uses extra time and space to deal with the links.
Resizing-array implementation.
-
Every operation takes constant amortized time.
-
Less wasted space.
-
queue
- void enqueue(String item)
- String dequeue()
- boolean isEmpty()
-
bag
- void add(Item x)
- int size()
- Iterable iterator()
- mathematical models
- order-of-growth classifications
- theory of algorithms
- memory
the mostly common running time: 1, log N, N, NlogN, N^2, N^3, and 2^N
binary Search implement in Java - Arrays.binarySearch
- Goals.
- Establish "difficulty" of a problem.
- Develop "optimal" algorithms
- Approach.
- Suppress details in analysis: analyze "to within a constant factor".
- Eliminate variability in input model by focusing on the worst case.
- Optimal algorithm.
- Performance guarantee (to within a constant factor) for any input.
- No algorithm can provide a better performance guarantee.
example: Three-sum problem
Start.
- Develop an algorithm
- Prove a lower bound.
Gap?
- Lower the upper bound (discover a new algorithm).
- Raise the lower bound (more difficult).
Golden Age of Algorithm Design.
- 1970s-.
- Steadily decreasing upper bounds for many important problems.
- Many known optimal algorithms.
Caveats.
- Overly pessimistic to focus on worst case?
- Need better than "to within a constant factor" to predict performance.
typical memory usage for objects in Java
- Object overhead: 16bytes
- Reference: 8bytes
- Padding: Each object uses a multiple of 8 bytes.
- Primitive type: 4 bytes for int, 8 bytes for double, ...
- Object reference: 8 bytes.
- Array: 24 bytes + memory for each array entry.
- Object: 16 bytes + memory for each instance variable + 8 if
- inner class (for pointer to enclosing class).
- Padding: round up to multiple of 8 bytes.
Shallow memory usage: Don't count referenced objects.
Deep memory usage: If array entry or instance variable is a reference, add memory (recursively) for referenced object.
Turning the crank: summary
1.Empirical analysis. 2.Mathematical analysis. 3.Scientific method.
- In iteration i, find index min of smallest remaining entry.
- Swap a[i] and a[min]
- In iteration i, swap a[i] with each larger entry to its left.
- Generate a random real number for each array entry.
- Sort the array.
The convet hull of a set of N points is the smallest perimeter fence enclosing the points.
-
mechanical algorithm. Hammer nails perpendicular to plane; stretch elastic rubber band around points.
-
application: motion planning. Robot motion planning. Find shortest path in the plane from s to t that avoids a polygonal obstacle.
-
application: farthest pair. Civen N points in the plane, find a pair of points with the largest Euclidean distance between them.
Basic plan.
- Divide array into two havles.
- Recursively sort each half.
- Merge two halves.
Basic plan.
- Pass through array, merging subarrays of size 1.
- Repeat for subarrays of size 2, 4, 8, 16..
Computational complexity. Framework to study efficientcy of algorithms for solving a particular problem X.
Lower bound may not hold if the algorithm has information about:
- The initial order of the input.
- The distribution of key values.
- The representation of the keys.
- insertion sort is stable.
- selection sort is not stable.
- shell sort is not stable.
- merge sort is stable.
Basic plan.
- Shuffle the array. Shuffling is needed for performance guarantee.
- Partition so that, for some j
- entry a[j] is in place
- no larger entry to the left of j
- no smaller entry to the right of j
- Sort each piece recursively.
Worst case. Number of compares is quadratic
Average case. Number of compares is ~ 1.39 NlgN.
- 39% more compares than mergesort.
- But faster than mergesort in practice because of less data movement.
Random shuffle
- Probabilistic guarantee against worst case.
- Basis for math model that can be validated with experiments.
Proposition. Quicksort is an in-place sorting algorithm.
- Partitioning: constant extra space.
- Depth of recursion: logarithmic extra space (with high probability)
Proposition. Quicksort is not stable.
Median of sample.
- Best choice of pivot them = median
- Estimate true median by taking median of sample.
- Median-of-3(random) items.
Goal. Given an array of N items, find the top k largest Ex. Min(k=0), max(k=N-1), median(k=N/2)
Applications.
- Order statistics.
- Find the 'top k'.
Repeat in one subarray, depending on j; finished when j equals k.
Proposition. Compare-based selection algorithm whose worst-case running time is linear.
Mistake. Put all items equal to the partitioning item on one side. Consequence. ~ 1/2 N^2 compares when all keys equal.
Recommended. Stop scans on itmes equal to the partitioning item. Consequence. ~NlgN compares when all keys equal.
- let v be partiitnoning item a[lo]
- Scan i from left to right.
- (a[i] < v): exchange a[lt] with a[i]: increment both lt and i
- (A[i] > v): exchange a[gt] with a[i]: decrement gt
- (a[i] == v): increment i
-
Java Arrays.sort()
-
C qsort()
Internal sorts:
- Insertion sort, selection sort, bubble sort, shaker sort.
- Quick sort, merge sort, heap sort, sample sort, shell sort.
- Solitaire sort, red-black sort, splay sort, Yaroslavskiy sort, p-sort,...
External sorts. Poly-phase merge sort, cascade-merge, oscillating sort.
Collections.. Insert and delete items. Which item to delete?
Stack. Remove the item most recently added. Queue. Remove the item least recently added. Randomized queue. Remove a random item. Priority queue. Remove the largest(or smallest) item.
Requirement。 Generic items are Comparable.
MaxPQ()
void insert(Key v)
Key delMax()
boolean isEmpty()