A quick study sheet I use as a refresher 😄
Table of Contents generated with DocToc
- Data Structures
- Algorithms
- Other Concepts
- Math
- Common Problems
- Just Python Things
- Problem-solving Strategies
- An array is a collection with specified size
- Dynamic array: some languages' implementations automatically expand as you add elements
- Access elements directly by index
- Time complexity:
- Access by index:
O(1) - Search by value:
O(n) - Insert:
O(n)(need to shift values) - Delete:
O(n)(need to shift values)
- Access by index:
- A linked list is a collection of nodes where each node has a value and a reference
- Singly-linked list: nodes have pointers to the next node
- Doubly-linked list: nodes have pointers to next and previous nodes
- Time complexity:
- Access by index:
O(n) - Search by value:
O(n) - Insert:
O(1) - Delete:
O(1)
- Access by index:
- Stack: last in, first out (LIFO)
- Queue: first in, first out (FIFO)
- Double-ended queue: stack + queue combined
- Principal operations:
pushadds elements &popextracts elements
- A tree is an undirected, connected, acyclic graph
- Has
vvertices andv-1edges - Any two vertices are connected by a unique path
- A leaf is a vertex of degree 1
- One node is designated as the root
- Each node has parent and/or children pointers
- A node's height is the length of its path to the root
- Has
- A forest has multiple distinct trees (a disjoint union)
- An n-ary tree has at most
nchildren per node
- A binary tree has nodes with at most 2 children (designated left & right)
- Full: every node has 0 or 2 children
- Number of nodes is at most
2^(h+1)-1
- Number of nodes is at most
- Complete: every level, except possibly the last, is filled, and the last level's nodes are as far left as possible
- Number of internal nodes:
floor(n/2)
- Number of internal nodes:
- Balanced: has the minimum possible maximum depth
- Height is
ceil(lg(n+1))
- Height is
- Traversals:
- Pre-order: open current, visit left subtree, visit right subtree
- In-order: visit left subtree, open current, visit right subtree (returns sorted list)
- Post-order: visit left subtree, visit right subtree, open current
- Level-order: breadth-first traversal, level by level
- A binary search tree is an ordered binary tree
- Satisfies the BST property: each node's value is greater than all keys stored in the left subtree and less than all keys stored in the right subtree
- Designed to make searching faster--each comparison allows operations to skip about half the tree
- Search: recursively search subtrees; takes
O(h) - Insertion: like search, but insert node when a leaf is reached; takes
O(h) - Deletion: more complicated; takes
O(h)- If deleting a node with no children, just do it
- If deleting a node with a single child, replace the node with its subtree
- If deleting a node with two children, swap with minimum value in right subtree or maximum value in left subtree, then delete the node (which should now be a leaf)
- Self-balancing binary search tree
- Lookup, insertion, and deletion all take
O(log n)
- Also called a prefix tree
- Stores subsequences of values; useful for autocomplete
- Hash function: a function mapping an object to an integer such that if
a==b,H(a)==H(b) - A hash table is an array whose indices correspond to results from a hash function (implemented as a dictionary in Python)
- Provides
O(1)lookup - Collision resolution & table doubling
- Special tree where nodes have higher (in the case of a min-heap) values than their parents
- Binary heap:
- Heapify in
O(n) - Find min in
O(1) - Extract min, increase key, insert, delete in
O(log n) - Can implement as a list where a node at index
ihas children at2i+1and2i+2
- Heapify in
- Collection of nodes and edges
- Spanning tree: a tree that includes all nodes in the graph
- Minimum spanning tree: a spanning tree with minimum total edge weights
- Given a sorted list, start at the midpoint and divide and conquer
O(log n)
- Maintain a sorted sublist and insert new elements in it appropriately
- Sorts in-place; stable
- Best-case
O(n), averageO(n^2), worstO(n^2)
- On each pass through the array, compare adjacent pairs of elements and swap if necessary
- Sorts in-place; stable
- Best-case
O(n), averageO(n^2), worstO(n^2)
- Exchange current element with smallest element to the right of the current element
- Sorts in-place; unstable
- Best-case
O(n^2), averageO(n^2), worstO(n^2)
- Recursively divide until sublists are size 1, then recursively merge the sublists
- Requires
O(n)space; stable - Best-case
O(n log n), averageO(n log n), worstO(n log n)
- Set a pivot in the array; move elements smaller than pivot to its left and elements larger to the right
- Recursively sort left and right sublists
- Requires
O(log n)space; stable - Best-case
O(n log n), averageO(n log n), worstO(n^2)
- For lists whose elements' values are in a set range
- Not a comparison sort so best & average is
O(n+k)and worst isO(n^2) - Iterate through list and place items in buckets; can be stable
- Apply a stable counting sort to every place value in a number
- Sort places from least to most significant
- Requires
O(n+k)space;O(d(n+k))time - Also not a comparison sort
- Given a graph, find a path from a start node to an end node
- General strategy: expand a node, check to see if it's the goal node, add its children to the search agenda
- In the case of weighted graphs, a heuristic may help find the shortest path faster
- Admissible: heuristic's value for a node is less than actual distance from node to goal (
H(n,G) ≤ dist(n,G)for all nodesn) - Consistent: heuristic follows triangle inequality (
|H(A)-H(B)| ≤ dist(A,B)for all nodesA,B)
- Admissible: heuristic's value for a node is less than actual distance from node to goal (
- Implement with a stack (add new paths to the front of the agenda)
- Implement with a queue (add new paths to the end of the agenda)
- In an unweighted graph, guaranteed to find shortest path
- Add new paths to the front of the agenda
- Sort new paths by terminal node's heuristic
- Add new paths to the front of the agenda
- Sort all paths in agenda by terminal node's heuristic
- Add new paths to the front of the agenda
- Sort agenda by path length so far
- Can also add a heuristic or extended set (or both)
- Branch and bound with heuristic and extended set
- Heuristic must be consistent
- Find shortest path between two nodes (branch and bound with extended set and without heuristic)
- Can't handle negative edge weights
- Using a Fibonacci heap, runtime is
O(|E|+|V|log|V|)
- Generate a minimum spanning tree of a graph
- Using a heap, runtime is
O(|E|+|V|log|V|)
- Compute shortest paths from a single source to all other nodes in the graph
- Can handle negative edge weights & detect negative-weight cycles
- Worst-case runtime is
O(|V||E|)
- Dynamic programming all-pairs shortest paths algorithm
dp(i,j,k+1)=min(dp(i,j,k),dp(i,k+1,k)+dp(k+1,j,k))
- Static/dynamic checking
- Strongly/weakly typed
- Compiled/interpreted
- Shallow/deep copying
- Immutable/mutable
- Greedy algorithms
- Defensive copying
- Pseudo-polynomial runtime
- Look here for formal definitions
- O - asymptotic upper bound
- o - asymptotic upper bound, excluding same rate
- Ω - asymptotic lower bound
- ω - asymptotic lower bound, excluding same rate
- Θ - same asymptotic growth
- Exponential > polynomial > logarithmic > constant
- Can ask for worst, best, or average case
Inspiration from here
- Abstract data type: access data only through a public interface; implementation can be changed without altering usage
- Class: basic concept in OOP, bundles data type information with actions
- Object: runtime value which belongs to a class
- Encapsulation: information hiding to ensure data integrity
- Hierarchy: classes can have super- and subclasses
- Inheritance: a subclass inherits data and methods from its parent classes
- Overriding: a subclass inherits parent methods but may override them
- Polymorphism: different classes in a program can respond to the same message in different ways; useful when an object's class can't be determined at compile time
- TODO
- Model-view-controller: model stores data, controller updates model, view generates user interface
- Factory method: use a factory object to create other objects rather than using a constructor
- Singleton: restrict instantiation of a class to a single object
- Observer: subjects notify observers of any state changes (usually by calling their methods); used in MVC
- Lots more
- A general method for solving a problem with optimal substructure by breaking it down into overlapping subproblems
- Top-down: memoize (store) solutions to subproblems and solve problem recursively
- Bottom-up: build up subproblems from base case up and avoid recursive overhead
- Order subproblems by topologically sorting DAG of dependencies
- Knapsack problem, longest common subsequence, coin change, edit distance, minimum number of jumps, longest palindrome substring, balanced partition
- GET: used to retrieve data, no other effect on the data
- POST: used to send data to the server (e.g. form)
- PUT: replaces current representation of resource (idempotent)
- DELETE: remove current representation resource
- 200 OK: success
- 400 Bad Request: syntax could not be understood
- 401 Unauthorized: request not fulfilled due to lack of authorization
- 403 Forbidden: request understood but not fulfilled, authorization will not help
- 404 Not Found: URI could not be matched
- 408 Request Timeout: server did not receive a timely response from client
- 500 Internal Server Error: server exception
- 503 Service Unavailable: server unable to handle the request (temporary)
- 504 Gateway Timeout: server did not receive a timely response from upstream server
- Master theorem: is most work performed in the root node, in the leaves, or evenly distributed in the rows of the recursion tree?
- TODO
init: creates/initializes.gitfolder in current directoryclone: clone repo into new directorypull: fetch from another repo and integrategit pullis same asgit fetchthengit merge FETCH_HEAD
add: add files to index of contents for next commitrm: remove files from working tree and indexcommit: record changes to the repo, along with a commit messagerebase: transplant changes on one branch to anotherbranch: list, create, or delete branchescheckout: switch branchesstatus: show the working tree's statusdiff: show changes between commits or the working treelog: show commit logs in a reporemote: manage tracked remote reposreset: reset current HEAD to a different state
n(n-1)/2: number of handshakes in a groupn-1: number of rounds in a knockout tournament2^k: number of binary strings of lengthkn!/(n-r)!: permutations ofnitems takenkat a timen!/(r!(n-r)!): combinations ofnitems takenkat a time
Lots of these taken from this blog.
-
Fibonacci sequence: print the
nth Fibonacci number- Optimally, do this recursively and cache the subproblem solutions
-
Array pair sums: given an array, output pairs which sum to a number
k- Can do in
O(n)with a set data structure. For each element in the array, check to see ifk-a[i]is in the set, then add the element to a set.
- Can do in
-
Reverse a linked list: reverse a singly linked list
- Track previous and current nodes; iterate through list and swap the direction of pointers. Time is
O(n)and space isO(1).
- Track previous and current nodes; iterate through list and swap the direction of pointers. Time is
-
Matrix region sum: given multiple rectangular regions in a matrix, compute the sum of numbers in that region
- Memoize sums of regions with the constraint that corners are at
m[0][0]
- Memoize sums of regions with the constraint that corners are at
-
Word permutation: find all permutations of a word
```python def permute(word): if len(word) == 1: return {word} else: result = set() permutations = permute(word[:-1]) letter = word[-1] for p in permutations: result.update([p[0:i]+letter+p[i:] for i in range(0,len(word)+1)]) return result ``` -
Median of number stream: given a continuous stream of numbers, find the median of numbers so far at any time
- Optimally, keep a max-heap of the smaller half of the numbers and a min-heap of the larger half of the numbers
-
Infinite array search: given a sorted, infinite-length array, find a given value
- Modify binary search to start at the array's first element and exponentially increase the index you search at. Time is
O(log n)
- Modify binary search to start at the array's first element and exponentially increase the index you search at. Time is
-
Anagram pair: determine if two words are anagrams
- Comparison sort: sort the words in alphabetical order and check for equality.
O(n log n), wherenis word length. - Count letters: use a hash table to track counts of letters in both words.
O(n)runtime.
- Comparison sort: sort the words in alphabetical order and check for equality.
-
Anagram dictionary: determine which words in a list are anagrams of a given word
- Check for the membership of every permutation of the input word in the dictionary
-
Anagram list: determine which sets of words in a dictionary are anagrams
- Abstractly, hash each word and group by word. A hash can be a 26-digit string, or you can sort each word.
-
Binary search tree verification: verify whether a tree satisfies the binary search tree property
- For each node, track its possible minimum and maximum values
- Performing an inorder traversal should produce a sorted list
-
Largest continuous sum: in an array of integers, determine the subsequence with the largest sum
- Track maximum sum encountered so far and check whether current sum is greater. Reset current sum when it becomes negative. Time is
O(n)and space isO(1).
- Track maximum sum encountered so far and check whether current sum is greater. Reset current sum when it becomes negative. Time is
-
-1/0/1 array: given an array where values are -1, 0, or 1, sort the array
- Bucket sort (but this takes
O(n)space) - Iterate through the list and track pointers for min and max index. If a value is -1, swap it with the element at the min index and increment min index. If a value is 1, swap it with the element at max index and decrement max index. Time is
O(n)and space isO(1).
- Bucket sort (but this takes
-
k-th largest element: find the
kth largest element in an unsorted array- Modify quicksort to recursively sort on pivots in left/right subarrays (average
O(n), worst-caseO(n^2)) - Median of medians algorithm
- Modify quicksort to recursively sort on pivots in left/right subarrays (average
-
Find missing number: given an array where every number except one appears an even number of times, find the number that appears an odd number of times
- Optimally, bitwise XOR by numbers in the list (XORing an even number of times resets the number to its original value). Time is
O(n)and space isO(1)
- Optimally, bitwise XOR by numbers in the list (XORing an even number of times resets the number to its original value). Time is
-
Knapsack: given a set of items each with weights and values, maximize value while keeping total weight under a limit
- Dynamic programming: say weight limit is
W. Create an arraym[w]where each element is the maximum value of items with a weight limitw≤W. Optimize by dividing item weights and weight limit by their greatest common divisor. RuntimeO(nW).
- Dynamic programming: say weight limit is
-
Balanced partition: given a set of numbers, partition them so that the sums of the partitions are as close as possible
- Greedy method: iterate through sorted list and add items to the smaller-sum partition
- Dynamic programming: determine if a subset of the input sums to
n/2(wherenis the sum of the input numbers)
s.center(w,[fillchar]): returns centered string in string of widthws.count(sub[,start[,end]]): returns count of non-overlapping occurences of substringsub in s: returnsTrueifsubis inss.find(sub[,start[,end]]): returns start index of substring or-1s.join(iter): join items in iterable, separated byss.strip([chars]): removing leading and trailing characterss.replace(old,new[,count]): returns copy ofswitholdreplaced bynews.isalpha(): returnsTrueif all characters insare alphabetics.isdigit(): returnsTrueif all characters insare digits
l=[]: initializelen(l): get sizel.append(val): append a valuel.insert(i,val): insert a value at positionl.extend(lst): append all values in a listl.pop([i]): remove an item and return it (defaults to last item)x in l: check membershipl.sort(cmp=None,key=None,reverse=False): sort in placesorted(iterable[, cmp[, key[, reverse]]]): return a new stably sorted listl.reverse(): reverse a list in placerange(start,end): get a list with items fromstart(inclusive) toend(exclusive)[<expr> for <var> in <list> if <condition>]: list comprehensionlistname[start:end:slice_size]: slicing
set()or{l}: initializelen(s): get cardinalityx in s: check memberships.update(other): add values ofothertoss | other | ...: return a union of setss & other & ...: return an intersection of setss - other - ...: return difference of setss ^ other: return set of elements uniquely in sets
d={}: initialized[key]ord.get(key): get the value atkey(the latter returnsNoneif not found)len(d): get item countkey in d: check membershipd.pop(key): remove and return a value in the dictionarydel d[key]: delete an itemd.update(other): update/overwrite with keys & values fromotherd.items(): return a list of(key,value)tuplesd.keys(): return a list of dicionary keysd.values(): return a list of dictionary values
- Binary numbers: preface with
0b; usebin(int)to convert- Left and right shift:
<<and>> - Bitwise AND, OR, XOR, NOT:
&,|,^,~ - Bitmasks
- Left and right shift:
f = open('filename', 'w'): open a filef.close(): close filef.readline(): read a line from the filefor line in f: iterate through lines in filef.write(): write a string to the file
x << y: left shift byybitsx >> y: right shift byybitsx & y: bitwise ANDx | y: bitwise ORx ^ y: bitwise XOR~x: complement ofx
__init__(self,[...]): initializer for a class__cmp__(self,other): return negative for<, 0 for==, positive for>__eq__(self,other): define behavior for==- Also
ne,lt,le,gt,ge
- Also
__str__(self): return string representation__repr__(self): return machine-readable representation__hash__(self): return an integer such thata==bimplieshash(a)==hash(b)
copycopy.copy(x): return shallow copy ofxcopy.deepcopy(x): return deep copy ofx
collections(usecollections.deque)dq.pop(),dq.popleft(),dq.appendleft(val),dq.extendleft(lst),dq.rotate(n)
heapqheapq.push(heap,item): add an itemheapq.pop(heap): pop an itemheapq.heapify(l): make a list into a heap in linear time
BeautifulSoupscipynumpyscikit-learnnltkrequestsunirestpdbpdb.set_trace()sets a breakpoint at the current line and gives the user a CLI with which to inspect various objects and their values at runtime. Also allows you to continue code execution line by line. Usehelpto see a list of the commands usable in the CLI.
pprint: Pretty Printpprint.pprint(iter): Print out a version ofiterwith JSON-like formatting. Useful for inspecting large, deeply nested objects.
zip(seq1 [,seq2 [...]]): return list of tuples where each tuple contains the i-th element from each sequence. Truncated to length of shortest sequence ([(seq1[0], seq2[0] ...), (...)])map(f, seq): return list of the results offapplied to each element ofseq([f(seq[0]), f(seq[1]), ...])filter(f, seq): return list of items inseqfor whichf(seq[i]) == Truereduce(f, seq): applyfto pairs of elements insequntil iterable is a single value
- Infinity:
float("inf") - Simultaneous assignment:
a,b = b,ato swap lambda x: <body>: lambda function; don't need return statement- Tuples are immutable lists
zip()- Four numeric types:
int,long,float,complex - Falsey values:
NoneFalse- Zero of any numeric type
- Empty sequences & mappings
- When
__nonzero__()returnsFalse - When
__len__()returns zero for a user-defined class