Source code for Fast Python (2020) by Chris Conlan
Paperback available for purchase on Amazon.
The following code profiles can be run as stand-alone scripts, though some of them rely on the explanations in the accompanying book for full context.
- Binary search: binary_search.py
- Dictionary construction: build_dict.py
- Concatenating strings, string construction: concatenate_strings.py
- Counting the frequency of a value: count_occurrences.py
- Computing a cumulative sum: cumulative_sum.py (see the sketch after this list)
- The `in` operator and early stopping: early_stopping.py
- Time series filters/convolutions: filters.py
- Finding the largest k values in a list: find_top_k.py
- List construction/declaration/flattening: flatten_lists.py
- Counting lines in a file: line_count.py
- Set intersection, finding matches in a list: match_within.py
- Matrix multiplication: matrix_multiplication.py
- Computing moving averages: moving_averages.py
- Counting frequency of a word in text: occurrences_of.py
- Looping through pd.DataFrame objects: pandas_loops.py
- Sorting algorithms: sorting.py
- Low-level sorting algorithms: sorting_v2.py
- Adding a list of numbers: sum.py
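Each profile times several implementations of the same operation, from a pure-Python loop up to vectorized or compiled alternatives. As an example, the cumulative sum profile compares functions like the slow_cusum and np_fast_cusum shown in the output further down. The real implementations live in cumulative_sum.py; the pair below is only a minimal sketch of the kind of comparison being made.

```python
import numpy as np

def slow_cusum(values):
    # Pure-Python version: build the running total one element at a time.
    out = []
    total = 0
    for v in values:
        total += v
        out.append(total)
    return out

def np_fast_cusum(values):
    # Vectorized version: NumPy computes the running total in compiled code.
    return np.cumsum(values)
```

On large inputs the vectorized version is typically orders of magnitude faster, which is the gap the profiler output below quantifies.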
Running them is simple ...
```bash
cd fast-python/src
python cumulative_sum.py
```
All the profiles use a simple profiling module in src/utils/profiler.py. It produces tables and charts like the following.
```
np_fast_cusum
    n   = 56234132 values
    t   = 201.806 ms
    n/t = 278653.8114 values per ms
np_fast_cusum
    n   = 100000000 values
    t   = 350.611 ms
    n/t = 285216.7553 values per ms
...
                function   n_values  t_milliseconds  values_per_ms
0             slow_cusum          1           0.012        85.0196
1             slow_cusum          3           0.005       640.7530
...
14            slow_cusum       5623        1298.218         4.3313
15            slow_cusum      10000        4140.327         2.4153
...
30   slow_cusum_expanded       5623        1878.419         2.9935
31   slow_cusum_expanded      10000        5767.316         1.7339
...
62     python_fast_cusum   56234132        5727.162      9818.8478
63     python_fast_cusum  100000000       10939.993      9140.7733
...
94     pandas_fast_cusum   56234132         442.652    127039.2437
95     pandas_fast_cusum  100000000         780.461    128129.3962
...
126     numba_fast_cusum   56234132         139.602    402816.3295
127     numba_fast_cusum  100000000         236.445    422930.9936
...
158        np_fast_cusum   56234132         201.806    278653.8114
159        np_fast_cusum  100000000         350.611    285216.7553
```
I use the profiler frequently in my own work. It makes it easy to analyze the relationship between computational complexity and raw execution time.
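The real module in src/utils/profiler.py also renders charts and the summary table above. The snippet below is only a rough sketch, assuming a simple decorator-style interface, of how per-call numbers like n, t, and n/t can be measured; it is not the actual profiler API.

```python
import time
import numpy as np

def profile(func):
    # Hypothetical stand-in for the repo's profiler: time one call and
    # report input size, elapsed milliseconds, and values processed per ms.
    def wrapper(values):
        start = time.perf_counter()
        result = func(values)
        elapsed_ms = (time.perf_counter() - start) * 1000
        n = len(values)
        print(func.__name__)
        print(f"    n   = {n} values")
        print(f"    t   = {elapsed_ms:.3f} ms")
        print(f"    n/t = {n / elapsed_ms:.4f} values per ms")
        return result
    return wrapper

@profile
def np_fast_cusum(values):
    return np.cumsum(values)

np_fast_cusum(np.random.rand(10_000_000))
```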
I have included a dependencies.txt, but you should be fine with a blank Python 3 environment followed by ...
```bash
pip install numpy pandas numba joblib matplotlib pillow
```


