Commit e71d617

Update readme with new benchmark
1 parent 6209002 commit e71d617

1 file changed: +84 -58 lines changed

README.md

Lines changed: 84 additions & 58 deletions
@@ -1,97 +1,123 @@
1-
selectlib
2-
=========
1+
# selectlib
32

4-
selectlib is a lightweight C extension module for Python that implements several inplace selection algorithms for efficiently finding the kth smallest element in an unsorted list. The module provides three main functions—nth_element, quickselect, and heapselect—that allow you to partition a list so that the element at a given index is in its final sorted position, without performing a full sort.
3+
selectlib is a lightweight C extension module for Python that implements several inplace selection algorithms for efficiently finding the kth smallest element in an unsorted list. The module provides three main functions—`nth_element`, `quickselect`, and `heapselect`—that allow you to partition a list so that the element at a given index is in its final sorted position, without performing a full sort.
54

5+
You can install selectlib using pip:
6+
7+
```bash
68
python -m pip install selectlib
9+
```
710

8-
Features
9-
--------
11+
## Features
1012

11-
• In‐place partitioning using three different strategies:
12-
- nth_element: An adaptive selection function that chooses the optimal strategy based on the target index. For small indices, it uses a heapselect method; otherwise, it starts with quickselect and falls back to heapselect if necessary.
13-
- quickselect: A classic partition-based selection algorithm that uses random pivots to position the kth smallest element in its correct sorted order. If the operation exceeds an iteration limit, it automatically falls back to heapselect.
14-
- heapselect: A heap-based approach that builds a fixed-size max-heap to efficiently locate the kth smallest element.
15-
• Performance as a feature! See below for benchmark
16-
• A benchmark script that runs multiple tests for varying list sizes and selection percentages, then produces visual output as grouped bar charts.
17-
• Compatible with Python 3.8 and later.
13+
- **In‑place partitioning using three different strategies:**
14+
- **`nth_element`:** An adaptive selection function that chooses the optimal strategy based on the target index. For small indices, it uses the heapselect method; otherwise, it starts with quickselect and falls back to heapselect if necessary.
15+
- **`quickselect`:** A classic partition‑based selection algorithm that uses random pivots to position the kth smallest element in its correct sorted order. If the operation exceeds an iteration limit, it automatically falls back to heapselect.
16+
- **`heapselect`:** A heap‑based approach that builds a fixed‑size max‑heap to efficiently locate the kth smallest element.
17+
- **Performance as a feature!**
18+
Selectlib comes with benchmark scripts that run multiple tests for varying list sizes and selection percentages, then produce visual output as grouped bar charts.
19+
- **Median Benchmarking:**
20+
In addition to the benchmark for selecting the k‑smallest elements, selectlib provides a dedicated median benchmark script (`benchmark_median.py`) that compares Python’s built‑in `statistics.median_low` with selectlib’s `nth_element`, `quickselect`, and `heapselect` methods for computing the median of a list. This benchmark runs the tests for list sizes ranging from 1,000 to 1,000,000 elements and displays the median computation performance in a grouped bar chart.
1821

19-
Usage Example
20-
-------------
22+
## Usage Example
2123

2224
Below is an example demonstrating how to use each of the three selection algorithms to find the kth smallest element in a list:
2325

24-
import selectlib
26+
```python
27+
import selectlib
2528

26-
data = [9, 3, 7, 1, 5, 8, 2]
27-
k = 3 # We wish to position the element at index 3, as in a sorted list
29+
data = [9, 3, 7, 1, 5, 8, 2]
30+
k = 3 # We wish to position the element at index 3, as in a sorted list
2831

29-
# Using nth_element:
30-
selectlib.nth_element(data, k)
31-
print("After nth_element, kth smallest element is:", data[k])
32+
# Using nth_element:
33+
selectlib.nth_element(data, k)
34+
print("After nth_element, kth smallest element is:", data[k])
3235

33-
# Reset the list for a fresh example:
34-
data = [9, 3, 7, 1, 5, 8, 2]
36+
# Reset the list for a fresh example:
37+
data = [9, 3, 7, 1, 5, 8, 2]
3538

36-
# Using quickselect:
37-
selectlib.quickselect(data, k)
38-
print("After quickselect, kth smallest element is:", data[k])
39+
# Using quickselect:
40+
selectlib.quickselect(data, k)
41+
print("After quickselect, kth smallest element is:", data[k])
3942

40-
# Reset the list:
41-
data = [9, 3, 7, 1, 5, 8, 2]
43+
# Reset the list:
44+
data = [9, 3, 7, 1, 5, 8, 2]
4245

43-
# Using heapselect:
44-
selectlib.heapselect(data, k)
45-
print("After heapselect, kth smallest element is:", data[k])
46+
# Using heapselect:
47+
selectlib.heapselect(data, k)
48+
print("After heapselect, kth smallest element is:", data[k])
49+
```
4650

47-
You can also provide an optional key function to selectlib’s functions to customize comparisons. For example, if you wish to determine the kth largest element rather than the kth smallest, simply negate the value in a lambda function:
51+
You can also provide an optional key function to customize comparisons. For example, if you wish to determine the kth largest element rather than the kth smallest, simply negate the value in a lambda function:
4852

49-
data = [15, 8, 22, 5, 13]
50-
k = 2
51-
selectlib.quickselect(data, k, key=lambda x: -x)
52-
print("The kth largest element is:", data[k])
53+
```python
54+
data = [15, 8, 22, 5, 13]
55+
k = 2
56+
selectlib.quickselect(data, k, key=lambda x: -x)
57+
print("The kth largest element is:", data[k])
58+
```
5359

54-
Benchmarking
55-
------------
60+
## Median Benchmarking
5661

57-
To help you understand the performance of each algorithm, selectlib comes with a comprehensive benchmark script (benchmark.py). The benchmark compares the following five methods:
62+
In addition to the k‑smallest elements benchmark, selectlib provides a median benchmark script named `benchmark_median.py`. This script compares the performance of the following methods for computing the median (using the low median for even‑length lists):
5863

59-
1. sort – Creates a sorted copy of the list and slices the first k elements.
60-
2. heapq.nsmallest – Uses Python’s standard library heap algorithm.
61-
3. quickselect – Partitions using selectlib.quickselect, then slices and sorts the first k elements.
62-
4. heapselect – Partitions using selectlib.heapselect, then slices sorts the first k elements.
63-
5. nth_element – Partitions using selectlib.nth_element, then slices and sorts the first k elements.
64+
1. **`median_low`** – Uses Python’s built‑in `statistics.median_low`.
65+
2. **`nth_element`** – Uses `selectlib.nth_element` to partition the list so that the median element is in place.
66+
3. **`quickselect`** – Uses `selectlib.quickselect` for median selection.
67+
4. **`heapselect`** – Uses `selectlib.heapselect` for median selection.
6468

65-
For each list size (ranging from 1,000 to 1,000,000 elements) and for several values of k (0.2%, 1%, 10%, and 25% of N), each method is executed five times, and the median runtime is recorded. The benchmark results are then visualized as grouped bar charts. You can view an example plot below:
69+
For each list size (from 1,000 to 1,000,000 elements), the script runs 5 iterations and records the median runtime. The performance results are then plotted as a grouped bar chart, with each group corresponding to a different list size.
6670

67-
![Benchmark Results](https://github.com/grantjenks/python-selectlib/blob/main/plot.png?raw=true)
71+
![Median Benchmark Results](https://github.com/grantjenks/python-selectlib/blob/main/plot_median.png?raw=true)
72+
73+
To run the median benchmark, execute:
74+
75+
```bash
76+
python benchmark_median.py
77+
```
78+
79+
## K-Smallest Benchmarking
6880

69-
To run the benchmarks, simply execute:
81+
Selectlib comes with a benchmark script named `benchmark.py` that compares the following five methods to obtain the K smallest items from a list:
82+
83+
1. **`sort`** – Creates a sorted copy of the list and slices the first k elements.
84+
2. **`heapq.nsmallest`** – Uses Python’s standard library heap algorithm.
85+
3. **`quickselect`** – Partitions using `selectlib.quickselect`, then slices and sorts the first k elements.
86+
4. **`heapselect`** – Partitions using `selectlib.heapselect`, then slices and sorts the first k elements.
87+
5. **`nth_element`** – Partitions using `selectlib.nth_element`, then slices and sorts the first k elements.
88+
89+
For each list size (ranging from 1,000 to 1,000,000 elements) and for several values of k (0.2%, 1%, 10%, and 25% of N), each method is executed five times, and the median runtime is recorded. The benchmark results are then visualized as grouped bar charts.
90+
91+
![Benchmark Results](https://github.com/grantjenks/python-selectlib/blob/main/plot.png?raw=true)
7092

71-
python benchmark.py
93+
To run the benchmark, execute:
7294

73-
This will generate the plot and display performance comparisons across the five methods.
95+
```bash
96+
python benchmark.py
97+
```
7498

75-
Development & Continuous Integration
76-
--------------------------------------
99+
## Development & Continuous Integration
77100

78-
Before installing locally, make sure you have a C compiler and the Python development headers installed for your platform.
101+
Before installing locally, ensure you have a C compiler and the Python development headers installed for your platform.
79102

80-
1. Clone the repository:
103+
1. **Clone the repository:**
81104

105+
```bash
82106
git clone https://github.com/grantjenks/python-selectlib.git
83107
cd python-selectlib
108+
```
84109

85-
2. Build and install in editable mode:
110+
2. **Build and install in editable mode:**
86111

112+
```bash
87113
python -m pip install -e .
114+
```
88115

89116
This project uses GitHub Actions for CI/CD. The available workflows cover:
90117

91-
release.yml – Builds wheels for multiple platforms and publishes packages to PyPI.
92-
test.yml – Runs automated tests and linting on multiple Python versions.
118+
- **release.yml** – Builds wheels for multiple platforms and publishes packages to PyPI.
119+
- **test.yml** – Runs automated tests and linting on multiple Python versions.
93120

94-
License
95-
-------
121+
## License
96122

97-
selectlib is licensed under the Apache License, Version 2.0. See the LICENSE file for full details.
123+
selectlib is licensed under the Apache License, Version 2.0. See the [LICENSE](LICENSE) file for full details.
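
The README in this diff describes an in-place contract: after a call, the element at index `k` is the one a full sort would put there, and an optional `key` function customizes comparisons. As a rough illustration of that contract only, here is a pure-Python sketch of a random-pivot quickselect. The name `quickselect_sketch` is hypothetical, and this is not selectlib's C implementation; in particular it omits the iteration limit and the heapselect fallback the README mentions.

```python
import random


def quickselect_sketch(values, k, key=None):
    """Rearrange `values` in place so values[k] is where a full sort would put it.

    Illustrative stand-in only; not selectlib's actual implementation.
    """
    if not 0 <= k < len(values):
        raise IndexError("k out of range")
    if key is None:
        key = lambda item: item

    lo, hi = 0, len(values) - 1
    while lo < hi:
        # Pick a random pivot and park it at the end of the active window.
        pivot_index = random.randint(lo, hi)
        values[pivot_index], values[hi] = values[hi], values[pivot_index]
        pivot_key = key(values[hi])

        # Lomuto partition: items with strictly smaller keys move left.
        store = lo
        for i in range(lo, hi):
            if key(values[i]) < pivot_key:
                values[i], values[store] = values[store], values[i]
                store += 1
        # Put the pivot into its final sorted position.
        values[store], values[hi] = values[hi], values[store]

        if store == k:
            return
        if store < k:
            lo = store + 1   # target lies to the right of the pivot
        else:
            hi = store - 1   # target lies to the left of the pivot


data = [9, 3, 7, 1, 5, 8, 2]
quickselect_sketch(data, 3)
print(data[3])  # 5, the same value as sorted(data)[3]

scores = [15, 8, 22, 5, 13]
quickselect_sketch(scores, 2, key=lambda x: -x)
print(scores[2])  # 13, the third largest value
```

Only the element at index `k` is guaranteed to be in its sorted position. Items before it compare no greater and items after it compare no smaller, but neither side is itself sorted, which is why the k-smallest benchmark described in the README sorts the first k elements after partitioning.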