Commit
* [Sieve]: Draft approaches
* fixes various typos and random gibberish
* Update introduction.md
* Update exercises/practice/sieve/.approaches/comprehensions/content.md Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
* Update exercises/practice/sieve/.approaches/comprehensions/content.md Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
* Update exercises/practice/sieve/.approaches/comprehensions/content.md Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
* Update exercises/practice/sieve/.approaches/comprehensions/content.md Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
* Update exercises/practice/sieve/.approaches/nested-loops/content.md Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
* Update exercises/practice/sieve/.approaches/comprehensions/content.md Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
* Update exercises/practice/sieve/.approaches/comprehensions/snippet.txt Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
* Update exercises/practice/sieve/.approaches/introduction.md Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
* Update exercises/practice/sieve/.approaches/nested-loops/content.md Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
* Update exercises/practice/sieve/.approaches/nested-loops/snippet.txt Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
* Update exercises/practice/sieve/.approaches/comprehensions/content.md Does this add a spurious extra space after the link? Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
* Removed graph from content.md To save us forgetting it later.
* Delete timeit_bar_plot.svg I didn't intend to commit this in the first place.
* removed space from content.md
* Update exercises/practice/sieve/.approaches/nested-loops/content.md
* Update exercises/practice/sieve/.approaches/nested-loops/content.md
* Update exercises/practice/sieve/.approaches/introduction.md
* Update exercises/practice/sieve/.approaches/introduction.md
* Update exercises/practice/sieve/.approaches/introduction.md
* Code Block Corrections Somehow, the closing of the codeblocks got dropped. Added them back in, along with final typo corrections.

--------- Co-authored-by: BethanyG <BethanyG@users.noreply.github.com>
1 parent 5dd5af1 commit 7e3a633
Showing 17 changed files with 2,173 additions and 0 deletions.
36 changes: 36 additions & 0 deletions
exercises/practice/sieve/.approaches/comprehensions/content.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
# Comprehensions | ||
|
||
```python | ||
def primes(number): | ||
    prime = (item for item in range(2, number+1) | ||
             if item not in (not_prime for item in range(2, number+1) | ||
                             for not_prime in range(item*item, number+1, item))) | ||
    return list(prime) | ||
``` | ||
|
||
Many of the solutions to Sieve use `comprehensions` or `generator-expressions` at some point, but this page is about examples that put almost *everything* into a single, elaborate `generator-expression` or `comprehension`. | ||
|
||
The above example uses a `generator-expression` to do all the calculation. | ||
|
||
There are at least two problems with this: | ||
- Readability is poor. | ||
- Performance is exceptionally bad, making this the slowest solution tested, for all input sizes. | ||
|
||
Notice the many `for` clauses in the generator. | ||
|
||
This makes the code similar to [nested loops][nested-loops], and run time scales quadratically with the size of `number`. | ||
In fact, when this code is compiled, it _compiles to nested loops_ that have the additional overhead of generator setup and tracking. | ||
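
To make the hidden nesting concrete, the generator expression can be unrolled by hand into roughly equivalent explicit loops. This is an illustrative sketch rather than the bytecode Python actually produces; the names `primes_expanded`, `base` and `is_composite` are introduced here only for the example.

```python
def primes_expanded(number):
    """Hand-unrolled equivalent of the single generator expression.

    Hypothetical helper for illustration; not one of the benchmarked solutions.
    """
    result = []
    for item in range(2, number + 1):
        is_composite = False
        # The inner generator is rebuilt and scanned from its start for every item.
        for base in range(2, number + 1):
            for not_prime in range(base * base, number + 1, base):
                if not_prime == item:
                    is_composite = True
                    break
            if is_composite:
                break
        if not is_composite:
            result.append(item)
    return result
```

Written this way, three levels of looping are visible, and the two inner loops restart from scratch for every `item`.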
|
||
```python | ||
def primes(limit): | ||
    return [number for number in range(2, limit + 1) | ||
            if all(number % divisor != 0 for divisor in range(2, number))] | ||
``` | ||
|
||
This second example using a `list-comprehension` with `all()` is certainly concise and _relatively_ readable, but the performance is again quite poor. | ||
|
||
This is not quite a fully nested loop (_there is a short-circuit when `all()` evaluates to `False`_), but it is by no means "performant". | ||
In this case, scaling with input size is intermediate between linear and quadratic, so not quite as bad as the first example. | ||
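
The effect of the short-circuit can be observed by counting how many divisors `all()` actually consumes for a single candidate. The helper below is a small sketch under that assumption; `divisors_tested` is a hypothetical name used only for this illustration.

```python
def divisors_tested(number):
    """Return (is_prime, how many divisors all() consumed) for one candidate.

    Hypothetical helper for illustration only.
    """
    tested = 0

    def checks():
        nonlocal tested
        for divisor in range(2, number):
            tested += 1
            yield number % divisor != 0

    # all() stops pulling values as soon as one check is False.
    is_prime = all(checks())
    return is_prime, tested
```

A composite such as 98 is rejected after a single test, while a prime such as 97 forces every divisor from 2 to 96 to be tried, which is why the cost sits between linear and quadratic.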
|
||
|
||
[nested-loops]: https://exercism.org/tracks/python/exercises/sieve/approaches/nested-loops |
3 changes: 3 additions & 0 deletions
exercises/practice/sieve/.approaches/comprehensions/snippet.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
def primes(limit): | ||
    return [number for number in range(2, limit + 1) if | ||
            all(number % divisor != 0 for divisor in range(2, number))] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
{ | ||
"introduction": { | ||
"authors": [ | ||
"colinleach", | ||
"BethanyG" | ||
] | ||
}, | ||
"approaches": [ | ||
{ | ||
"uuid": "85752386-a3e0-4ba5-aca7-22f5909c8cb1", | ||
"slug": "nested-loops", | ||
"title": "Nested Loops", | ||
"blurb": "Relatively clear solutions with explicit loops.", | ||
"authors": [ | ||
"colinleach", | ||
"BethanyG" | ||
] | ||
}, | ||
{ | ||
"uuid": "04701848-31bf-4799-8093-5d3542372a2d", | ||
"slug": "set-operations", | ||
"title": "Set Operations", | ||
"blurb": "Performance enhancements with Python sets.", | ||
"authors": [ | ||
"colinleach", | ||
"BethanyG" | ||
] | ||
}, | ||
{ | ||
"uuid": "183c47e3-79b4-4afb-8dc4-0deaf094ce5b", | ||
"slug": "comprehensions", | ||
"title": "Comprehensions", | ||
"blurb": "Ultra-concise code and its downsides.", | ||
"authors": [ | ||
"colinleach", | ||
"BethanyG" | ||
] | ||
} | ||
] | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
# Introduction | ||
|
||
The key to this exercise is to keep track of: | ||
- A list of numbers. | ||
- Their status of possibly being prime. | ||
|
||
|
||
## General Guidance | ||
|
||
To solve this exercise, it is necessary to choose one or more appropriate data structures to store numbers and status, then decide the best way to scan through them. | ||
|
||
There are many ways to implement the code, and the three broad approaches listed below are not sharply separated. | ||
|
||
|
||
## Approach: Using nested loops | ||
|
||
```python | ||
def primes(number): | ||
    not_prime = [] | ||
    prime = [] | ||
|
||
    for item in range(2, number+1): | ||
        if item not in not_prime: | ||
            prime.append(item) | ||
            for element in range(item*item, number+1, item): | ||
                not_prime.append(element) | ||
|
||
    return prime | ||
``` | ||
|
||
The theme here is nested, explicit `for` loops to move through ranges, testing validity as we go. | ||
|
||
For details and another example see [`nested-loops`][approaches-nested]. | ||
|
||
|
||
## Approach: Using set operations | ||
|
||
```python | ||
def primes(number): | ||
    not_prime = set() | ||
    primes = [] | ||
|
||
    for num in range(2, number+1): | ||
        if num not in not_prime: | ||
            primes.append(num) | ||
            not_prime.update(range(num*num, number+1, num)) | ||
|
||
    return primes | ||
``` | ||
|
||
In this group, the code uses the special features of the Python [`set`][sets] to improve efficiency. | ||
|
||
For details and other examples see [`set-operations`][approaches-sets]. | ||
|
||
|
||
## Approach: Using complex or nested comprehensions | ||
|
||
|
||
```python | ||
def primes(limit): | ||
    return [number for number in range(2, limit + 1) if | ||
            all(number % divisor != 0 for divisor in range(2, number))] | ||
``` | ||
|
||
Here, the emphasis is on implementing a solution in the minimum number of lines, even at the expense of readability or performance. | ||
|
||
For details and another example see [`comprehensions`][approaches-comps]. | ||
|
||
|
||
## Using packages outside base Python | ||
|
||
|
||
In statically typed languages, common approaches include bit arrays and arrays of booleans. | ||
|
||
Neither of these is a natural fit for core Python, but there are external packages that could perhaps provide a better implementation: | ||
|
||
- For bit arrays, there is the [`bitarray`][bitarray] package and [`bitstring.BitArray()`][bitstring]. | ||
- For arrays of booleans, we could use the NumPy package: `np.ones((number,), dtype=np.bool_)` will create a pre-dimensioned array of `True`. | ||
|
||
It should be stressed that these will not work in the Exercism test runner, and are mentioned here only for completeness. | ||
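
For comparison, the standard library does offer `bytearray`, which can stand in for a compact array of flags without any external package. The sketch below is an assumption-laden illustration: the name `primes_bytearray` is hypothetical, it was not among the solutions discussed here, and no claim is made about how it would rank in the benchmarks.

```python
import math


def primes_bytearray(number):
    """Sieve with a bytearray of flags (1 = possibly prime).

    Hypothetical stdlib-only sketch; not one of the benchmarked solutions.
    """
    if number < 2:
        return []
    flags = bytearray([1]) * (number + 1)
    flags[0] = flags[1] = 0
    for candidate in range(2, math.isqrt(number) + 1):
        if flags[candidate]:
            start = candidate * candidate
            # Extended-slice assignment needs a replacement of exactly the same length.
            count = len(range(start, number + 1, candidate))
            flags[start::candidate] = bytes(count)
    return [index for index, flag in enumerate(flags) if flag]
```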
|
||
## Which Approach to Use? | ||
|
||
|
||
This exercise is for learning, and is not directly relevant to production code. | ||
|
||
The point is to find a solution which is correct, readable, and remains reasonably fast for larger input values. | ||
|
||
The "set operations" example above is clean, readable, and in benchmarking was the fastest code tested. | ||
|
||
Further details of performance testing are given in the [Performance article][article-performance]. | ||
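
For a rough local comparison, the standard-library `timeit` module can time any of the solutions. The helper below is a minimal sketch: `time_call` is a hypothetical name, and the figures it reports depend entirely on the machine and Python version, so they are not a substitute for the benchmarks in the Performance article.

```python
import timeit


def primes(number):
    """The "set operations" example from above, repeated so the block is self-contained."""
    not_prime = set()
    primes = []
    for num in range(2, number + 1):
        if num not in not_prime:
            primes.append(num)
            not_prime.update(range(num * num, number + 1, num))
    return primes


def time_call(func, argument, repeats=5):
    """Best-of-`repeats` wall time, in seconds, for one call of func(argument).

    Hypothetical helper for illustration only.
    """
    return min(timeit.repeat(lambda: func(argument), number=1, repeat=repeats))


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000):
        print(size, time_call(primes, size))
```

Taking the minimum of several runs reduces noise from other processes; comparing timings across a few input sizes gives a first impression of how a solution scales.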
|
||
[approaches-nested]: https://exercism.org/tracks/python/exercises/sieve/approaches/nested-loops | ||
[approaches-sets]: https://exercism.org/tracks/python/exercises/sieve/approaches/set-operations | ||
[approaches-comps]: https://exercism.org/tracks/python/exercises/sieve/approaches/comprehensions | ||
[article-performance]: https://exercism.org/tracks/python/exercises/sieve/articles/performance | ||
[sets]: https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset | ||
[bitarray]: https://pypi.org/project/bitarray/ | ||
[bitstring]: https://bitstring.readthedocs.io/en/latest/ |
49 changes: 49 additions & 0 deletions
exercises/practice/sieve/.approaches/nested-loops/content.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
# Nested Loops | ||
|
||
|
||
```python | ||
def primes(number): | ||
    not_prime = [] | ||
    prime = [] | ||
|
||
    for item in range(2, number+1): | ||
        if item not in not_prime: | ||
            prime.append(item) | ||
            for element in range(item*item, number+1, item): | ||
                not_prime.append(element) | ||
|
||
    return prime | ||
``` | ||
|
||
This is the type of code that many people might write as a first attempt. | ||
|
||
It is very readable and passes the tests. | ||
|
||
The clear disadvantage is that run time is quadratic in the input size: `O(n**2)`, so this approach scales poorly to large input values. | ||
|
||
Part of the problem is the line `if item not in not_prime`, where `not_prime` is a list that may be long and unsorted. | ||
|
||
This operation requires searching the entire list, so run time is linear in list length: not ideal within a loop repeated many times. | ||
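
What `in` has to do for a list can be mimicked with an explicit scan that also reports how many elements it inspected. This is only a sketch of the idea; `linear_search_steps` is a hypothetical name, and the real membership test is implemented in C rather than in Python.

```python
def linear_search_steps(values, target):
    """Mimic `target in values` for a list, returning (found, elements inspected).

    Hypothetical helper for illustration only.
    """
    for steps, value in enumerate(values, start=1):
        if value == target:
            return True, steps
    # A miss means every element had to be compared.
    return False, len(values)
```

Because every prime is, by definition, absent from `not_prime`, each prime candidate pays for a scan of the whole list.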
|
||
```python | ||
def primes(number): | ||
    number += 1 | ||
    prime = [True for item in range(number)] | ||
    for index in range(2, number): | ||
        if not prime[index]: | ||
            continue | ||
        for candidate in range(2 * index, number, index): | ||
            prime[candidate] = False | ||
    return [index for index, value in enumerate(prime) if index > 1 and value] | ||
``` | ||
|
||
|
||
At first sight, this second example looks quite similar to the first. | ||
|
||
However, on testing it performs much better, scaling linearly with `number` rather than quadratically. | ||
|
||
A key difference is that list entries are tested by index: `if not prime[index]`. | ||
|
||
This is a constant-time operation independent of the list length. | ||
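
To see what the flag list holds once sieving is finished, the same logic can be written so that it returns the list of booleans instead of the primes. This is an illustrative variant; `sieve_flags` is a hypothetical name.

```python
def sieve_flags(number):
    """Return the list of flags built by the second example.

    Hypothetical helper for illustration only.
    """
    prime = [True] * (number + 1)
    for index in range(2, number + 1):
        if prime[index]:  # a direct index lookup, no searching
            for candidate in range(2 * index, number + 1, index):
                prime[candidate] = False
    return prime
```

Positions 0 and 1 are never cleared, which is why the original return statement filters on `index > 1`.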
|
||
Relatively few programmers would have predicted such a major difference just by looking at the code, so if performance matters we should always test, not guess. |
8 changes: 8 additions & 0 deletions
exercises/practice/sieve/.approaches/nested-loops/snippet.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
def primes(number): | ||
    number += 1 | ||
    prime = [True for item in range(number)] | ||
    for index in range(2, number): | ||
        if not prime[index]: continue | ||
        for candidate in range(2 * index, number, index): | ||
            prime[candidate] = False | ||
    return [index for index, value in enumerate(prime) if index > 1 and value] |
69 changes: 69 additions & 0 deletions
exercises/practice/sieve/.approaches/set-operations/content.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
# Set Operations | ||
|
||
|
||
```python | ||
def primes(number): | ||
    not_prime = set() | ||
    primes = [] | ||
|
||
    for num in range(2, number+1): | ||
        if num not in not_prime: | ||
            primes.append(num) | ||
            not_prime.update(range(num*num, number+1, num)) | ||
|
||
    return primes | ||
``` | ||
|
||
|
||
This is the fastest method so far tested, at all input sizes. | ||
|
||
With only a single explicit loop (the inner work is delegated to `update()`), run time grows roughly linearly with `number`: close to `O(n)`. | ||
|
||
A key step is the set `update()`. | ||
|
||
Less commonly seen than `add()`, which takes a single element, `update()` takes any iterable of hashable values as its parameter and efficiently adds all the elements in a single operation. | ||
|
||
In this case, the iterable is a range covering all multiples of the prime we just found, from its square up to the limit. | ||
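
The `update()` step can be tried in isolation. The sketch below wraps it in a small function so the result is easy to inspect; `mark_multiples` is a hypothetical name, not part of the solution.

```python
def mark_multiples(not_prime, prime, limit):
    """Add the multiples of `prime`, from prime*prime up to `limit`, to the set.

    Hypothetical helper for illustration only.
    """
    # One call replaces a loop of individual add() calls.
    not_prime.update(range(prime * prime, limit + 1, prime))
    return not_prime
```

Calling `mark_multiples(set(), 3, 30)` yields the set containing 9, 12, 15, 18, 21, 24, 27 and 30; values already present in the set are silently ignored.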
|
||
Primes are collected in a list, in ascending order, so there is no need for a separate sort operation at the end. | ||
|
||
|
||
```python | ||
def primes(number): | ||
    numbers = set(item for item in range(2, number+1)) | ||
|
||
    not_prime = set(not_prime for item in range(2, number+1) | ||
                    for not_prime in range(item**2, number+1, item)) | ||
|
||
    return sorted(list((numbers - not_prime))) | ||
``` | ||
|
||
After building sets from generator expressions in place of an explicit loop, the second example uses set-subtraction as a key feature in the return statement. | ||
|
||
The resulting set needs to be converted to a list then sorted, which adds some overhead, [scaling as O(n *log* n)][sort-performance]. | ||
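
Note that `sorted()` accepts any iterable and always returns a new list, so the explicit `list()` call is optional. A minimal sketch of just the subtract-then-sort step, with `sorted_difference` as a hypothetical name:

```python
def sorted_difference(limit, composites):
    """Subtract a set of composites from 2..limit and return a sorted list.

    Hypothetical helper for illustration only.
    """
    candidates = set(range(2, limit + 1))
    # sorted() consumes the set directly and produces a list.
    return sorted(candidates - composites)
```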
|
||
In performance testing, this code is about 4x slower than the first example, but still scales as `O(n)`. | ||
|
||
|
||
```python | ||
def primes(number: int) -> list[int]: | ||
    start = set(range(2, number + 1)) | ||
    return sorted(start - {m for n in start for m in range(2 * n, number + 1, n)}) | ||
``` | ||
|
||
The third example is quite similar to the second, just moving the comprehension into the return statement. | ||
|
||
Performance is very similar between examples 2 and 3 at all input values. | ||
|
||
|
||
## Sets: strengths and weaknesses | ||
|
||
Sets offer two main benefits which can be useful in this exercise: | ||
- Entries are guaranteed to be unique. | ||
- Determining whether the set contains a given value is a fast, constant-time operation. | ||
|
||
Less positively: | ||
- The exercise specification requires a list to be returned, which may involve a conversion. | ||
- Sets have no guaranteed ordering, so two of the above examples incur the time penalty of sorting a list at the end. | ||
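
The uniqueness guarantee matters here because the same composite is generated more than once (24, for instance, is a multiple of 2, 3 and 4). The sketch below counts how many values are generated versus how many distinct ones a set retains; `composite_counts` is a hypothetical name used only for this illustration.

```python
def composite_counts(limit):
    """Return (values generated, distinct values kept by a set).

    Hypothetical helper for illustration only.
    """
    generated = [multiple
                 for base in range(2, limit + 1)
                 for multiple in range(base * base, limit + 1, base)]
    return len(generated), len(set(generated))
```

For a limit of 30 this gives 28 generated values but only 19 distinct composites, so a list would carry the duplicates while a set discards them automatically.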
|
||
[sort-performance]: https://en.wikipedia.org/wiki/Timsort |
8 changes: 8 additions & 0 deletions
exercises/practice/sieve/.approaches/set-operations/snippet.txt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
def primes(number): | ||
    not_prime = set() | ||
    primes = [] | ||
    for num in range(2, number+1): | ||
        if num not in not_prime: | ||
            primes.append(num) | ||
            not_prime.update(range(num*num, number+1, num)) | ||
    return primes |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
{ | ||
"articles": [ | ||
{ | ||
"slug": "performance", | ||
"uuid": "fdbee56a-b4db-4776-8aab-3f7788c612aa", | ||
"title": "Performance deep dive", | ||
"authors": [ | ||
"BethanyG", | ||
"colinleach" | ||
], | ||
"blurb": "Results and analysis of timing tests for the various approaches." | ||
} | ||
] | ||
} |