exercism · BethanyG · Feb 12, 2024 · Feb 7, 2024 · Feb 7, 2024 · Feb 7, 2024
diff --git a/exercises/practice/sieve/.approaches/comprehensions/content.md b/exercises/practice/sieve/.approaches/comprehensions/content.md
@@ -0,0 +1,35 @@
+# Comprehensions
+
+```python
+def primes(number):
+    prime = (item for item in range(2, number+1) 
+              if item not in (not_prime for item in range(2, number+1) 
+              for not_prime in range(item*item, number+1, item)))
+    return list(prime)
+```
+
+Many of the solutions to Sieve use `comprehensions` or `generator-expressions` at some point, but this page is about examples that put almost *everything* into a single, elaborate `generator-expression` or `comprehension`.
+
+The above example uses a `generator-expression` to do all the calculation.
+
+There are at least two problems with this:
+- Readability is poor.
+- Performance is exceptionally bad, making this the slowest solution tested, for all input sizes.
+
+Notice the many `for` clauses in the generator.
+
+This makes the code similar to [nested loops][nested-loops], and run time scales quadratically with the size of `number`.
+In fact, when this code is compiled, it _compiles to nested loops_ that have the additional overhead of generator setup and tracking.
+
+```python
+def primes(limit):
+    return [number for number in range(2, limit + 1)
+            if all(number % divisor != 0 for divisor in range(2, number))]
+```
+
+This second example using a `list-comprehension` with `all()` is certainly concise and _relatively_ readable, but the performance is again quite poor.
+
+This is not quite a fully nested loop (_there is a short-circuit when `all()` evaluates to `False`_), but it is by no means "performant".
+In this case, scaling with input size is intermediate between linear and quadratic, so not quite as bad as the first example.
+
+
+[nested-loops]: https://exercism.org/tracks/python/exercises/sieve/approaches/nested-loops
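The short-circuit described for the second example in `comprehensions/content.md` can be made visible by counting the divisor tests. This is only an illustrative sketch: `count_divisor_checks` and `is_not_divisible` are hypothetical helper names, not part of the exercise, and the limit of 20 is arbitrary.

```python
def count_divisor_checks(limit):
    """Return the primes up to `limit` and the number of modulo tests performed."""
    checks = 0

    def is_not_divisible(number, divisor):
        # Hypothetical helper: same test as the original, plus a counter.
        nonlocal checks
        checks += 1
        return number % divisor != 0

    primes = [number for number in range(2, limit + 1)
              if all(is_not_divisible(number, divisor)
                     for divisor in range(2, number))]
    return primes, checks


primes_found, checks_done = count_divisor_checks(20)

# Tests a fully nested loop would perform without the all() short-circuit.
full_nested_checks = sum(number - 2 for number in range(2, 21))

print(primes_found)
print(checks_done, "tests with short-circuit,", full_nested_checks, "without")
```

For a limit of 20 this counts 74 tests, against 171 without the short-circuit: a real saving, but every prime is still tested against all smaller candidates.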
diff --git a/exercises/practice/sieve/.approaches/comprehensions/snippet.txt b/exercises/practice/sieve/.approaches/comprehensions/snippet.txt
@@ -0,0 +1,3 @@
+def primes(limit):
+    return [number for number in range(2, limit + 1) if 
+                 all(number % divisor != 0 for divisor in range(2, number))]
diff --git a/exercises/practice/sieve/.approaches/config.json b/exercises/practice/sieve/.approaches/config.json
@@ -0,0 +1,40 @@
+{
+  "introduction": {
+    "authors": [
+      "colinleach",
+      "BethanyG"
+    ]
+  },
+  "approaches": [
+    {
+      "uuid": "85752386-a3e0-4ba5-aca7-22f5909c8cb1",
+      "slug": "nested-loops",
+      "title": "Nested Loops",
+      "blurb": "Relatively clear solutions with explicit loops.",
+      "authors": [
+        "colinleach",
+        "BethanyG"
+      ]
+    },
+    {
+      "uuid": "04701848-31bf-4799-8093-5d3542372a2d",
+      "slug": "set-operations",
+      "title": "Set Operations",
+      "blurb": "Performance enhancements with Python sets.",
+      "authors": [
+        "colinleach",
+        "BethanyG"
+      ]
+    },
+    {
+      "uuid": "183c47e3-79b4-4afb-8dc4-0deaf094ce5b",
+      "slug": "comprehensions",
+      "title": "Comprehensions",
+      "blurb": "Ultra-concise code and its downsides.",
+      "authors": [
+        "colinleach",
+        "BethanyG"
+      ]
+    }
+  ]
+}
diff --git a/exercises/practice/sieve/.approaches/introduction.md b/exercises/practice/sieve/.approaches/introduction.md
@@ -0,0 +1,80 @@
+# Introduction
+
+The key to this exercise is to keep track of:
+- A list of numbers.
+- The status of each number: whether it may still be prime.
+
+## General Guidance
+
+To solve this exercise, it is necessary to choose one or more appropriate data structures to store numbers and status, then decide the best way to scan through them.
+
+There are many ways to implement the code, and the three broad approaches listed below are not sharply separated.
+
+## Approach: Using nested loops
+
+```python
+def primes(number):
+    not_prime = []
+    prime = []
+
+    for item in range(2, number+1):
+        if item not in not_prime:
+            prime.append(item) 
+            for element in range(item*item, number+1, item):
+                not_prime.append(element)
+
+    return prime
+```
+
+The theme here is nested, explicit `for` loops to move through ranges, testing validity as we go.
+
+For details and another example see [`nested-loops`][approaches-nested].
+
+## Approach: Using set operations
+
+```python
+def primes(number):
+    not_prime = set()
+    primes = []
+
+    for num in range(2, number+1):
+        if num not in not_prime:
+            primes.append(num)
+            not_prime.update(range(num*num, number+1, num))
+
+    return primes
+```
+
+In this group, the code uses the special features of the Python [`set`][sets] to improve efficiency.
+
+For details and other examples see [`set-operations`][approaches-sets].
+
+## Approach: Using complex or nested comprehensions
+
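+```python
+def primes(limit):
+    return [number for number in range(2, limit + 1) if
+                 all(number % divisor != 0 for divisor in range(2, number))]
+```
+
+For details and another example see [`comprehensions`][approaches-comps].
+
+Other data structures are available from third-party packages:
+
+- For bit arrays, there is the [`bitarray`][bitarray] package and [`bitstring.BitArray()`][bitstring].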
+- For arrays of booleans, we could use the NumPy package: `np.ones((number,), dtype=np.bool_)` will create a pre-dimensioned array of `True`.
+
+It should be stressed that these will not work in the Exercism test runner, and are mentioned here only for completeness.
+
+
+## Which Approach to Use?
+
+This exercise is for learning, and is not directly relevant to production code.
+
+The point is to find a solution which is correct, readable, and remains reasonably fast for larger input values.
+
+The "set operations" example above is clean, readable, and in benchmarking was the fastest code tested.
+
+Further details of performance testing are given in the [Performance article][article-performance].
+
+[approaches-nested]: https://exercism.org/tracks/python/exercises/sieve/approaches/nested-loops
+[approaches-sets]: https://exercism.org/tracks/python/exercises/sieve/approaches/set-operations
+[approaches-comps]: https://exercism.org/tracks/python/exercises/sieve/approaches/comprehensions
+[article-performance]: https://exercism.org/tracks/python/exercises/sieve/articles/performance
+[sets]: https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset
+[bitarray]: https://pypi.org/project/bitarray/
+[bitstring]: https://bitstring.readthedocs.io/en/latest/
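Since the three approaches in `introduction.md` are described as not sharply separated, a quick cross-check that they agree can be useful. The sketch below copies the three examples under hypothetical names (`primes_nested`, `primes_sets`, `primes_comprehension`); the names and the limit of 50 are assumptions made for illustration.

```python
def primes_nested(number):
    """Nested-loops example: composites are collected in a list."""
    not_prime = []
    prime = []
    for item in range(2, number + 1):
        if item not in not_prime:
            prime.append(item)
            for element in range(item * item, number + 1, item):
                not_prime.append(element)
    return prime


def primes_sets(number):
    """Set-operations example: composites are collected in a set."""
    not_prime = set()
    primes = []
    for num in range(2, number + 1):
        if num not in not_prime:
            primes.append(num)
            not_prime.update(range(num * num, number + 1, num))
    return primes


def primes_comprehension(limit):
    """Comprehension example: trial division with all()."""
    return [number for number in range(2, limit + 1)
            if all(number % divisor != 0 for divisor in range(2, number))]


LIMIT = 50  # arbitrary small limit for the comparison
results = [func(LIMIT)
           for func in (primes_nested, primes_sets, primes_comprehension)]
all_agree = all(result == results[0] for result in results)

print("All approaches agree:", all_agree)
print(results[0])
```

Agreement on one small input is not a proof of correctness, but it is a cheap sanity check before comparing the approaches on speed.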
diff --git a/exercises/practice/sieve/.approaches/nested-loops/content.md b/exercises/practice/sieve/.approaches/nested-loops/content.md
@@ -0,0 +1,37 @@
+# Nested Loops
+
+```python
+def primes(number):
+    not_prime = []
+    prime = []
+
+    for item in range(2, number+1):
+        if item not in not_prime:
+            prime.append(item)
+            for element in range(item*item, number+1, item):
+                not_prime.append(element)
+
+    return prime
+```
+
+This is the type of code that many people might write as a first attempt.
+
+It is very readable and passes the tests.
+
+The clear disadvantage is that run time is quadratic in the input size: `O(n**2)`, so this approach scales poorly to large input values.
+
+Part of the problem is the line `if item not in not_prime`, where `not_prime` is a list that may be long and unsorted.
+
+This operation requires searching the entire list, so run time is linear in list length: not ideal within a loop repeated many times.
+
+```python
+def primes(number):
+    number += 1
+    prime = [True for item in range(number)]
+    for index in range(2, number):
+        if not prime[index]:
+            continue
+        for candidate in range(2 * index, number, index):
+            prime[candidate] = False
+    return [index for index, value in enumerate(prime) if index > 1 and value]
+```
+
+Relatively few programmers would have predicted such a major difference just by looking at the code, so if performance matters we should always test, not guess.
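The advice to test rather than guess can be followed with the standard-library `timeit` module. A minimal sketch, assuming the two examples from `nested-loops/content.md` are renamed `primes_list_lookup` and `primes_flag_list` (hypothetical names); the limit and repeat count are arbitrary, and the timings will vary by machine.

```python
import timeit


def primes_list_lookup(number):
    """First example: composites are kept in a list and searched with `in`."""
    not_prime = []
    prime = []
    for item in range(2, number + 1):
        if item not in not_prime:
            prime.append(item)
            for element in range(item * item, number + 1, item):
                not_prime.append(element)
    return prime


def primes_flag_list(number):
    """Second example: one pre-allocated boolean flag per candidate."""
    number += 1
    prime = [True for item in range(number)]
    for index in range(2, number):
        if not prime[index]:
            continue
        for candidate in range(2 * index, number, index):
            prime[candidate] = False
    return [index for index, value in enumerate(prime) if index > 1 and value]


LIMIT = 1_000  # arbitrary input size for the comparison
RUNS = 3       # arbitrary repeat count

timings = {}
for func in (primes_list_lookup, primes_flag_list):
    # timeit.timeit accepts a callable and returns total seconds for RUNS calls.
    timings[func.__name__] = timeit.timeit(lambda: func(LIMIT), number=RUNS)

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for {RUNS} runs")
```

Repeating the measurement at several values of `LIMIT` shows how each version scales, which a single timing cannot.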
diff --git a/exercises/practice/sieve/.approaches/nested-loops/snippet.txt b/exercises/practice/sieve/.approaches/nested-loops/snippet.txt
@@ -0,0 +1,8 @@
+def primes(number):
+    number += 1
+    prime = [True for item in range(number)]
+    for index in range(2, number):
+        if not prime[index]: continue
+        for candidate in range(2 * index, number, index):
+            prime[candidate] = False
+    return [index for index, value in enumerate(prime) if index > 1 and value]
diff --git a/exercises/practice/sieve/.approaches/set-operations/content.md b/exercises/practice/sieve/.approaches/set-operations/content.md
@@ -0,0 +1,65 @@
+# Set Operations
+
+```python
+def primes(number):
+    not_prime = set()
+    primes = []
+
+    for num in range(2, number+1):
+        if num not in not_prime:
+            primes.append(num)
+            not_prime.update(range(num*num, number+1, num))
+
+    return primes
+```
+
+This is the fastest method so far tested, at all input sizes.
+
+With only a single explicit loop (the marking of multiples is delegated to `update()`), performance scales almost linearly: close to O(n).
+
+A key step is the set `update()`.
+
+Less commonly seen than `add()`, which takes a single element, `update()` takes any iterable of hashable values as its parameter and efficiently adds all the elements in a single operation.
+
+In this case, the iterable is a range resolving to all multiples, up to the limit, of the prime we just found.
+
+Primes are collected in a list, in ascending order, so there is no need for a separate sort operation at the end.
+
+
+```python
+def primes(number):
+    numbers = set(item for item in range(2, number+1))
+
+    not_prime = set(not_prime for item in range(2, number+1)
+                    for not_prime in range(item**2, number+1, item))
+
+    return sorted(list(numbers - not_prime))
+```
+
+The second example replaces the explicit loop with generator expressions passed to `set()`, then uses set subtraction as a key feature in the return statement.
+
+The resulting set needs to be converted to a list then sorted, which adds some overhead, [scaling as O(n *log* n)][sort-performance].
+
+In performance testing, this code is about 4x slower than the first example, but run time still scales approximately as O(n).
+
+```python
+def primes(number: int) -> list[int]:
+    start = set(range(2, number + 1))
+    return sorted(start - {m for n in start for m in range(2 * n, number + 1, n)})
+```
+
+The third example is quite similar to the second, just moving the comprehension into the return statement.
+
+Performance is very similar between examples 2 and 3 at all input values.
+
+## Sets: strengths and weaknesses
+
+Sets offer two main benefits which can be useful in this exercise:
+- Entries are guaranteed to be unique.
+- Determining whether the set contains a given value is a fast, constant-time operation.
+
+Less positively:
+- The exercise specification requires a list to be returned, which may involve a conversion.
+- Sets have no guaranteed ordering, so two of the above examples incur the time penalty of sorting a list at the end.
+
+[sort-performance]: https://en.wikipedia.org/wiki/Timsort
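The set behaviour described in `set-operations/content.md` can be tried on a small range. This sketch uses a hard-coded limit of 20 and marks only the multiples of 2 and 3, an assumption that happens to be enough for that limit; it is not a general sieve.

```python
not_prime = set()

not_prime.add(4)                   # add() takes one hashable element
not_prime.update(range(6, 21, 2))  # update() takes any iterable of hashable values
not_prime.update(range(9, 21, 3))  # 12 and 18 are already present, so stored only once

candidates = set(range(2, 21))

# Set subtraction keeps the candidates that were never marked as composite.
remaining = candidates - not_prime

# Sets have no guaranteed ordering, so sort before returning; sorted() yields a list.
ordered = sorted(remaining)

print(len(not_prime))    # number of distinct composites marked
print(18 in not_prime)   # membership test, constant time on average
print(ordered)
```

Because `sorted()` already returns a list, the separate `list()` conversion in the second example is not strictly required.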
diff --git a/exercises/practice/sieve/.approaches/set-operations/snippet.txt b/exercises/practice/sieve/.approaches/set-operations/snippet.txt
@@ -0,0 +1,8 @@
+def primes(number):
+    not_prime = set()
+    primes = []
+    for num in range(2, number+1):
+        if num not in not_prime:
+            primes.append(num)
+            not_prime.update(range(num*num, number+1, num))
+    return primes
diff --git a/exercises/practice/sieve/.articles/config.json b/exercises/practice/sieve/.articles/config.json
@@ -0,0 +1,14 @@
+{
+  "articles": [
+    {
+      "slug": "performance",
+      "uuid": "fdbee56a-b4db-4776-8aab-3f7788c612aa",
+      "title": "Performance deep dive",
+      "authors": [
+        "BethanyG",
+        "colinleach"
+      ],
+      "blurb": "Results and analysis of timing tests for the various approaches."
+    }
+  ]
+}