Open
Description
Currently, memory usage of the prime sieve increases linearly with the size of the interval [lo, hi]
, while the number of primes only grows asymptotically as (hi) / log (hi)
. Furthermore, L1 memory caching is not exploited.
By implementing a segmented sieve both problems can be addressed at once: lowering memory consumption and increasing speed (in C++ I noticed up to 10x 5x better performance).
The current algorithm is basically:
- Discover prime numbers
p
less than√hi
- Cross off multiples of
p
starting atp * p
. - Collect all prime numbers in
[√hi, hi]
The suggested procedure is:
- Discover all prime numbers
p
less than√hi
- Divide
[√hi, hi]
into fixed size segments / smaller intervals (fitting L1 cache) - Outer loop: iterate over each segment
- Inner loop: cross off all multiples of all primes in the current segment
- Collect all primes in the current segment
You only need to store one segment (constant memory allocation) in step 2/3 and if the segment fits in L1 cache it is extremely fast, since you're iterating over it multiple times when sieving.
For reference see http://primesieve.org/
Metadata
Metadata
Assignees
Labels
No labels