- Related functions and programs:
- NBR_UTIL -
prime number and integer square root functions; e.g.,
`nbrIsPrime()`

and`nbrNextPrime()`

. **primal**- prime number algorithms test program.

"We don't do this song very much, but I feel like singing so, I guess, that's what we're gonna do."

- Duane Allman, "Dimples", Ludlow Garage, 1970 (Wikipedia)

I feel like writing so, I guess, that's what I'm going to do!

**WARNING:** All over the internet, you'll find example code
that determines if a number is prime. Nearly all the examples I've seen
loop through divisors while "*i*×*i* ≤ *n*"
(i.e., while the square of the next divisor to be tested is less than
or equal to the number). Squaring the divisor is fine if you're using
arbitrary-precision arithmetic, but it can fail at the high end if you're
using fixed-size integers. Take unsigned 32-bit numbers for example:

~65535.9^2 = 4,294,967,295 (maximum unsigned 32-bit number) 65535.0^2 = 4,294,836,225

There are 5,853 prime numbers between 65535.0^{2} and
~65535.9^{2}. When checking the next divisor after 65,535
for any of these primes, the *i*×*i* computation will
overflow, leading to meaningless, possibly infinite loops. This is
discussed in more detail in the "Limit
Checking" section below.

**NOTE:** This prime number package is *GROSSLY
OVERENGINEERED*. I began with an optimization that skipped
testing potential divisors divisible by 2 or by 3 in a small, tight
loop. I then got to wondering about skipping potential divisors
divisible by an even wider range of prime divisors; e.g., skip
divisors divisible by 2, 3, 5, 7, or 11. I generalized the code
to handle an arbitrary range, [2..

**NOTE:** I was *REINVENTING THE WHEEL*.
I basically had and have a grade-school knowledge of prime numbers.
After writing the code, I looked up on the internet fast ways of
determining primes and I learned that "my" optimization was widely
known (see the mathematics section below). So perhaps this was all
a waste of time, but the open-ended exploration of the subject on
my own was very intellectually satisfying for me.

**NOTE:** If a *SMALL MEMORY FOOTPRINT*
is important, this package can be compiled to produce compact
object code that uses the original hard-coded loop and omits
the sequence-of-differences functions and data related to the
generalized version of the code. To accomplish this, add

-DNBR_SKIP_2_3_HARDCODED=1

to the compilation command for "nbr_util.c". If an application only
calls `nbrIsPrime()`

and/or `nbrNextPrime()`

,
its code does *not* need to be compiled with the CPP definition.
If, however, the application calls any of the sequence-of-differences
functions, the application should be compiled with the CPP definition.
Macros in "nbr_util.h" substitute meaningful return values for the
omitted functions.

**NOTE:** To jog your memory, if necessary,
a *NUMBER IS PRIME* if its only divisors are 1 and itself;
zero and one by definition are not prime. To determine if a number
*n* is prime, you basically test if the number is evenly divisible
(remainder 0) by any divisor in the range
2..√*n*.
If the number is evenly divisible by some divisor, then the number
is not prime. If none of the divisors evenly divide the number,
then the number is prime.

I wrote my HASH_UTIL
hash table package in the late 1980s, as I recall.
It has long been a nuisance that a program using the package has had
to be explicitly linked to the C math library ("-lm"). So, about 30
years later, I decided to fix that! Thinking that the math library
was required because of the use of `sqrt(3)`

in *(i)*
determining the next prime number larger than the expected table size and
*(ii)* computing the standard deviation of the bucket lengths,
I dove into the code. My memory hadn't served me well, because I
discovered that the prime number function was not using `sqrt()`

,
but was instead iterating until:

divisor > (number / divisor)

I don't know if I was considering the possibility of overflow at the time I wrote the hash table package, but it was a fortuitous choice for the loop test.

So far, so good, but I was red-faced to see that for 3 decades my code
had been checking each and every possible divisor from 2 on up: 2, 3,
4, 5, 6, 7, ... Skipping all the even numbers after 2 was a painfully
obvious optimization. And then I began thinking about also skipping
odd numbers divisible by 3: check divisors 5, 7, *... skip 9 ...*, 11,
13, *... skip 15 ...*, 17, 19, etc. I noticed the differences between
divisors to be tested was 2, 4, 2, 4, etc. In other words, a repeating
cycle of two divisors. I implemented a hard-coded loop ("divisible"
means evenly divisible with a remainder of 0):

if (number equals 0 or 1) return (not-prime) ; if (number equals 2 or 3) return (is-prime) ; if (number divisible by 2 or by 3) return (not-prime) ; for (divisor = 5 to "square root" of number) { if (number divisible by divisor) return (not-prime) ; divisor += 2 ; if (number divisible by divisor) return (not-prime) ; divisor += 4 ; } return (is-prime) ;

which checks the following divisors: [2, 3,] 5, 7, 11, 13, 17, 19, 23, 25, 29, 31, 35, 37, ... Skipping odd divisors divisible by 3 is almost twice as fast as checking all of the odd numbers.

I was pleased by my independent "discovery" of this heretofore unthought-of optimization (/snark) and my thoughts naturally turned to skipping even more divisors.

First, however, some terminology I use in this package:

*Skip-Through Prime (STP)* - is the maximum prime number for
which divisors divisible by one or more of primes 2 through STP
are skipped. In the pseudocode above, the skip-through prime
is 3. As another example, if the skip-through prime is 11,
`nbrIsPrime()`

will skip divisors divisible by any
of the primes 2 through 11: 2, 3, 5, 7, 11.

*Start Divisor* - is the divisor at which to begin the first
cycle of testing. This is always the next prime after the
skip-through prime. Using the examples above, STP 3's start
divisor is 5 and STP 11's start divisor is 13.

*Sequence of Differences* - is an array of the differences
between successive divisors to be tested. The differences skip
over divisors that do not need to be tested; i.e., divisors evenly
divisible by primes 2 through the skip-through prime.

Beginning with the start divisor, `nbrIsPrime()`

tests if
the target number is evenly divisible by the start divisor. If so,
the number is not prime and `nbrIsPrime()`

returns. If
not, the first difference in the sequence is added to the start
divisor to get the next divisor to test. The function continues
testing successive divisors; when the end of the sequence is reached,
`nbrIsPrime()`

jumps back to the beginning of the sequence
to continue with the next pass through the sequence.

For example, in the pseudocode above, the sequence of differences for STP 3 is { 2, 4 }. Beginning with start divisor 5, the loop iterations unfold as follows:

(1) Test 5, Add 2, Test 7, Add 4 (2) Test 11, Add 2, Test 13, Add 4 (3) Test 17, Add 2, Test 19, Add 4 (4) Test 23, Add 2, Test 25, Add 4 (5) Test 29, ...

*Limit Check* - refers to checking if a divisor about to be tested
is less than or equal to the square root of the target number.
When testing all odd divisors (skip-through prime 2), there is
only one test in the loop and the limit check is performed as
part of the loop control. For skip-through primes 3 and above,
there are multiple tests per loop iteration, corresponding to
each difference in the prime's sequence. STP 5 has a sequence
of 8 differences, for example, but adding 7 limit checks inside
the loop incurs a performance hit. (The pseudocode above doesn't
limit check the second test's divisor in the loop.) I discuss
this issue in more detail below, "Limit Checking".

The pseudocode for the generalized implementation inserts an inner loop that steps through the sequence of differences (again, "divisible" means evenly divisible with a remainder of 0):

if (number equals 0 or 1) return (not-prime) ; if (number equals any prime in 2..STP) return (is-prime) ; if (number divisible by any prime in 2..STP) return (not-prime) ; divisor = start divisor ; while (divisor ≤ "square root" of number) { for (each difference in sequence) { if (number divisible by divisor) return (not-prime) ; divisor += difference ; } } return (is-prime) ;

Figuring out the sequence of differences for skip-through prime 3 was easy enough to do in my head. I wrote a throw-away program to generate differences for skip-through primes 5, 7, and 11; examined the listings to determine when the sequences began repeating (8, 48, and 480, respectively); and noticed the first of a few relations I would find:

seq-length (next-prime) = seq-length (current-prime) × (next-prime - 1)

For example:

Skip-through prime 2's sequence of differences, { 2 }, is of length 1. STP 3 has a sequence of length 2 = 1 × (3 - 1). STP 5 has a sequence of length 8 = 2 × (5 - 1). STP 7 has a sequence of length 48 = 8 × (7 - 1). STP 11 has a sequence of length 480 = 48 × (11 - 1). And so on.

Sequence length, for the skip-through primes under consideration (2 through 53 for 64-bit unsigned long integers), increases by one or two orders of magnitude for each new prime. STP 59's sequence length exceeds the maximum representable, unsigned 64-bit value.

range (next-prime) = range (current-prime) × next-prime

The range of a sequence of differences is the range of divisors covered by each pass over the sequence, found by summing all the differences in the sequence. STP 3's sequence is { 2, 4 }, whose range is 2 + 4 = 6; as an example above shows, the first divisor tested in each loop iteration is 5, 11, 17, 23, 29, ... STP 5's 8-element sequence has a range of 30, so the divisors at the beginning of each loop iteration are 7, 37, 67, 97, ... I wasn't clever enough to realize that the formula above is more simply expressed as the product of the preceding primes and the skip-through prime; e.g., the range of STP 7 is 2 × 3 × 5 × 7 = 210. Although STP 53's sequence length fits in an unsigned 64-bit integer, its range doesn't.

density (prime) = seq-length (prime) / range (prime)

Density, in this case, is a measure of how many numbers in a range are actually tested as divisors. The lower the density the better. If even numbers are skipped and odd numbers are tested (STP 2), the density is 50%. Higher skip-through primes lower the density even further at the expense of increasingly lengthy sequences of differences.

max-difference (next-prime) = current-prime × 2

Sometimes! The maximum differences in the sequences for skip-through primes
[2,] 3, 5, 7, 11, 13, 17, 19, and 29, 31 obey this relation. For example,
the maximum difference in STP 31's sequence is 29 × 2 = 58. However,
skip-through prime 23's maximum difference is 40 (not the expected
19×2 = 38) and STP 37's is 66 (not the expected 31×2 = 62).
Using `nbrPrimeSeqDiffsMax()`

to generate the one trillion
(10^{12}) differences in STP 37's sequence and determine the
maximum difference originally consumed 99%-100% of the CPU for 8.5 days.
I subsequently optimized the function by skipping both even divisors and
odd divisors divisible by 3, reducing the time to 6.4 days. I'm not
a glutton for punishment, so I did not give serious consideration to
optimizing `nbrPrimeSeqDiffs()`

and
`nbrPrimeSeqDiffsMax()`

further by using higher skip-through
primes than 3. In any case, finding out STP 41's maximum among its 44
trillion differences would take three-quarters of a year or more. Maybe
there's a mathematical relation that gives the correct maximum difference
for all skip-through primes, but nothing jumped out at me.

Knowing a sequence's maximum difference--and using differences in the first place--can significantly reduce memory usage. For unsigned 64-bit integers, the largest skip-through prime, 47, has an estimated maximum difference of 43 × 2 = 86, which can be represented in an unsigned 8-bit integer. STP 47's sequence of differences requires only one-eighth of the memory that the sequence of offsets in the math section below requires. (The sequence range for STP 53 overflows 64 bits.) Looking at unsigned 128-bit integers, the largest skip-through prime is 101 with an expected maximum difference of 202, which, if correct, can still be represented in an unsigned 8-bit integer. (This is all theoretical for the most part; the largest skip-through prime I've actually tested was 29 with its one-billion-element sequence.)

Time for some actual numbers to get a sense of what we are dealing with. The following numbers are for unsigned 64-bit integers (L/R is the length divided by the range; i.e., density):

Skip- Estimated Sequence Range Through Start Maximum (Sum of Prime Divisor Sequence Length Difference Differences) L/R ------- ------- --------------- ---------- -------------- --- 2 3 1 2 2 50% 3 5 2 4 6 33% 5 7 8 6 30 26% 7 11 48 10 210 22% 11 13 480 14 2310 20% 13 17 5760 22 30030 19% 17 19 92160 26 510510 18% 19 23 1658880 34 9699690 17% 23 29 36495360 38 223092870 16% 29 31 1021870080 46 6469693230 15% 31 37 30656102400 58 200560490130 15% 37 41 1103619686400 62 7420738134810 14% 41 43 44144787456000 74 304250263527210 14% 43 47 1854081073152000 82 13082761331670030 14% 47 53 85287729364992000 86 614889782588491410 13% 53 59 4434961926979584000 94 |

(The **primal**
program computes the range as both an integer and a floating-point number.
If the integer overflows, the floating-point number is substituted in the
density calculation. So the density for skip-through prime 53 is correctly
listed as 13%.)

For unsigned 128-bit integers, the following figures pick up with skip-through prime 53 (whose range overflowed 64 bits) and run through skip-through prime 101. The density at this point has only dropped down to 11.9%. (The sequence length and range both overflow for STP 103.)

53 (Skip-Through Prime) 59 (Start Divisor) 94 (Est. Max. Diff.) Length: 4434961926979584000 (4.43496e+18) Range: 32589158477190044730 (3.25892e+19) ... through ... 101 (Skip-Through Prime) 103 (Start Divisor) 202 (Est. Max. Diff.) Length: 27739969042773783995307880611840000000 (2.774e+37) Range: 232862364358497360900063316880507363070 (2.32862e+38) SIZE_MAX: 340282366920938463463374607431768211455 (3.40282e+38) 103 (Skip-Through Prime) 107 (Start Divisor) Length: |

Knowing the exploding lengths and ranges for the higher skip-through
prime, whether for 64-bit or 128-bit numbers, was of intellectual
interest to me, but these higher skip-through primes are obviously
not of any practical value.
The Prime Page's Glossary entry for
wheel
factorization says, in different terminology than mine, that
skip-through prime 251 reduces the density to 10% and 75,037 reduces
it to 5%. Skip-through prime 134,253,593, with a sequence length of
approximately 6.4×10^{99}, drives the density down to 3%!

"Limit checking" refers to not testing divisors greater than the square root of the number when determining if the number is prime.

First of all, heed my warning at the very beginning about the danger
of overflow using the widely seen *i*×*i* ≤ *n* check with fixed-precision
integers. The following explanation uses 64-bit unsigned longs, but
the warning applies to any bit width.

The integer square root of the maximum representable, 64-bit value,
ULONG_MAX, is 0x0FFFFFFFF — all of the lower 32 bits set. (The real
square root of ULONG_MAX is 0x0FFFFFFFF plus about 0.9 decimal.)
If you square the integer square root, the result is a number that
is 8.6 billion less than ULONG_MAX. In other words, the 8.6 billion
numbers in the range [*root*^{2}..ULONG_MAX] all have the same integer
square root: 0x0FFFFFFFF. There are 193 million primes in that range.
Consequently, for each of the 193 million primes tested with divisor *i*:

*i*will reach 0x0FFFFFFFF,*i*×*i*with that value will always be less than*n*(never equal because if*n*==*root*^{2},*n*is not prime),*i*will be incremented past 0x0FFFFFFFF, and*i*×*i*will always overflow.

So 193 million primes are excluded from being tested because of the *i*×*i*
multiplication. The steps above will occur when testing every possible
divisor (*i*++) or only odd divisors (*i* += 2).

Additional optimizations in the algorithm make the situation even worse.
If the algorithm is extended to also skip divisors divisible by 3 (my first
optimization), the divisor is incremented alternately by 2 and by 4. Since
0x0FFFFFFFF is divisible by 3 and will be skipped, *i* will only reach
the preceding odd number, 0x0FFFFFFFD, and then be incremented by 4 to
0x100000001, causing *i*×*i* to overflow. The square of
the preceding odd number is 25.8 billion less than ULONG_MAX. Testing the
primality of the 579 million primes in that range — those whose
integer square root is 0x0FFFFFFFD, 0x0FFFFFFFE, or 0x0FFFFFFFF —
is impossible.

To avoid overflow with fixed-precision integers, divide both sides of
the relational operator by *i*:

if (i×i≤n) ... continue checking divisors ... (INCORRECT) if (i≤n/i) ... continue checking divisors ... (CORRECT)

Now that there is no danger of overflow, there are some other aspects of limit checking in relation to my skip-through prime optimization that need to be examined. Remember that the algorithm is basically two nested loops, where the inner loop runs through the elements in the sequence of differences and the outer loop repeats the inner loop as often as necessary to reach and pass the square root of the number (if the number is prime):

div = STP's start divisor while (div ≤ n/div) { foreach difference in sequence { ... test if n is evenly divisible by div ... div = div + difference } }

In the Wikipedia pseudocode for the related *c*#*k*+*i*
algorithm (see the mathematics section below), the "inner loop"
corresponding to my inner loop tests the same number of divisors as mine
does, but without limit checking the divisors. (I put "inner loop" in
quotes because the pseudocode is hard-coded for the STP 3 optimization
and the "loop" is a single IF-statement that checks both the 6(*k*-1)+5
and "6*k*+1" divisors; again, the *c*#*k*+*i* version
of my STP 3 optimization is explained in the mathematics section.) It is
a good idea, however, to limit check each divisor inside the *inner*
loop as well as the outer loop:

div = STP's start divisor while (div ≤ n/div) { foreach difference in sequence { ... test if n is evenly divisible by div ... div = div + difference if (div > n/div) break } }

(Whether the check is at the beginning or end of the inner loop, there will always be one check that duplicates the check in the outer loop on each pass through the sequence.)

Because the length of the sequence and range increase rapidly with higher skip-through primes, there are two cases that are of concern when there is no limit check in the inner loop:

The primality test must run through the entire sequence of differences. If the divisor exceeds the square root early in a pass through the sequence, doing all the remaining tests in the sequence is a waste of time. Assume the function uses STP 11 with its 480-difference sequence. If on some cycle the divisor surpasses the square root at the 5th difference, the inner loop will uselessly continue checking the other 475 divisors, all of which are greater than the square root.

If the range of a sequence is greater than a prime number, continued testing of divisors past the square root might reach a point where the prime number itself is tested as a divisor. "n%n == 0", so the number will be rejected as composite even though it's really prime!

The first case above is a matter of efficiency; the second case requires
flagging a number as composite if "n%div == 0" *and* if "n != div".

Adding the limit check, "div > n/div", to the inner loop caused a big performance hit. My first attempt at ameliorating the slow-down was to add a global boolean variable that the calling program could set to enable or disable limit checking in the inner loop:

if (gCheckLimits && (div > n/div)) ...

Testing a boolean flag takes no time at all and, when limit checking is disabled, the algorithm is as fast as the original no-limit-checking algorithm. However, I was not satisfied with burdening the calling program with the responsibility of deciding whether to turn limit checking on or off.

My second attempt took advantage of a sequence's range to put off limit checking until the tested divisor got "close" to the square root of the input number. First, I computed a rough estimate of the square root by repeatedly dividing the number by 2 until "root ≤ n/root". Then, in the outer loop, before beginning the inner loop each time, figure out if the rough root estimate falls within the range of numbers covered by the coming pass through the sequence of differences. If not, clear a local boolean variable to disable limit checking in the inner loop. If the estimated root does fall in the range, set the boolean variable to enable checking.

... compute rough estimate of square root ... checkLimits = false div = STP's start divisor while (div ≤ n/div) { if (div ≤ root ≤ (div+range-1)) checkLimits = true foreach difference in sequence { ... test if n is evenly divisible by div ... div = div + difference if (checkLimits && (div > n/div)) break } }

A big improvement without intervention by the calling program, but ...

The estimate of the square root could be off by as much as half of the actual square root, so limit checking might begin multiple range blocks before the range containing the actual square root. Obviously, it would be faster to only check limits in the last block.

The large range of a higher skip-through prime's sequence may be greater than the estimated or actual square root and limit checking would then begin with the very first divisor of the very first pass through the sequence; i.e., the skip-through prime's start divisor. Thus the algorithm reverts to the slow check-every-divisor mode.

A more accurate estimate of the square root was needed, so I searched for "fast integer square root" on the web and, not unsurprisingly, up popped numerous references to a technical bulletin, "Fast Integer Square Root", by Ross M. Fosler. (I added an "s" to the title for my own description of the very fast algorithm and how I implemented it, "Fast Integer Square Roots".) My is-prime function now knows exactly when the tested divisors pass the actual square root of the number and the function no longer needs to perform the time-consuming "div > n/div" check in the inner loop:

root = nbrUnsignedSQRT (n) div = STP's start divisor while (div ≤ root) { foreach difference in sequence { ... test if n is evenly divisible by div ... div = div + difference if (div > root) break } }

Performance tests were conducted using my
**primal**
test program.
The command lines are shown below. The "output" following
the command lines are not the output of the program, but me
tabulating the results of multiple runs. When **primal**
is in "-time" mode, most output is suppressed so that I/O is not
a significant factor in the times. Also, the generation of
sequences of differences is done before the timer starts counting.

My original prime number function was in my HASH_UTIL package and the largest hash table I ever needed was for about 80,000 telemetry points. (Most of those points were variable names tied to memory locations in the onboard computer and would only be downloaded, selectively, when trying to diagnose a problem.) So my first suite of tests listed all the prime numbers in the range 80,000-200,000. Since listing those numbers is very quick, I had each run repeat the test 1000 times to get a large enough timing figure for comparison's sake.

Determine the prime numbers in the range 80,000-200,000; repeat 1000 times:

% primal -time -repeat 1000 -list 80000-200000 -alternate odds STP 2* 209.9 seconds * hand-coded loop to check odd divisors % primal -time -repeat 1000 -list 80000-200000 -qprime

The first test ("-alternate odds") used a run-of-the-mill, hand-coded loop to check all odd divisors. The remaining tests ("-qprime*stp*STP 2 221.3 seconds STP 3 127.5 seconds STP 5 87.0 seconds STP 7 71.0 seconds STP 11 65.5 seconds STP 13 62.2 seconds STP 17 60.3 seconds STP 19 59.1 seconds STP 23 58.5 seconds STP 29 58.0 seconds*stp*") used my sequence of differences, hence the slightly higher time for STP 2. As expected, the times decrease as the skip-through prime increases.

The second suite of tests determines the 1,000 largest prime numbers
representable in unsigned, 64-bit integers. These numbers are verified
by matching them against a pre-computed list of the primes generated by
a different program, Achim Flammenkamp's **prime_sieve**,
found at his web page:

"The Sieve of Eratosthenes"

http://wwwhomes.uni-bielefeld.de/achim/prime_sieve.html

[Modifications to his program: *(i)* define LONG as "unsigned long",
*(ii)* undefine _ANSI_EXTENSION, *(iii)* uncomment the
commented-out code in the definition of "use(x)", and *(iv)*
hard-code the initial sieve size.]

Determine and verify the largest 1,000 prime numbers representable in unsigned, 64-bit integers against a pre-computed (by a different program) list of those primes:

% primal -time -verify last64 -qprime

*stp*STP 3 90,842 seconds (25.2 hours) STP 5 55,797 seconds (15.5 hours) STP 7 43,093 seconds (12.0 hours) STP 11 38,338 seconds (10.6 hours) STP 13 35,310 seconds ( 9.8 hours) STP 17 33,234 seconds ( 9.2 hours) STP 19 31,667 seconds ( 8.8 hours) STP 23 30,238 seconds ( 8.4 hours) STP 29 29,430 seconds ( 8.2 hours)

The most time-consuming aspect of my algorithm is generating the sequence
of differences for the higer skip-through primes. My computer was able
to generate a physical, 1-billion-byte sequence of differences for STP 29,
but `malloc(3)`

failed trying to allocate the 31-billion-byte
sequence for STP 31. The `nbrPrimeSeqDiffsMax()`

function
generates the same sequences of differences, but without requiring any
memory for the sequence: each difference is generated, compared to the
running maximum value, and then discarded. So I was able to time sequence
generation for STPs 31 and 37; the last took 6.4 days and I didn't have the
patience to wait 280+ days for the generation of STP 41's 44-trillion-element
sequence to complete.

Generate sequences of differences:

% primal -time -qmax

*stp*STP 17 0.04 seconds 92 thousand elements STP 19 0.4 seconds 1.7 million elements STP 23 11 seconds 36 million elements STP 29 407 seconds (6.8 minutes) 1 billion elements STP 31 12,602 seconds (3.5 hours) 31 billion elements STP 37 550,670 seconds (6.4 days) 1 trillion elements

When the code was pretty much in its final form, bar tweaking of the
limit checking, I researched ways to speed up generating sequences
of differences (e.g., originally 8.5 days for skip-through prime 37's
one trillion differences, a 10^{12}-length sequence). This
research was unsuccessful, but I came upon many queries on programming
forums about determining if a number is prime and many responses about
an optimized method called "6k +/- 1". I then searched for information
about 6k +/- 1 and landed on the Wikipedia page about primality testing
using "trial division", i.e., looping through all possible divisors
to find a divisor other than 1 for the number. From Wikipedia
(Wikipedia: Primality test - Simple methods):

The algorithm can be improved further by observing that all primes greater than 6 are of the form 6

k± 1. This is because all integers can be expressed as (6k+i) for some integerkand fori= -1, 0, 1, 2, 3, or 4; 2 divides (6k+ 0), (6k+ 2), (6k+ 4); and 3 divides (6k+ 3). So, a more efficient method is to test ifnis divisible by 2 or 3, then to check through all the numbers of the form 6k± 1 ≤ √n. This is 3 times as fast as testing allm.Generalising further, it can be seen that all primes greater than

c# are of the formc#k+ifori<c# wherecandkare integers,c# represents c primorial, andirepresents the numbers that are coprime toc#. For example, letc= 6. Thenc# = 2 ⋅ 3 ⋅ 5 = 30. All integers are of the form 30k+ifori= 0, 1, 2, …, 29 andkan integer. However, 2 divides 0, 2, 4, …, 28 and 3 divides 0, 3, 6, …, 27 and 5 divides 0, 5, 10, …, 25. So all prime numbers greater than 30 are of the form 30k+ifori= 1, 7, 11, 13, 17, 19, 23, 29 (i.e. fori< 30 such that gcd(i,30) = 1). Note that ifiand 30 were not coprime, then 30k+iwould be divisible by a prime divisor of 30, namely 2, 3 or 5, and would therefore not be prime.[Last sentence: ... for ex:- 437 ... I learn something new every day!]

So all primes are of the form *c*#*k*+*i* for i meeting the conditions above.
(Note that the reverse is not true; numbers of the form *c*#*k*+*i* are not
necessarily prime.) Relating this to my code:

*c* is my skip-through prime. (Strictly speaking, *c* can be
any number in the range STP ... *next-prime* - 1; in other
words, STP is the largest prime less than or equal to *c*.)

*c*# is "c primorial", where primorial is simply the product of
all the primes less than or equal to *c*; i.e., 2x3x5x7x...
This value is identical to my "range" of a sequence of
differences. (Wikipedia: Primorial)

*k* indexes the blocks of range *c*# of divisors to be checked.
Incrementing *k* is like reaching the end of a sequence of
differences in my code and looping back to start at the
beginning of the sequence for the next higher block of
divisors.

*i* is drawn from the set of numbers in the effective
range 1..*c*#-1 such that each *i* is coprime with
*c*#. Two numbers are *coprime* with respect to
each other if they share no divisors other than 1. Note that
"coprime" is a relation between numbers, not an attribute
(like "prime") of a single number.
(Wikipedia:
Coprime integers)

So the *c*#*k*+*i* optimization is the foundation for my code, unbeknownst to
me when I wrote the code! The difference between the *c*#*k*+*i* method and
my code is that the former method works with explicit *i* offsets from the
base of a *c*#*k* block and my code works with the sequence of differences
between successive *i* offsets.

Now let's consider the set of numbers coprime with *c*# from which *i* is
drawn. The numbers are in the range 0..*c*#-1. Zero (0) drops out because
the first multiplier in *c*# is 2, so *c*#*k*+0 is always even (divisible by 2).
The very first number in the set is always 1 since the only divisor it
shares with *c*# is 1.

The second number in the set is always the next prime after *c*: the numbers
between 1 and the next prime, exclusive, are either equal to or evenly
divisible by one of the primes less than or equal to *c* (that's why the
next prime is prime!), so these numbers drop through the "sieve". NOTE
that the next prime, the second number in the set, is the start divisor
in my algorithm. My sequence of differences begins with the difference
between the second and third elements of the set and ends with the
difference between the first and second elements.

The remaining numbers in the set are determined by checking the divisors
of the numbers after the next prime. If a number is evenly divisible by
any of the primes less than or equal to *c*, the number drops through the
"sieve" and out of the set. Since the number of elements in a set rises
rapidly with high *c*'s, this is a time-consuming process. This process can
itself be optimized by using the *c*#*k*+*i* method. My comparable generation
of a sequence of differences uses a hard-coded skip-through prime of 3;
as I mentioned above, it didn't seem worth the effort to generalize that
bit of code for higher STPs.

Here are some example skip-through primes in their *c*#*k*+*i*
form. (In the first example, 1 is not prime, but the algorithm will work with
the given parameters to check every possible divisor, odd *and* even.)

c= 1c# = 1iin { 1 }c= 2c# = 2iin { 1 }c= 3c# = 6iin { 1, 5 }c= 5c# = 30iin { 1, 7, 11, 13, 17, 19, 23, 29 }c= 7c# = 210 48 elements in setiin { 1, 11, 13, 17, 19, 23, 29, 31, ..., 199, 209 }c= 11c# = 2310 480 elements in setiin { 1, 13, 17, 19, 23, 29, 31, 37, ..., 2297, 2309 } ...c= 23c# = 223,092,870 36,495,360 elements in setiin { 1, 29, 31, 37, 41, ..., 223092841, 223092869 }c= 29c# = 6,469,693,230 1,021,870,080 elements in setiin { 1, 31, 37, 41, 43, ..., 6469693199, 6469693229 }

Note that the algorithm has to ignore the very first potential divisor:
when *k* is 0 and *i* is 1, *c*#*k*+*i* = 1 and any number being divisible by 1
does not mean the number is composite. Effectively, the first divisor
checked is the next prime after *c*, my start divisor. (Of course, this
is after the initial check if the target number is equal to or divisible
by the primes less than or equal to *c*).

I just noticed that the difference between the first two elements in a set
equals the difference between the last two elements, at least up to *c*=29.
I don't know if there's a mathematical reason for this or if it holds
true for all possible sets.

You might be wondering how the "6*k* +/- 1" form widely seen on
the internet relates to the *c*#*k*+*i* scheme of
6*k*+*i* for *i* in { 1, 5 }. Swap the arithmetic
operations and "6*k* -/+ 1" becomes shorthand for the +5 term of
the previous *k* block and the +1 term of the current *k* block:

c= 3c# = 2×3 = 6iin { 1, 5 } Previouskblock: 6(k-1)+5 = 6k-6+5 = 6k-1 Currentkblock: 6k+1

A graphical representation, I guess you would call it, of the
*c*#*k*+*i* optimization is called wheel factorization
(Wikipedia:
Wheel factorization). Like my algorithm, the algorithm presented
on the Wikipedia page *(i)* explicitly begins checking divisors
with the next prime after *c* and *(ii)* uses the
differences between divisors rather than the *i* offsets.