|
|
|
NOTE: The integer factorization algorithms tested with this program are described in greater detail in "Integer/Prime Factorization".
frize determines the prime factors of a number, a process called "prime factorization", no less.
frize was not my first choice for this program's name, but I couldn't think of a second choice. It is pronounced "Fries", as in "Freedom Fries", a term that replaced "French Fries" in the U.S. House of Representatives' cafeterias in 2003. GOP congressman Bob Ney pushed this change; as far as I know, he expressed no opinion on "French Kissing". It is rumoured that, when he heard someone call a French person a "cheese-eating surrender monkey" (thank you, Groundskeeper Willie!), Ney said, "Here, hold my beer." GOP politicians are very competitive and will say things like, "I'm more of a surrender monkey than a Frenchmen any day of the week and twice on Sunday", and "I'm the limbo champion when it comes to low IQs." The fries became French again — Hercule Poirot: "I beg to differ!" — less than 3 years later when Ney had to resign prior to being tried, convicted, and sentenced to prison for corruption.
For example, the prime factors of 126 are:
2 × 3 × 3 × 7
A given prime factor can appear more than once in a number's list of factors. Composite numbers have more than one factor. Prime numbers have a single factor — themselves.
I got interested in prime factorization after accidentally coming across Wikipedia's "Trial Division" page and its generic C++ function for generating factors. There was a single phrase, "without squaring f" (the factor), in the introductory sentence and a single incorrect comment in the code. There was no explanation of the algorithm, no explanation of the template parameters and, even worse, two of the most important variables had meaningless names. My initial interest in the C++ function was if it suffered from possible overflows (which had been of concern in my prime number functions), but I had to figure out what the code was doing before I started investigating overflows.
Once I understood the C++ function's algorithm, I began to develop optimizations and also a way to avoid overflows (which are possible). frize tests and measures the performance of the following prime factorization functions:
"diff" - is a readable, commented C version of Wikipedia's C++ function, which also includes checks for overflow. Both of them are distinguished by the keeping of running values of (i) squares and (ii) the differences between squares. The square of the next odd factor to be tested is obtained by simply adding the difference from the last square. Wikipedia's "without squaring f" comment presumably means that the C++ function doesn't multiply the factor by itself, "f*f". However, the function still computes the square of the factor, albeit without a direct multiplication, thus resulting in the possibility of overflow.
"square" - is a slimmed-down version of "diff" that computes the differences between squares on the fly. The difference between the squares of one odd factor, f, and the next, f+2, is always 4f+4, or (f+1)×4. The updating of the differences by an addition operation in "diff" is replaced by a multiplication and an addition. (In other words, the difference variable is eliminated and the new square is computed directly: "sq=sq+(f+1)×4".) An optimizing compiler will convert the multiplication by 4 into a 2-bit left shift (as GCC does).
"root" - takes a completely different approach than "diff" and "square". In simplified terms, the latter two functions limit the number of factors tested by comparing the square of the next factor to the number being factored. As a result, these functions may iterate on factors past the square root of the number being factored. For very large numbers, doing so can result in the square overflowing. The "root" function eliminates the possibility of overflow by computing the fast integer square root of the number to be factored and using that root as the limit for factors. The square of a factor is not needed and is not computed.
After coding and testing the functions above, I tried applying the common 6k±1 prime number optimization to these functions; i.e., don't test odd factors divisible by 3. The functions only test 2 out of every 3 odd factors, which reduces the time to factor a number by a third.
"square3" - the "square" method with the 6k±1 optimization.
"root3" - the "root" method with the 6k±1 optimization.
The "root3" algorithm combines speed with no overflows, so
it became the nbrFactorsOf()
function in my
NBR_UTIL package.
% frize [-debug] [-help] [-after count] [-early count] [-mask width] [-overflows] [-progress divisor] [-repeat count] [-time] number[,diff|root|root3|square|square3]
where
-debug
- turns debug on.
-help
- prints out a help message.
-after count
- triggers something happening after the specified number of factors have been tested. The program must be modified to compare this count to the number of tests and to do something when the count is reached. Currently, it is only referenced and used in the "diff" algorithm. To factor a very large 64-bit prime, it is necessary to test over 2 billion potential factors. I needed to see the values of the "diff" function's internal variables for the last few tests before the number is found to be prime, so, when the after count is encountered, the "diff" function enables debug (
nbr_util_debug
is set to 1).-early count
- instructs the algorithms to test the first count prime numbers as possible factors before falling back on the laborious testing of all odd factors or all odd factors not divisible by 3. frize has a built-in table of the first 6,542 primes. By default, all of these primes (or as many as necessary) are tested. Using a lower count can be useful in performance testing. In particular, the
nbrFactorsOf()
function in the LIBGPL library has a smaller table of the first 512 prime numbers. To compare the performance of that function to the performance of the algorithms in frize, "-early 512
" should be specified on the command line.-mask width
- specifies the bit width of a mask to be applied in the algorithms that are subject to overflow ("diff", "square", and "square3"), thus providing the capability to cause artificial overflows. The calculations in the factoring functions are made using the factor_t type, defined in "
nbr_util.h
". The width of the mask must be less than the bit width of the positive portion of the factor_t type. The mask zeroes out all but the specified number of least significant bits in the results of calculations. For example, if factor_t is 64 bits wide, unsigned 32-bit integers can be simulated with a mask width of 32. Also see the "-overflows
" option below.-overflows
- allows overflows to occur in the algorithms that are subject to overflow ("diff", "square", and "square3"). By default, when one of these functions detects an overflow, an error message is displayed and the function returns immediately. With the "
-overflows
" option, a debug message is logged (if "-debug
" is on) and the function continues processing with the overflowed value. Note that continuing with the bad values likely will cause the function to test orders of magnitude more factors, so be prepared to wait. Also see the "-mask width
" option to see how to cause overflows with smaller values.-progress divisor
- instructs the program to print progress messages out to stderr as a number is being factored. The factoring functions keep count of the number of tests of possible factors. When the number of tests is evenly divisible by divisor, a progress message is output. For example, when factoring the largest 32-bit prime with the "diff" algorithm, with a mask of 32 bits, and with overflows allowed, I had the function output progress messages every 100 million tests. (Because of the overflows, the function tests over 2 billion factors.) These messages are only generated if an algorithm is specified on the command line after the number to be factored; if not, the program calls the LIBGPL function
nbrGetFactors()
which doesn't support this capability. (The command-line algorithms are implemented in this source file; if you need progress messages for the LIBGPL function, use its equivalent algorithm, "root3".)-repeat count
- repeats the factorization the specified number of times. This is useful when a single factorization of the input number is too quick to provide a meaningful time. Repeating the factorization thousands or tens of thousands of times allow more realistic timing comparisons between algorithms, for example.
-time
- measures and displays the time it takes to perform factorization (or count iterations of the computation if "-repeat count" is also given on the command line). Almost all output is suppressed to keep I/O from affecting the timing.
- number[,
diff
|root
|root3
|square
|square3
]- generates the prime factors of the number. The optional argument after a comma (no spaces!) specifies an alternate, prime factorization function (all of which are compiled into this program):
I used this capability for benchmarking the different algorithms. You can abbreviate the names to any length, so I usually just specify one letter; e.g., "frize 126,s" will generate the prime factors of 126 using the "square" function. The skip-3 functions, "square3" and "root3", can also be abbreviated to "s3" and "r3", respectively. If the optional argument is not present, the default function is LIBGPL's "root3" function,
- "diff" - uses an optimized version (I think!) of the C++
TrialDivision
function on Wikipedia. The function keeps track of the factor, the square of the factor, and the differences between the squares of consecutive odd factors.- "square" - is a slimmed-down version of the "diff" function that does not keep track of the differences between squares. The square of the next odd factor following the old factor, f, is f²+4f+4. This function simply adds 4f+4 to the current square to get the next square; it is not necessary to keep a running value for the difference. And an optimizing compiler can replace 4f with f<<2.
- "square3" (or "s3") - is the "square" method modified to skip odd factors divisible by 3. Testing only 2 out of every 3 odd factors reduces the time to factor a number by a third.
- "root" - uses a fast integer square root function to get the upper bound for the factor, thus eliminating the possibility of overflows (and also the need to keep track of the square of the factor).
- "root3" (or "r3") - is the "root" method modified to skip odd factors divisible by 3. Testing only 2 out of every 3 odd factors reduces the time to factor a number by a third.
nbrFactorsOf()
.