Schönhage's algorithm for fast integer GCD

Niels Möller nisse@lysator.liu.se
13 Jun 2003 22:12:37 +0200


I've been looking at how to get a fast gcd algorithm. The best known
one seems to be Schönhage's algorithm, first described in an article
in Acta Informatica 1971, in German.

Later descriptions I've seen or seen references to include Aho,
Hopcroft and Ullman's algorithm book, a paper "Fast computation of
GCDs" by Moenck, and a paper by djb. All of these focus on the
polynomial case; the integer case has some additional subtleties.

I've also read a paper "Parallel implementation of Schönhage's integer
GCD algorithm" by Giovanni Cesari, which includes some pseudocode of
the actual algorithm for integers.

The code I've written seems to work (see testsuite.c). Tasks that
still need to be done to make it a good and practical algorithm for
GMP include

 * A fast base case, probably a variant of the hgcd function that uses
   Lehmer's algorithm, including some or all of the speedups from Tudor
   Jebelean's paper "Improving the multiprecision Euclidean
   algorithm" (a naive single-word placeholder is sketched below, after
   this list).

 * More analysis of the sizes of the temporary results, to guarantee
   that the algorithm terminates, and to get the right allocation for
   the next task.

 * Converting the code to use the mpn interfaces.
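
To give an idea of what such a base case might look like, here's a
naive single-word sketch: plain extended Euclid on 64-bit values,
stopping when the smaller remainder fits in half a word. It has none
of the Lehmer/Jebelean tricks, it ignores the more careful termination
conditions a real hgcd base case needs, and the names (hgcd1 and so
on) are made up just for this sketch.

#include <stdint.h>

struct hgcd1
{
  /* The two last remainders, with r = u a + v b and r2 = u2 a + v2 b.
     When the loop below has run at least once, r >= 2^32 > r2. */
  uint64_t r, r2;
  int64_t u, v, u2, v2;
};

/* Naive single-word "half gcd" on a >= b: run the extended Euclidean
   algorithm and stop as soon as the smaller remainder drops below
   2^32, i.e. half the word size.  The cofactors are then bounded by
   max(a,b) / 2^32, so they fit comfortably in int64_t.  */
struct hgcd1
hgcd1 (uint64_t a, uint64_t b)
{
  struct hgcd1 s = { a, b, 1, 0, 0, 1 };

  while (s.r2 >= ((uint64_t) 1 << 32))
    {
      uint64_t q = s.r / s.r2;
      uint64_t r = s.r - q * s.r2;
      int64_t u = s.u - (int64_t) q * s.u2;
      int64_t v = s.v - (int64_t) q * s.v2;

      s.r = s.r2; s.r2 = r;
      s.u = s.u2; s.u2 = u;
      s.v = s.v2; s.v2 = v;
    }
  return s;
}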

I haven't done any benchmarks yet (that's mostly pointless without a
good base case), but my gut feeling is that the algorithm can be
competitive for moderately large numbers, say a few hundred limbs.

The code is available at
http://www.lysator.liu.se/~nisse/misc/gcd-0.1.tar.gz.

Below, I include the introductory comment from the source code, which
I hope explains the main idea that makes the algorithm work.

Happy hacking,
/Niels

/* Introduction
 *
 * Consider this gcd algorithm, with input two numbers a and b, both
 * of size 2n bits. Split them into halves,
 *
 *  a = a1 2^n + a2, b = b1 2^n + b2
 *
 * Compute, recursively, r1 = gcd(a1, b1), and two numbers u1 and v1
 * such that
 *
 *   r1 = u1 a1 + v1 b1                                        (*)
 *
 * Form the corresponding linear combination of the original numbers,
 *
 *   r  = u1  a + v1  b
 *
 * No matter where u1 and v1 came from, r is always a multiple of
 * gcd(a,b). We get
 *
 *   r  = u1 (a1 2^n + a2) + v1 (b1 2^n + b2)
 *      = r1 2^n + u1 a2 + v1 b2
 * 
 * The nice thing about this expression is that the first term is
 * considerably smaller than our original numbers. So we could try
 * finding the gcd (or a smaller multiple thereof) by recursively
 * computing
 *
 *   gcd(r, b mod r)
 *
 * This gives us a sub-quadratic algorithm for gcd computation.
 * However, this doesn't quite work, and the reason is that r isn't
 * small. The terms u1 a2 and v1 b2 are about the same size as our
 * original numbers, so we don't get anywhere.
 *
 * To get a working algorithm, we need to choose r1, u1 and v1
 * differently. We still want equation (*) to hold, but with numbers
 * r1, u1 and v1 of about the same size, n/2 bits.
 *
 * Such numbers exist: Consider the extended Euclidean algorithm
 * applied to a1 and b1, and take a remainder r1 halfway through that
 * computation, together with the corresponding u1 and v1.
 *
 * We then get an
 *
 *   r  = r1 2^n + u1 a2 + v1 b2
 *
 * of size 3n/2, 3/4 of the size of the numbers we started with. And
 * that's enough to get an efficient algorithm.
 *
 * So the heart of Schönhage's algorithm is the function hgcd ("half
 * gcd"), which takes two numbers a and b as input, of size 2n bits
 * each, and recursively computes four numbers u, v, u' and v',
 * corresponding to two numbers
 *
 *   r  = u a + v b
 *   r' = u'a + v'b
 *
 * of size close to n. These correspond to remainders close to the
 * middle of the Euclidean remainder sequence for a and b.
 */
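
To convince myself of the identity above, I find it useful to check it
on random numbers. The little program below is not part of gcd-0.1,
just a throwaway check written against GMP's mpz layer: it splits a
and b, takes r1, u1 and v1 from mpz_gcdext applied to the high halves,
and verifies both that r = u1 a + v1 b equals r1 2^n + u1 a2 + v1 b2
and that r is divisible by gcd(a,b). Since mpz_gcdext runs the
Euclidean algorithm all the way to the end instead of stopping
halfway, u1 and v1 come out around n bits and r doesn't shrink, which
is exactly the problem described in the comment.

#include <stdio.h>
#include <gmp.h>

int
main (void)
{
  mpz_t a, b, a1, a2, b1, b2, r1, u1, v1, r, t, g;
  gmp_randstate_t rs;
  unsigned long n = 512;        /* a and b get about 2n = 1024 bits */

  mpz_inits (a, b, a1, a2, b1, b2, r1, u1, v1, r, t, g, NULL);
  gmp_randinit_default (rs);

  mpz_urandomb (a, rs, 2*n);
  mpz_urandomb (b, rs, 2*n);

  /* Split: a = a1 2^n + a2, b = b1 2^n + b2 */
  mpz_tdiv_q_2exp (a1, a, n);
  mpz_tdiv_r_2exp (a2, a, n);
  mpz_tdiv_q_2exp (b1, b, n);
  mpz_tdiv_r_2exp (b2, b, n);

  /* r1 = u1 a1 + v1 b1, using the full extended gcd rather than a
     halfway remainder, so no size reduction is expected. */
  mpz_gcdext (r1, u1, v1, a1, b1);

  /* r = u1 a + v1 b */
  mpz_mul (r, u1, a);
  mpz_addmul (r, v1, b);

  /* The same value computed as r1 2^n + u1 a2 + v1 b2 */
  mpz_mul_2exp (t, r1, n);
  mpz_addmul (t, u1, a2);
  mpz_addmul (t, v1, b2);

  mpz_gcd (g, a, b);

  printf ("r == r1 2^n + u1 a2 + v1 b2: %s\n",
          mpz_cmp (r, t) == 0 ? "yes" : "no");
  printf ("gcd(a,b) divides r:          %s\n",
          mpz_divisible_p (r, g) ? "yes" : "no");
  printf ("bits: a %zu, u1 %zu, r %zu\n",
          mpz_sizeinbase (a, 2), mpz_sizeinbase (u1, 2),
          mpz_sizeinbase (r, 2));

  mpz_clears (a, b, a1, a2, b1, b2, r1, u1, v1, r, t, g, NULL);
  gmp_randclear (rs);
  return 0;
}

Build with something like gcc check.c -lgmp (it needs a reasonably
recent GMP for mpz_inits/mpz_clears). The printed sizes come out with
u1 around n bits and r around 2n bits, i.e. no progress, as the
comment says; the point of hgcd is to replace mpz_gcdext here with
cofactors of only about n/2 bits.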