Schönhage's algorithm for fast integer GCD
Niels Möller
nisse@lysator.liu.se
16 Jun 2003 18:23:54 +0200
nisse@lysator.liu.se (Niels Möller) writes:
> The code I've written seems to work (see testsuite.c). Taskss that
> still need to be done to make it a good and practical algorithm for
> GMP include
>
> * A fast base case, probably a variant of the hgcd function that uses
> Lehmers algorithm, including some or all of the speedups from Tudor
> Jabelean's paper "Improving the multiprecision euclidean
> algorithm".
>
> * More analysis of the sizes of the temporary results, to guarantee
> that the algorithm terminates, and to get the right allocation for
> the next task.
>
> * Converting the code to use the mpn interfaces.
I've now tried to address the second task, and generally improved the
code. I also tried to use a result from Jebelean's paper, which
*almost* fits nicely with the algorithm:
Let k = 2n, A = a 2^k + A', B = b 2^k + B'
Define quotient sequences:
A_0 = A, A_1 = B, Q_{i-1} = floor(A_i/A_{i-1})
a_0 = a, a_1 = b, q_{i-1} = floor(a_i/a_{i-1})
Then, if
a_{i+2} >= 2^n for i = 1, 2, ..., k,
then
Q_i = q_i, for i <= k,
(U_i, V_i) = (u_i,v_i), for i <= k+1
and we can compute
A_k = u_k A + v_k B
A_{k+1} = u_{k+1} A + v_{k+1} B
This is true for all values of n and all A', B' < 2^k.
The problem is that when we have computed A_k and A_{k+1} by recursing
over a and b, we need to ensure that also A_{k+2} > 2^m before
returning, and that is not always the case. Then we would need to back
up and return A_{k-1}, A_k instead, and that seems hard. I've tried
bend the equations around, returning earlier A_k, to no avail.
Instead, the current code doesn't care if some q_i and Q_i differ, as
long as it gets numbers of the appropriate sizes it goes on. The end
result is a linear combination
g' = u a + v b
which obviously is a multiple of gcd(a,b). So I check if it divides a
and b, and if not I have have to use a more conservative algorithm
like lehmer's to get rid of spurious factors. I think g' / gcd(a,b) is
limited, in fact with the current code I haven't yet observed a g'
that is different from the true gcd.
Updated version at
http://www.lysator.liu.se/~nisse/misc/gcd-0.1.tar.gz.
I think it should be fairly straight forward to convert it to use mpn,
I think the analysis is the same for all bases, including both 2 and
2^BITS_PER_LIMB.
Regards,
/Niels