fast gcd
Paul Zimmermann
Paul.Zimmermann at loria.fr
Tue Oct 13 16:32:41 CEST 2009
Hi,
since everybody on this list speaks currently only about multiplication,
let's speak about gcd.
Among the feedback we got on our book [1], someone asked us to provide a proof
of the HalfGcd algorithm. We thus will give such a proof in the next version,
for the binary/2-adic/LSB/right-to-left variant.
I also wrote an implementation of that algorithm in GMP, which is available
from the "Software" section of [1]. Here are some timings on a 2.83Ghz Core 2
with GMP 4.3.1 (N is the number of 64-bit limbs of the two random inputs):
N mpz_gcd mpz_bgcd
100000 2544ms 3123ms
200000 6152ms 7170ms
500000 19367ms 21296ms
1000000 45690ms 48480ms
2000000 106876ms 112167ms
5000000 323279ms 333767ms
10000000 738496ms 725144ms
The differences are as follows:
(a) mpz_gcd uses the classical MSB division, while mpz_bgcd uses the 2-adic
division, which leads to a size increase of about 2.5%, as analyzed by
Daireaux, Maume-Deschamps and Vallée in [2].
(b) mpz_bgcd (or more exactly the inner hgcd routine) only computes the 2x2
matrix, not the corresponding terms of the remainder sequence, which are
recomputed if needed from that matrix
(c) for the matrix-vector multiplication in (b), mpz_bgcd uses the wrap-around
trick together with FFT mod 2^N+1
Since (b) and (c) can be done in the classical case too, I believe the gcd code
in GMP can still be improved.
Paul Zimmermann
[1] Modern Computer Arithmetic, Richard Brent and Paul Zimmermann,
version 0.3, June 2009, http://www.loria.fr/~zimmerma/mca/mca-0.3.pdf.
[2] users.info.unicaen.fr/~brigitte/Publications/daireaux-maume-vallee-long.pdf
More information about the gmp-devel
mailing list