Update: Schönhage 's algorithm for fast integer GCD
Torbjorn Granlund
tege at swox.com
Tue Sep 16 23:05:54 CEST 2003
Ok, now I have rewritten the code yet one more time, and I've also
done some more analysis of the sizes to expect and the number of
recursive calls. The current code is still memory hungry (linear in
the size of the input, with a fairly large constant), but I think that
it can be cut down to reasonable sizes. The new code is a little
faster than the previous version, even if it hasn't undergone careful
tuning yet (the basecase thresholds are at 20 limbs, compared to 128
when benchmarking the previous version).
You have your own basecase code, I presume? Could the existing
mpn_gcdext code be used as basecase code too?
Timings, relative to gmp-4.1.2 mpz_gcd, on sparc, are included below.
I think a venerable sparc v7 is a poor timing platform for GMP.
Please consider using a platform with integer multily support,
and more modern pipeline characteristics. Anything non-sparc
less than 10 years old will be better. :-/
The new code it wins over gmp-4.1.2 at around 1500 limbs on fibonacci
numbers, and at around 500 limbs for "random" numbers. For comparison,
what are typical thresholds for FFT multiplication?
Between 1000 and 1500 for sparcv7 and sparcv8 chips.
Most of the time is spent in computations on two-limb numbers (that's
div2 and nhgcd2) and book-keeping of one-limb quotients
Is that the same div2 as of the existing mpn_gcdext? If not,
what does it compute?
--
Torbjörn
"Om du kallar mig rättshaverist en gång till stämmer dig för förtal"
More information about the gmp-devel
mailing list