mpn_sqrtrem1
Torbjörn Granlund
tg at gmplib.org
Tue Dec 20 15:02:10 UTC 2016
nisse at lysator.liu.se (Niels Möller) writes:
Right. Karatsuba for small numbers adds quite some overhead dealing
with either sign (if using the a1-a0 variant) or carry (if using the
a1+a0 variant). In this case, when we split into half-limbs, one could
use a full limb to represent the sum, but that won't help, since the
multiplication (a1+a0)*(b1+b0) can still overflow a single limb.
So it will pay off only on architectures where multiplication is much
much slower than other arithmetic operations.
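To make the carry overhead concrete, here is a minimal sketch in C of the
a1+a0 variant, assuming a 64-bit limb split into 32-bit half-limbs. The
name umul_ppmm_kara is made up for illustration; it is not GMP's umul_ppmm
macro, and nothing like it is claimed to exist in GMP. It uses three
half-limb multiplications, but note how much of the code only handles the
33-bit sums and the resulting carries and borrows.

```c
#include <stdint.h>

/* Hypothetical helper (not part of GMP): 64x64 -> 128 bit product using
   three 32x32 -> 64 multiplications, Karatsuba a1+a0 variant. */
static void
umul_ppmm_kara (uint64_t *hi, uint64_t *lo, uint64_t u, uint64_t v)
{
  uint64_t u0 = u & 0xffffffffu, u1 = u >> 32;
  uint64_t v0 = v & 0xffffffffu, v1 = v >> 32;

  uint64_t p00 = u0 * v0;               /* low half-limb product */
  uint64_t p11 = u1 * v1;               /* high half-limb product */

  /* The half-limb sums need 33 bits: split off the carry bit. */
  uint64_t su = u0 + u1, sv = v0 + v1;
  uint64_t cu = su >> 32, ru = su & 0xffffffffu;
  uint64_t cv = sv >> 32, rv = sv & 0xffffffffu;

  /* su*sv can need 66 bits; keep it as (m_hi, m_lo), m_hi = bits >= 64. */
  uint64_t m_lo = ru * rv;
  uint64_t m_hi = cu & cv;                      /* cu*cv at weight 2^64 */
  uint64_t t = (cu ? rv : 0) + (cv ? ru : 0);   /* weight 2^32, 33 bits */
  uint64_t add = t << 32;                       /* bit 32 of t drops out */
  m_hi += t >> 32;                              /* ... and is added here */
  m_lo += add;
  m_hi += (m_lo < add);                         /* carry out of m_lo */

  /* Middle term u1*v0 + u0*v1 = su*sv - p00 - p11, with borrows. */
  m_hi -= (m_lo < p00);  m_lo -= p00;
  m_hi -= (m_lo < p11);  m_lo -= p11;

  /* Result = p11*2^64 + mid*2^32 + p00. */
  uint64_t mid_shifted = m_lo << 32;
  *lo = p00 + mid_shifted;
  *hi = p11 + (m_lo >> 32) + (m_hi << 32) + (*lo < p00);
}
```

A plain schoolbook version needs four multiplications instead of three but
almost none of the carry bookkeeping above, which is the trade-off being
discussed.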
The set of machines where Karatsuba for umul_ppmm would give any
benefits is almost empty. Almost all machines have some hardware
support for producing the full product.
The only CPUs I can think of where one might see any speedup with
Karatsuba are older Sun/Oracle designed SPARCs (UltraSPARC 1-4, T1-T2).
These computers are not intended for technical buyers, and we should not
let them affect our decisions.
--
Torbjörn
Please encrypt, key id 0xC8601622