mpn_sqrtrem1

Torbjörn Granlund tg at gmplib.org
Tue Dec 20 15:02:10 UTC 2016


nisse at lysator.liu.se (Niels Möller) writes:

  Right. Karatsuba for small numbers adds quite some overhead dealing
  with either sign (if using the a1-a0 variant) or carry (if using the
  a1+a0 variant). In this case, when we split into half-limbs, one could
  use a full limb to represent the sum, but that won't help since the
  multiplication (a1+a0)*(b1+b0) can still overflow a single limb.
  
  So it will pay off only on architectures where multiplication is much
  much slower than other arithmetic operations.
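
To make the overhead concrete, here is a rough sketch in C of a
64x64->128 bit product built from 32-bit half-limbs, once with four
multiplications (schoolbook) and once with three (Karatsuba, a1+a0
variant). It assumes a 64-bit limb; the names u128, mul_schoolbook and
mul_karatsuba are made up for this illustration and are not GMP code.
Note how much carry and borrow handling the Karatsuba version needs to
save a single multiplication.

```c
#include <stdint.h>

/* Hypothetical two-limb result type, for illustration only. */
typedef struct { uint64_t hi, lo; } u128;

/* Schoolbook: four half-limb multiplications. */
static u128 mul_schoolbook(uint64_t a, uint64_t b)
{
  uint64_t a0 = (uint32_t) a, a1 = a >> 32;
  uint64_t b0 = (uint32_t) b, b1 = b >> 32;
  uint64_t p00 = a0 * b0, p01 = a0 * b1, p10 = a1 * b0, p11 = a1 * b1;
  u128 r;

  uint64_t mid = p10 + (p00 >> 32);   /* cannot overflow */
  mid += p01;                         /* may overflow */
  if (mid < p01)
    p11 += (uint64_t) 1 << 32;        /* propagate that carry */

  r.hi = p11 + (mid >> 32);
  r.lo = (mid << 32) | (uint32_t) p00;
  return r;
}

/* Karatsuba, a1+a0 variant: three half-limb multiplications. */
static u128 mul_karatsuba(uint64_t a, uint64_t b)
{
  uint64_t a0 = (uint32_t) a, a1 = a >> 32;
  uint64_t b0 = (uint32_t) b, b1 = b >> 32;
  uint64_t z0 = a0 * b0, z2 = a1 * b1;

  /* The sums need 33 bits; split off the top bit of each. */
  uint64_t sa = a0 + a1, sb = b0 + b1;
  uint64_t ca = sa >> 32, cb = sb >> 32;
  uint64_t sal = (uint32_t) sa, sbl = (uint32_t) sb;

  /* (m_hi, m_lo) = sa * sb, which can need up to 66 bits. */
  uint64_t m_lo = sal * sbl;
  uint64_t m_hi = ca & cb;
  uint64_t t = (ca ? sbl : 0) + (cb ? sal : 0);   /* cross terms, < 2^33 */
  uint64_t tl = t << 32;
  m_hi += t >> 32;
  m_lo += tl;
  m_hi += (m_lo < tl);

  /* Subtract z0 and z2, leaving a1*b0 + a0*b1 (up to 65 bits). */
  m_hi -= (m_lo < z0);  m_lo -= z0;
  m_hi -= (m_lo < z2);  m_lo -= z2;

  /* Combine: z2 * 2^64 + middle * 2^32 + z0. */
  u128 r;
  r.lo = z0 + (m_lo << 32);
  r.hi = z2 + (m_lo >> 32) + (m_hi << 32) + (r.lo < z0);
  return r;
}
```

Both functions should return the same result for all inputs; the
Karatsuba one trades one multiplication for roughly a dozen extra
additions, shifts and comparisons.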
  
The set of machines where Karatsuba for umul_ppmm would give any
benefits is almost empty.  Almost all machines have some hardware
support for producing the full product.

The only CPUs I can think of where one might see any speedup with
Karatsuba are older Sun/Oracle designed SPARCs (UltraSPARC 1-4, T1-T2).
These computers are not intended for technical buyers, and we should not
let them affect our decisions.

-- 
Torbjörn
Please encrypt, key id 0xC8601622

