mpn_sqrtrem1

Niels Möller nisse at lysator.liu.se
Tue Dec 20 10:54:21 UTC 2016


"Marco Bodrato" <bodrato at mail.dm.unipi.it> writes:

> Il Lun, 19 Dicembre 2016 6:21 pm, Adrien Prost-Boucle ha scritto:
>> I noted that GMP fallback function umul_ppmm(), in longlong.h in GMP code,
>> uses 4 multiplications where the Karatsuba method would only requires 3,
>> I was wondering whether optimization was possible...
>
> Reducing the number of multiplications is possible... but I bet a
> Karatsuba umul_ppmm() is not faster than the plain version (at least not
> on current 64-bits CPUs ;-)

Right. Karatsuba for small numbers is adds quite some overhead dealing
with either sign (if using the a1-a0 variant) or carry (if using the
a1+a0 variant. In this case when we split in half-limbs, one could use a
full limb to represent the sum, but that won't help since the
multiplication (a1+a0)*(b1+b0) can still overflow a single limb).

So it will pay off only on architectures where multiplication is much
much slower than other arithmetic operations.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


More information about the gmp-devel mailing list