mpn_mul is embarrassingly slow
nisse at lysator.liu.se
Tue Apr 24 19:40:29 UTC 2018
tg at gmplib.org (Torbjörn Granlund) writes:
> nisse at lysator.liu.se (Niels Möller) writes:
> I would prefer the opposite change for GMP7, to have all multiplication
> functions return, but *not* store, the high limb of the product. Which
> also should work nicely with tail calls.
> I believe this would work nicely for mul_basecase but not for toom or
> fft multiply. These latter would need extra temp space, or code
Maybe. One would have to try it out to say for sure. I had a quick look
at toom22_mul. The two balanced recursive products (evaluation at 0 and
-1) would immediately store the returned high limb.
For the third recursive multiply (high parts, interpreted as evaluation
at infinity), it's fairly natural to keep the high limb in a scalar
variable/register, to be returned after final carry propagation.
But I agree there's some potential for contortion: the final carry
propagation now uses MPN_INCR_U and MPN_DECR_U. This gets more awkward
when we want the carry to possibly spill over the edge of the memory
area and propagate into a register. The L(vinf) + H(vinf) addition may
also become less nice, but being the only unbalanced addition (mpn_add,
not mpn_add_n) in this code, it's in some way awkward already.
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel