mpn_mul is embarrassingly slow

Tue Apr 24 13:49:35 UTC 2018

tg at gmplib.org (Torbjörn Granlund) writes:

> What do you think about this stopgap change?  The idea is to speed up
> small operands, adding very little overhead to larger operands.

Looks reasonable to me. Assuming speedup for small operands is
measurable.

> (We should really get rid of mpn_mul's return value for the slightly
> incompatible GMP 7; that will allow cheap tail calls here.)

I would prefer the opposite change for GMP7, to have all multiplication
functions return, but *not* store, the high limb of the product. Which
also should work nicely with tail calls.

> +   /* If un, and thus vn, are below the toom22 range, drop into mul_basecase.
> +      Test un and not vn here not to thwart the code handling un >> vn below. */

I had to look up what clever things we try to do when vn is small but un
is huge. It seems we try to improve locality by breaking U up in pieces
which hopefully fit the cache.

>     if (un == vn)
>       {
> !       mpn_mul_n (prodp, up, vp, un);
>       }

I wonder if calling mpn_mul_n from here still is useful? Given that both
basecase and toom functions support unbalanced operands, and we don't
even have any mpn_toom*_mul_n. Main thing that makes mpn_mul_n nicer is
that thresholds get a lot easier with only a single size.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.