mpn_mul is embarrassingly slow
nisse at lysator.liu.se
Tue Apr 24 13:49:35 UTC 2018
tg at gmplib.org (Torbjörn Granlund) writes:
> What do you think about this stopgap change? The idea is to speed up
> small operands, adding very little overhead to larger operands.
Looks reasonable to me. Assuming speedup for small operands is
> (We should really get rid of mpn_mul's return value for the slightly
> incompatible GMP 7; that will allow cheap tail calls here.)
I would prefer the opposite change for GMP7, to have all multiplication
functions return, but *not* store, the high limb of the product. Which
also should work nicely with tail calls.
> + /* If un, and thus vn, are below the toom22 range, drop into mul_basecase.
> + Test un and not vn here not to thwart the code handling un >> vn below. */
I had to look up what clever things we try to do when vn is small but un
is huge. It seems we try to improve locality by breaking U up in pieces
which hopefully fit the cache.
> if (un == vn)
> ! mpn_mul_n (prodp, up, vp, un);
I wonder if calling mpn_mul_n from here still is useful? Given that both
basecase and toom functions support unbalanced operands, and we don't
even have any mpn_toom*_mul_n. Main thing that makes mpn_mul_n nicer is
that thresholds get a lot easier with only a single size.
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel