mpn_mul is embarrassingly slow
Torbjörn Granlund
tg at gmplib.org
Tue Apr 24 11:52:28 UTC 2018
What do you think about this stopgap change? The idea is to speed up
small operands, adding very little overhead to larger operands.
(We should really get rid of mpn_mul's return value for the slightly
incompatible GMP 7; that will allow cheap tail calls here.)
*** /tmp/extdiff.rFSg_D/gmp-main.02a2ec6e1bce/mpn/generic/mul.c Mon Apr 23 18:12:05 2018
--- /home/tege/prec/gmp-main/mpn/generic/mul.c Tue Apr 24 13:46:01 2018
***************
*** 125,134 ****
ASSERT (! MPN_OVERLAP_P (prodp, un+vn, vp, vn));
if (un == vn)
{
! if (up == vp)
! mpn_sqr (prodp, up, un);
! else
! mpn_mul_n (prodp, up, vp, un);
}
else if (vn < MUL_TOOM22_THRESHOLD)
--- 121,152 ----
ASSERT (! MPN_OVERLAP_P (prodp, un+vn, vp, vn));
+ /* For machines where SQR_BASECASE_THRESHOLD != 0, drop into mul_basecase for
+ both mul and sqr. For most machines, this will compile to nothing. */
+ if (BELOW_THRESHOLD (vn, SQR_BASECASE_THRESHOLD))
+ {
+ mpn_mul_basecase (prodp, up, un, vp, vn);
+ return prodp[un + vn - 1]; /* historic */
+ }
+
+ if (up == vp) /* If true, very likely squaring */
+ {
+ if (un == vn)
+ {
+ mpn_sqr (prodp, up, un);
+ return prodp[un + vn - 1]; /* historic */
+ }
+ }
+
+ /* If un, and thus vn, are below the toom22 range, drop into mul_basecase.
+ Test un and not vn here not to thwart the code handling un >> vn below. */
+ if (BELOW_THRESHOLD (un, MUL_TOOM22_THRESHOLD))
+ {
+ mpn_mul_basecase (prodp, up, un, vp, vn);
+ return prodp[un + vn - 1]; /* historic */
+ }
+
if (un == vn)
{
! mpn_mul_n (prodp, up, vp, un);
}
else if (vn < MUL_TOOM22_THRESHOLD)
--
Torbjörn
Please encrypt, key id 0xC8601622
More information about the gmp-devel
mailing list