mpn_mul is embarrassingly slow

Wed Apr 25 11:47:58 UTC 2018

nisse at lysator.liu.se (Niels Möller) writes:

  For n x 20, n pretty large, what strategy does mpn_mul use? I would
  expect repeated toom32, but maybe that gives too much overhed. It seems
  we don't have any MUL_TOOM32_TO_BASECASE threshold?

I believe this is the logic triggered in this case:

	  if (4 * un < 5 * vn)
	    mpn_toom22_mul (prodp, up, un, vp, vn, scratch);
	  else if (4 * un < 7 * vn)
	    mpn_toom32_mul (prodp, up, un, vp, vn, scratch);
	  else
	    mpn_toom42_mul (prodp, up, un, vp, vn, scratch);

It might even loop around a toom42 first.

(I think I'll redo the "New Toom multiplication code for unbalanced
operands" diagrams but also make a grayscale of time(best)/time(mpn_mul)
which hides behind the coloured diagram.  I am not a web expert, but if
the coloured diagram is a hyperlink to the grayscale diagram, one should
be able to flip between them and get the whole picture.)

-- 
Torbjörn
Please encrypt, key id 0xC8601622