toom54

Tue Feb 14 20:09:24 CET 2012

nisse at lysator.liu.se (Niels Möller) writes:

  I think it might provide some insight to do benchmarks along fixed
  ratios. For each algorithm, there's a supported range. Take both end
  points and the midpoint. Along each of these three lines, benchmark for
  a range of sizes with a fixed ration, and see where the algorithm beats
  other algorithms which support the same ratio. There's some complication
  from the unbalanced calls (which one would want to use an optimal
  algorithm choice), but I hope that will have a fairly small influence on
  the results.

  It would also be interesting to prepare some graph showing for each
  function which functions it may use for the recursive calls (with a
  close-to-optimal algorithm choice). E.g., I suspect toom32 shouldn't
  call anything but basecase, toom32 and toom22.

Not so sure.  But we can decide to make it that.  It currently is
fastest for large operands, in a *narrow* space between 43 and 53, and
its subproducts will want 33.

To enable 54 in today's framework, it should probably be here:

	      else if (2 * un < 3 * vn)
		{
		  if (BELOW_THRESHOLD (vn, MUL_TOOM32_TO_TOOM43_THRESHOLD))
		    mpn_toom32_mul (prodp, up, un, vp, vn, scratch);
!		  else if (BELOW_THRESHOLD (vn, MUL_TOOM43_TO_TOOM54_THRESHOLD))
		    mpn_toom43_mul (prodp, up, un, vp, vn, scratch);
!		  else
!		    mpn_toom54_mul (prodp, up, un, vp, vn, scratch);
		}

But first MUL_TOOM43_TO_TOOM54_THRESHOLD needs to be measured and set to
some default.

-- 
Torbjörn