Iterative Toom-Cook-2.5 (Toom32) method for unbalanced multiplication

Tue Dec 22 22:26:55 CET 2009

nisse at lysator.liu.se (Niels Möller) writes:

  Torbjorn Granlund <tg at gmplib.org> writes:

  > I suppose we could consider this for *small* m,n.  That's where it can
  > save some time.

  A start would be to do it for basecase, and make sure that the interface
  is good enough that one can get rid of the annoying copies in the un >>
  MUL_BASECASE_MAX_UN > vn case in mpn_mul.

That's not hard; new entry points in the assembly mul_basecase and a
limited amount of new code would do it.

For C, it will require more code duplication.

  > For large operands

  I'm not sure about savings here. During the work with small-prime fft,
  it seemed like the final O(n) reconstruction work was quite important.
  If one can do an extra addition on the fly at small extra cost there,
  one saves one pass over the data, for a * b + c operation. So I wouldn't
  be surprised if that can give a modest speedup to a * b + c
  operations in the low end of the fft range.

Easy enoughy to measure:

shell$ speed -s5000 mpn_add_n mpn_mul_n
            mpn_add_n     mpn_mul_n
5000     #0.000005560   0.002410749

We could speed things up be around staggering 0.2%. :-)

-- 
Torbjörn