GMP 4.3 multiplication performance

Tue Jun 2 14:40:58 CEST 2009

       Torbjörn,

> While working on multiplication improvements for GMP 4.4, I realized
> that we are doing something quite silly for unbalanced operands in GMP
> 4.3.  For some machines, GMP 4.3 might actually be substantially slower
> than GMP 4.2!
> 
> The current multiply code works roughly like this:
> 
>   if (operands have same size)
>     use mpn_sqr_n or mpn_mul_n
> 
>   if (schoolbook operand range)
>     use mpn_mul_basecase
> 
>   if (fft range)
>     use mpn_mul_fft_full
> 
>   else use toom42, toom32, or toom22, or some combination of these.
> 
> The problem is that for any unbalanced operands between the schoolbook
> range and the fft range, we will basically use karatsuba's algorithm,
> not any toom variant.  This is pretty awful except for smallish
> operands.

I don't understand: you should fall down to the "else" part above, which
tries toom42, toom32, or toom22.

Moreover what does mean "schoolbook operand range" and "fft range" for
unbalanced operands. Do you test the smaller size, the larger size, the
average of both?

What you really need is a two-dimensional threshold scheme. Marco Bodrato did
produce a corresponding graph some time ago.

Paul Zimmermann