Toom-3.5 (aka Toom-4x3 & Toom-5x2)

bodrato at bodrato at
Sun Feb 8 10:33:51 CET 2009

Dear developers,

I wrote the code for the unbalanced Toom-3.5, id est:
mpn_toom_interpolate_6pts, mpn_toom43_mul, mpn_toom52_mul.

You can find it on my web page:

I developed and tested it on a Pentium3, but it contains also UNTESTED
code for architectures where HAVE_NATIVE_mpn_rsh1add_n and
HAVE_NATIVE_mpn_addsub_n are defined... If someone can test it on other
processors too, I'll be happy.

Is it useful? If we want to use Toom-4 with unbalanced operands, we have
the 3 flavours 4x4, 5x3 and 6x2; we have to choose among the three
according to the degree of unbalance.

For balanced operands: (e.g. 560x560)
we obviously use 4x4.

For slightly unbalanced: (600x520)
4x4 remains the best choice.

Then we reach a limit: (640x480)

Above that limit: (680x440)
we can start using 5x3...

The same happen between 5x3 and 6x2. Using the same example as before, the
case 800x320 is a border case where both 5x3 and 6x2 fail. It is not a
surprise to notice that the two cases of failure of unbalanced Toom-4
perfectly adapt to Toom-3.5, respectively to the 4x3 and to the 5x2
This fact is natural: Toom-4 fails for those case because one of the
products nullify, the solution is to fall back on the method with one
interpolation point less, i.e. Toom-3.5.

I conclude that, if we use Toom-4 for unbalanced operands, we _need_
Toom-3.5 for corner cases, and we can use it also to avoid borderline
situations (when we split in (N*n+s)x(M*n+t), the borderline case s+t <= n
imposes irksome coding).
This is general, Toom-2.5 (3x2) should handle borderline situation between
Toom-3 flavours (3x3, 4x2); Toom-3 handle corner cases for Toom-3.5 (4x3.
5x2)... and so on.

Let me know if you test the code.
Marco Bodrato


More information about the gmp-devel mailing list