Toom testing, and unbalanced recursive calls in toom22.

Mon Oct 19 15:56:43 CEST 2009

I'm working on improved testcode for the toom functions, and I've
discovered the following problem with toom22:

Say mpn_toom22_mul is called with inputs of size an = 178, bn = 89.
Then this will be split in pieces as follows,

  a = a0 + a1 * B^89
  b = b0 + b1 * B^89

where a0, a1 and b0 are of 89 limbs each, and b1 is shorter, only 29
limbs.

Then there's the unbalanced recursive call to compute a1 * b1. Since
the size of b1 is larger than the Karatsuba threshold (20 on the test
machine), it recurses to toom22_mul; this is the logic of the
TOOM22_MUL_MN_REC macro. But toom22_mul can't handle a 89 x 29
product, and things fall apart.

Some possible solutions:

1. Have TOOM22_MUL_MN_REC always use basecase for too unbalaned
   products (and trust that in practice toom22 will not be called for
   sizes where this is a poor choice). Also, for size 2n x n+1, the
   most unbalanced case mpn_toom22_mul can handle, that function is
   probably not the best choice.

2. Have TOOM22_MUL_MN_REC choose between toom22, toom32 and basecase,
   using some conditions.

3. Let TOOM22_MUL_MN_REC use the general mpn_mul function for
   unbalanced multiplicaton. This has implications on the use of
   scratch space (as long as mpn_mul doesn't use an itch/scratch
   interface, scratch allocated for toom22 will be wasted. And if
   mpn_mul is eventually itchified, this gives a rather unpleasant
   circular dependency.

I think all the other toom functions call mpn_mul for the unbalanced
recursive product, but that overhead is more noticable for toom22, I
guess.

What's reasonable to do?

Regards,
/Niels