Bounding the recursion depth of toom22_mul

Fri Jun 12 12:41:09 CEST 2009

The storage need for toom22_mul includes a log term. One simple (but
still quite conservative) bound is

  2*(n + k)

where n is the number of limbs in the larger factor (an), and k is the
recursion depth. It's tempting to bound k by

  ceil(log_2(ceil(TOOM33_MUL_THRESHOLD / TOOM22_MUL_THRESHOLD)))

which usually is a compile time constant. Then for safety one should
probably also add an

  ASSERT_ALWAYS(BELOW_THRESHOLD(an, TOOM33_MUL_THRESHOLD));

at the start of toom22_mul. I see some potential problems with this,
though.

1. The case of tuneup and fat binaries, where the thresholds are not
   constant. log_2 (TOOM33_MUL_THRESHOLD_LIMIT) might be of some use,
   but I'm not sure how that is intended to be used. Even better if
   there also was a *lower* bound on TOOM22_MUL_THRESHOLD, so one
   could in effect use MAX(TOOM33_MUL_THRESHOLD) /
   MIN(TOOM22_MUL_THRESHOLD).

2. The recurson depth is related to the size of the larger input,
   while I think it is common to apply the thresholds looking at the
   smaller input. The above bound would break down if toom22_mul is
   callled with an above the toom33 threshold and bn below it.

3. It makes it harder to include toom22 in benchmarks for larger
   sizes.

There may be other problems that I'm missing.

What's the maximum recursion depth in the usual configurations?

Other alternatives might be to not care, and let a multiplication
just above the Karatsuba threshold allocate some 128+ unused limbs. Or
we could handle the case of a recursion depth of 1 (recursive calls
all schoolbook) specially, if that is a common and important cases.

Regards,
/Niels