"balanced" unbalanced toom22

Thu Aug 7 15:18:50 CEST 2008

David Harvey <dmharvey at nyu.edu> writes:

> The existing toom22 breaks it up as 100x100 + 100x100 + 100x20. My
> experimental code does it as 100x100 + 100x60 + 100x60.

So the question is, what's faster, two 100x60, or 100x100 and 100x20.
Or more generally,

  * two n x (m+k)/2, vs

  * n x m and m x k

For schoolbook range, it's the same amount of work. I have no good
guess what whould happen with subquadratic multiplication.

It may be more enlightning to investigate (both theoretically and
practically) what happens with the only *slightly* unbalanced case,
say 200x198 or more generally n x (n-2). If I get this example right,
I get for one level of Karatsuba

  100x100 and 100x98 vs 2 100x99 (and one 100x100 which is the same in
                                  both cases)

and after another level

    50x50 and  50x48 vs 2  50x49 (and an additional 7 50x50 in both cases).

By induction, this should propagate all the way down until we get
below the Karatsuba threshold, where there ought to be a very small
difference, if measurable at all.

Regards,
/Niels