"balanced" unbalanced toom22
nisse at lysator.liu.se
Thu Aug 7 15:18:50 CEST 2008
David Harvey <dmharvey at nyu.edu> writes:
> The existing toom22 breaks it up as 100x100 + 100x100 + 100x20. My
> experimental code does it as 100x100 + 100x60 + 100x60.
So the question is, what's faster, two 100x60, or 100x100 and 100x20.
Or more generally,
* two n x (m+k)/2, vs
* n x m and m x k
For schoolbook range, it's the same amount of work. I have no good
guess what whould happen with subquadratic multiplication.
It may be more enlightning to investigate (both theoretically and
practically) what happens with the only *slightly* unbalanced case,
say 200x198 or more generally n x (n-2). If I get this example right,
I get for one level of Karatsuba
100x100 and 100x98 vs 2 100x99 (and one 100x100 which is the same in
and after another level
50x50 and 50x48 vs 2 50x49 (and an additional 7 50x50 in both cases).
By induction, this should propagate all the way down until we get
below the Karatsuba threshold, where there ought to be a very small
difference, if measurable at all.
More information about the gmp-devel