Toom symmetries

bodrato at bodrato at
Thu Jun 11 09:31:42 CEST 2009


> That's clever. To see if I understand this, let's consider the simplest
> four-point interpolation first. The number we're trying to construct

>   x = w0 + (w1 - w3) B + (w2 - w0) B^2 + w3 B^3
>     = -(w3 B - w0) + (w1 + w2 B) B + (w3 B - w0) B^2

Yes and no...

Yes, the last step is
x= x0 + w1 B^1 + w2 B^2 + w3 B^3 + w4 B^4 + ....
and we can write it as
x= x0 + (w1 + w2 B) B + (w3 + w4 B) B^3 + ....
and we can combine (w1 + w2 B) ... (wn + w(n+1) B) as soon as possible,
then operate on them as we would do on w1, w3, ... wn (operating also on
w2, w4,..., wn+1 with the same functions). We can recombine when the two
lines have the same elements, shifted by one, the right direction... and
we do not need to "read" the single values any more.
After this partial recomposition we can compute
w1 = (w1 - w3)/d; w2 = (w2 - w4)/d;
doing the following (half an operation saved)
(w1 + w2 B) = ((w1 + w2 B) - (w3 + w4 B))/d
We can "write" on the single value with linear operations: we can obtain
w2 -= winf*a, with (w1 + w2 B) -= winf*a*B; but not, viceversa, "read".

But no, I did not consider mixing also w0 and winfinity, for the general
high degree Toom.

> I don't fully understand the current toom32_mul, but maybe it's doing
> precisely this? The temporary allocation makes me suspicious, though:

Yes, basically _toom32_mul[1] use the same trick you described above, to
save half a subtraction. You avoid repeating (w3 - H w0), operating on the
low half of w3. w3 is vinf in the code, and w0 is v0; the code compute
once (Hv0-Lvinf) = [with your notation] (H w0 - L w3) = - (L w3 - H w0).

But no, _toom32_mul does not mix evaluation with the two other (mixed)
phases... maybe some more temporary allocation can be saved.

> The recursive call is to toom22_mul, which needs ~ 2n limbs of scratch
> space, according to toom22_mul_itch, and this area is beyond the
> scratch area that toom32_mul asks for.

Uhm, probably the test programs should reserve a "keep" limb at the end of
the temporary area, not only above the result area... And yes,
toom22_mul_itch should be corrected.




More information about the gmp-devel mailing list