# Toom symmetries

David Harvey dmharvey at cims.nyu.edu
Tue Jun 9 23:19:17 CEST 2009

```On Jun 9, 2009, at 4:11 PM, Niels Möller wrote:

> The simplest example, 4 points,
>
>   w0 = f(0)
>   w1 = f(1)
>   w2 = f(-1)
>   w3 = f(oo)
>
> corresponds to the matrix
>
>   1  0  0  0
>   1  1  1  1
>   1 -1  1 -1
>   0  0  0  1
>
>
>   1  0  0  0
>   0  1  0  1
>   1  0  1  0
>   0  0  0  1

Hmmm... sorry to interrupt your train of thought, this gives me an
idea. Maybe you can delay some of the division by two until after
part of the recomposition?

1  0  0  0
0  2  0  2
2  0  2  0
0  0  0  1

i.e. if the output poly coefficients are c0, c1, c2, c3, you have

c0
2c1 + 2c3
2c0 + 2c2
c3

Now start doing recomposition:

|=== 2c1 + 2c3 ===|
|=== 2c0 + 2c2 ===|

Then divide by 2:

|==== c1 + c3 ====|
|==== c0 + c2 ====|

Then continue as usual, i.e. subtract c3, c0 from right places, add
them at beginning and end (and also use the hi(c0) - lo(c3)
redundancy trick...), to get

|====== c0 =======|
|====== c1 =======|
|====== c2 =======|
|====== c3 =======|

I haven't checked the operation count, but it seems plausible that
something like this could save half a shift.

> In total 12 subtractions, 2 left shifts, 5 right shifts and 2
> divisions. This actually not so bad compared to the current code,
> there's just one more left shift (and then evaluation obviously also
> needs more shifts). Using the four points corresponding to a = sqrt(B)
> = 2^{GMP_NUMB_BITS /2} might also be worth investigating, in this case
> the same reduction procedure will involved only division by (B-1).

I did try the 2^(GMP_NUMB_BITS/2) idea at some stage --- one
difficulty is that the pointwise multiplications get slightly bigger.

david

```