Toom symmetries
David Harvey
dmharvey at cims.nyu.edu
Tue Jun 9 23:19:17 CEST 2009
On Jun 9, 2009, at 4:11 PM, Niels Möller wrote:
> The simplest example, 4 points,
>
> w0 = f(0)
> w1 = f(1)
> w2 = f(-1)
> w3 = f(oo)
>
> corresponds to the matrix
>
> 1 0 0 0
> 1 1 1 1
> 1 -1 1 -1
> 0 0 0 1
>
> After addsub/2, we have
>
> 1 0 0 0
> 0 1 0 1
> 1 0 1 0
> 0 0 0 1
Hmmm... sorry to interrupt your train of thought, this gives me an
idea. Maybe you can delay some of the division by two until after
part of the recomposition?
i.e.: instead of addsub/2, just do addsub, then you have
1 0 0 0
0 2 0 2
2 0 2 0
0 0 0 1
i.e. if the output poly coefficients are c0, c1, c2, c3, you have
c0
2c1 + 2c3
2c0 + 2c2
c3
Now start doing recomposition:
|=== 2c1 + 2c3 ===|
|=== 2c0 + 2c2 ===|
Then divide by 2:
|==== c1 + c3 ====|
|==== c0 + c2 ====|
Then continue as usual, i.e. subtract c3, c0 from right places, add
them at beginning and end (and also use the hi(c0) - lo(c3)
redundancy trick...), to get
|====== c0 =======|
|====== c1 =======|
|====== c2 =======|
|====== c3 =======|
I haven't checked the operation count, but it seems plausible that
something like this could save half a shift.
> In total 12 subtractions, 2 left shifts, 5 right shifts and 2
> divisions. This actually not so bad compared to the current code,
> there's just one more left shift (and then evaluation obviously also
> needs more shifts). Using the four points corresponding to a = sqrt(B)
> = 2^{GMP_NUMB_BITS /2} might also be worth investigating, in this case
> the same reduction procedure will involved only division by (B-1).
I did try the 2^(GMP_NUMB_BITS/2) idea at some stage --- one
difficulty is that the pointwise multiplications get slightly bigger.
david
More information about the gmp-devel
mailing list