Wraparound multiplicaton vs mullo

Niels Möller nisse at lysator.liu.se
Thu Oct 15 14:42:14 CEST 2009


Torbjorn Granlund <tg at gmplib.org> writes:

> But it might be more tricky for the Schönhage-Strassen FFT (SSFFT)
> dyadic product.  It wants to compute mod x^n+1, not x^n-1 which is
> redicible over Q.

I'm not sure I understand why it does this.

Lets first look at what the FFT multiplication computes. When I have
implemented FFT, I let w be a primitive nth root of unity, and I
evaluate in powers of w, which then are roots of x^n - 1. Wrap-araound
is then naturally mod B^n - 1.

I'm told (but I haven't looked up the details) that one can also do FFT
to get wraparound mod B^n + 1. Then the roots of unity used must be
the roots of x^n + 1, which (in C) differ a factor exp(i pi / n) from
the roots of x^n - 1. This seems to be what the current mul_fft does.

Next, look at the coefficient ring. The only hard requirements are
that it has a root of unity of the right order, and that the number 2
is invertible.

In the ring Z_{2^n + 1}, the multiplicative order of 2 is 2 n (since
2^n = -1 (mod 2^n + 1), and (-1)^2 = 1). In the ring Z_{2^n - 1}, the
multiplicative order of 2 is n, since 2^n = 1 (mod 2^n -1).

So if we need a root of unity of order 2^k, and n = 2^j, then

  w = 2^{j - k + 1}  (j > k + 1)

is suitable when computing mod 2^n + 1, while

  w = 2^{j - k}      (j > k)

is suitable when computing mod 2^n - 1.

When doing FFT recursively, its obviously desirable to have the
coefficient ring match the wraparound arithmetic provided by FFT.
Current FFT code chooses mod 2^n + 1 arithmetic.

Is there any good reason why it wouldn't work just as well with 2^n -
1 arithmetic for both the result of the FFT mutliplication, and for
the coefficient ring?

The choice matters when coefficient size falls below the FFT range,
where it appears that multiplication mod 2^n - 1 is more efficient
than multiplication mod 2^n + 1.

In principle, I guess it would also be possible to let the FFT code
continue to do wraparound of the form B^n + 1, but choose coefficient
ring, Z_{2^n+1} or Z_{2^n-1}, depending on whether or not the
coefficient size is above or below the FFT threshold. These are tied
together only for convenience, any ring Z_n with n odd, large enough,
and with a suitable root of unity can work as the coefficient ring.

Regards,
/Niels


More information about the gmp-devel mailing list