Wraparound multiplicaton vs mullo
Niels Möller
nisse at lysator.liu.se
Thu Oct 22 15:55:57 CEST 2009
Torbjorn Granlund <tg at gmplib.org> writes:
> Wow! Which are the largest n for which it is still faster than full
> multiplication?
Haven't tried, but I'd not be surprised if it's a little faster also
in the FFT range. As Marco said, since multiplication is superlinear,
therefore, CRT using a known factorization of the modulo should always
help. For example,
(B^{8n} - 1) = (B^n-1) (B^n+1) (B^{2n}+1) (B^{4n}+1)
Full multiplication would use an FFT of size 16n, cost
16 n (4 + log n)
Doing it with FFT wraparound at the toplevel would use FFT of size 8n,
cost
8 n (3 + log n)
Doing wraparound FFT multiplication mod one factor at a time would be
two FFT:s of size n, one of 2n and one of 4n,
2 n log n + 2 n (1 + log n) + 4 n (2 + log n)
= 8 n (3/4 + log n)
How it really works out in practice will depend on how the linear
saving in more shallow FFT:s (a factor 4 in the linear term above)
compares to the cost of the wrapping of inputs and the CRT
reconstructions.
/Niels
More information about the gmp-devel
mailing list