caching of transforms used for large multiplications
tg at gmplib.org
Tue Jun 11 23:23:43 CEST 2013
We don't yet have any transform-only interface in GMP, but this will
probably change at some point.
The current FFT code uses coefficient rings mod 2^m+1, as per the
Schönhage-Strassen algorithm. In this algorithm, m = O(sqrt(n)) where n
= O(log(a) + log(b)) for multiplication of a and b.
This setting means that while the transform operations are cheap (O(n
log n) total work) the dyadic products are not; they dominate
computation time both in theory and in practice.
This mean two things, if we assume a is invariant.
1. Pre-computing FFT(a) will not affect the leading complexity term.
2. One will need to pre-compute FFT(a'0), FFT(a'1), ... for the
coefficient of the transformed polynomial a'(x).
For 2, if these coefficients are not large enough for FFT, one should
compute the "Toom transform" for the toom multiply chain to be used.
See Alberto Zanoni's work in the area.
Incidentally, my pet FFT variation splits a and b into coefficients of a
few words, then computes a FFT over a few word-size prime rings, then
uses CRT. We have made several implementations of this over the years,
the latest one by Niels. Since this is a small ring transform, the
dyadic products will take time O(n log log n) while the transforms will
take O(n log n log log n ...). This provides more benefit from
pre-transforming an invariant operand (and furthermore avoids the
complication of recursive transforms of dyadic product coefficients).
This FFT code is not yet merged into the main GMP repo (but it is
available since several years at http://gmplib.org:8000/gcd-nisse/).
More information about the gmp-devel