nisse at lysator.liu.se
Fri Oct 19 12:21:38 CEST 2012
Torbjorn Granlund <tg at gmplib.org> writes:
> I believe Niels Möller made some experiments with saving a "transform"
> from a recursive chain of Toom.
If I remember the benchmarking, I think I got a speedup of 5%-10% or so
for toom2 (aka Karatsuba). It's in the mail archives.
> When we merge the small primes (SP) FFT (which is of the latter variety)
> we expect to get a cutoff point from SS FFT to SP FFT when there is no
> invariance, and a much lower one (perhaps 0) when there is invariance.
I think it's possible that SP fft can beat SS completely. David's clever
inner loop is much faster than the ones I used (and which are in the
http://gmplib.org:8000/gcd-nisse/ repo), 6 cycles/butterfly operation vs
9 or so for my code.
> And yes, we would like to support addition at the transform side of
> things. It is not clear how to define things as to avoid overflow,
Interested people can have a look at the interface for the fourier_*
I think I have written some docs as well, but I'm not sure where, maybe
in old postings to this list. This interface supports addition and
subtraction of transformed values, and when choosing the fft parameters,
you specify worst case coefficient growth from such operations, to get
sufficient margin in reconstruction.
This interface is used in mpn_matrix22_mul in the same repo, see the
function mpn_matrix22_mul_fft in
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel