Niels Möller nisse at
Wed Apr 2 17:15:11 UTC 2014

Torbjorn Granlund <tg at> writes:

> We should integrate the small primes FFT code,
> in the next major GMP release.  Perhaps this would make mpn_mulmod_bnm1
> trickier to maintain (at least as a cutting-edge primitive)?

I don't remember any fundamental difficulties.

> I am not sure what needs to be done to get mul_spfft.c in shape for
> general inclusion.  I suppose a bunch of assembly implementations of the
> butterfly operations will be needed.  Something more?

* Tuning. 

* Tweaking the roots to compute mod 2^m+1 rather than 2^m-1 (since 2^m-1
  is better handled by splitting in mulmod_bnm1).
* Implementing David Harvey's trick: slightly smaller primes, but
  butterflies sped up from 9-10 or so cycles down to 6 (this was
  benchmarked on Opteron, where 6 is the limit from multiplier

* Integration with mulmod_bnm1.

* And more assembly coding.

> To lower the bar for inclusion, we could keep it disabled by default,
> and enable it only if some encouraging parameter is set in gmp-mparam.h.

Or an explicit ./configure time option.


Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.

More information about the gmp-devel mailing list