mpn_mulmod_bnm1
Niels Möller
nisse at lysator.liu.se
Wed Apr 2 17:15:11 UTC 2014
Torbjorn Granlund <tg at gmplib.org> writes:
> We should integrate the small primes FFT code,
> https://gmplib.org/repo/gcd-nisse/file/0d591aa7e02c/mpn/generic/mul_spfft.c
> in the next major GMP release. Perhaps this would make mpn_mulmod_bnm1
> trickier to maintain (at least as a cutting-edge primitive)?
I don't remember any fundamental difficulties.
> I am not sure what needs to be done to get mul_spfft.c in shape for
> general inclusion. I suppose a bunch of assembly implementations of the
> butterfly operations will be needed. Something more?
* Tuning.
* Tweaking the roots to compute mod 2^m+1 rather than 2^m-1 (since 2^m-1
is better handled by splitting in mulmod_bnm1).
* Implementing David Harvey's trick: slightly smaller primes, but
butterflies sped up from 9-10 or so cycles down to 6 (this was
benchmarked on Opteron, where 6 is the limit from multiplier
throughput).
* Integration with mulmod_bnm1.
* And more assembly coding.
> To lower the bar for inclusion, we could keep it disabled by default,
> and enable it only if some encouraging parameter is set in gmp-mparam.h.
Or an explicit ./configure time option.
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel
mailing list