3-prime FFT

Torbjörn Granlund tg at gmplib.org
Thu Jul 16 11:33:24 UTC 2015


paul zimmermann <Paul.Zimmermann at inria.fr> writes:

  on https://hal.archives-ouvertes.fr/hal-01022383, page 27, table 16, the
  3-prime FFT implemented in Mathemagix is faster than GMP for 2^23 to 2^25 bits.
  
Apparently, they do, but just about 10%...  Not impressed.  :-)
  
We have small primes FFT code in several variants, perhaps Niels' being
the most mature.  Alas, no work has been done since 2008.

The new GMP code "beats GMP" too, in particular when SSFFT's
coefficients approach the L1d cache size.

(It appears that narrow multiplication as provided by SIMD instructions
does not hurt arithmetic mod say (p1, p2, p3, p4) where all p:s are 32
bits compared to twice as wide arithmetic mod (p1*p2, p3*p4).)


-- 
Torbjörn
Please encrypt, key id 0xC8601622


More information about the gmp-devel mailing list