FFT multiplication

Wed Sep 4 10:56:31 UTC 2019

       Hello,

those who perform large computations with GMP might be interested by the
GMP-based implementation of Schönhage-Strassen's large integer multiplication
algorithm that we wrote with Pierrick Gaudry and Alexander Kruppa in 2007 [1],
and that I have updated for GMP 6.1.2 [2]. It also now includes a tuning
script that you can run on your own machine.

Excerpt from README.mul_fft:

For reference, here is a comparison on a Intel(R) Core(TM) i5-4590 CPU at 3.30GHz
with gcc 8.3.0, turbo-boost disabled, GMP configured with --disable-shared.

With GMP 6.1.2:
$ ./speed -s 1000000,2000000,5000000,10000000 mpn_mul_n mpn_mul_n mpn_mul_n
overhead 0.000000002 secs, precision 10000 units of 3.04e-10 secs, CPU freq 3293.21 MHz
            mpn_mul_n     mpn_mul_n     mpn_mul_n
1000000    0.464635000  #0.463611000   0.464940000
2000000   #0.960899000   0.974004000   0.961607000
5000000   #2.723665000   2.735766000   2.725246000
10000000    5.637869000  #5.618580000   5.638739000

With the new mul_fft.c file and gmp-mparam.h.haswell:
$ ./speed -s 1000000,2000000,5000000,10000000 mpn_mul_n mpn_mul_n mpn_mul_n
overhead 0.000000002 secs, precision 10000 units of 3.04e-10 secs, CPU freq 3292.33 MHz
            mpn_mul_n     mpn_mul_n     mpn_mul_n
1000000   #0.344157000   0.348305000   0.352287000
2000000    0.870705000   0.870784000  #0.870605000
5000000   #2.320790000   2.335271000   2.333464000
10000000    4.977157000  #4.955029000   5.003099000

We thus get a speedup of 26% for 1000000 limbs, of 9% for 2000000 limbs,
15% for 5000000 limbs, and of 12% for 10000000 limbs.

Paul Zimmermann

[1] https://hal.inria.fr/inria-00126462
[2] https://members.loria.fr/PZimmermann/software/mul_fft-6.1.2.tgz