FFT multiplication
paul zimmermann
Paul.Zimmermann at inria.fr
Wed Sep 4 10:56:31 UTC 2019
Hello,
those who perform large computations with GMP might be interested by the
GMP-based implementation of Schönhage-Strassen's large integer multiplication
algorithm that we wrote with Pierrick Gaudry and Alexander Kruppa in 2007 [1],
and that I have updated for GMP 6.1.2 [2]. It also now includes a tuning
script that you can run on your own machine.
Excerpt from README.mul_fft:
For reference, here is a comparison on a Intel(R) Core(TM) i5-4590 CPU at 3.30GHz
with gcc 8.3.0, turbo-boost disabled, GMP configured with --disable-shared.
With GMP 6.1.2:
$ ./speed -s 1000000,2000000,5000000,10000000 mpn_mul_n mpn_mul_n mpn_mul_n
overhead 0.000000002 secs, precision 10000 units of 3.04e-10 secs, CPU freq 3293.21 MHz
mpn_mul_n mpn_mul_n mpn_mul_n
1000000 0.464635000 #0.463611000 0.464940000
2000000 #0.960899000 0.974004000 0.961607000
5000000 #2.723665000 2.735766000 2.725246000
10000000 5.637869000 #5.618580000 5.638739000
With the new mul_fft.c file and gmp-mparam.h.haswell:
$ ./speed -s 1000000,2000000,5000000,10000000 mpn_mul_n mpn_mul_n mpn_mul_n
overhead 0.000000002 secs, precision 10000 units of 3.04e-10 secs, CPU freq 3292.33 MHz
mpn_mul_n mpn_mul_n mpn_mul_n
1000000 #0.344157000 0.348305000 0.352287000
2000000 0.870705000 0.870784000 #0.870605000
5000000 #2.320790000 2.335271000 2.333464000
10000000 4.977157000 #4.955029000 5.003099000
We thus get a speedup of 26% for 1000000 limbs, of 9% for 2000000 limbs,
15% for 5000000 limbs, and of 12% for 10000000 limbs.
Paul Zimmermann
[1] https://hal.inria.fr/inria-00126462
[2] https://members.loria.fr/PZimmermann/software/mul_fft-6.1.2.tgz
More information about the gmp-discuss
mailing list