mul_fft
Paul Zimmermann
Paul.Zimmermann at inria.fr
Wed Jun 16 07:14:25 UTC 2021
Hi,
with Samuel Vivien (in cc) we noticed two things related to the FFT code:
1) the use of mpn_add_n_sub_n is not activated by default in mul_fft.c.
It might give a small speedup in some cases.
2) the previous code to multiply integers (mpn_mul_fft_full) which did
two multiplications mod 2^N1+1 and 2^N2+1 has been replaced by a new
code which computes 2^(2N)-1 for 2N ~ N1+N2, using the fact that
2^(2N)-1=(2^N+1)(2^(N/2)+1)(2^(N/4)+1)...(2^(N/2^k)+1)(2^(N/2^k)-1).
However that new code is not always faster, for example on an Intel
Core i5-4590 with gcc 10.2.1:
Vanilla GMP 6.2.1:
$ ./speed -s 1000000,2000000,5000000,10000000 mpn_mul_n mpn_mul_n mpn_mul_n
mpn_mul_n mpn_mul_n mpn_mul_n
1000000 0.447985000 0.452133000 #0.447968000
2000000 #0.949739000 0.949996000 0.956713000
5000000 #2.669156000 2.678638000 2.677642000
10000000 #5.540204000 5.547306000 5.547152000
With --enable-old-fft-full (without WANT_ADDSUB):
1000000 0.448750000 0.451685000 #0.447571000
2000000 #0.941488000 0.942339000 0.947603000
5000000 #2.668528000 2.675208000 2.669147000
10000000 5.543246000 5.580915000 #5.538155000
Best regards,
Paul
More information about the gmp-devel
mailing list