New mulmod_bknp1
Marco Bodrato
bodrato at mail.dm.unipi.it
Tue Feb 15 11:48:25 CET 2022
Ciao,
I pushed some new functions to compute the product (and square) modulo
B^nn+1, for small values of the exponent nn. For large values the
already available fft_mul should be used.
The new functions actually require nn=k*n, where k can be 3 or in {5, 7,
13, 17}.
Also k=11 can be supported, but it's currently disabled.
I added also tune/speed support, but the problem with the analysis is
that the function is defined for some values only...
Here are some results:
@shell ~/gmp-repo]$ /var/tmp/bodrato/gmp/hg/build/tune/speed -Cr -s
75-85\,240-250\,725-735 mpn_mul_n mpn_mulmod_bknp1
overhead 7.15 cycles, precision 10000 units of 2.86e-10 secs, CPU freq
3500.09 MHz
mpn_mul_n mpn_mulmod_bknp1
75 159.4800 #0.9715
76 #167.2895 n/a
77 #167.1558 1.0793
78 193.8462 #0.7311
79 #169.8734 n/a
80 182.1500 #0.8597
81 196.0617 #0.7721
82 #170.8780 n/a
83 #170.8916 n/a
84 169.2024 #0.9344
85 188.8235 #0.7898
240 258.7792 #0.7474
241 #254.4274 n/a
242 266.2355 #0.9051
243 260.2798 #0.7414
244 #256.3893 n/a
245 256.8449 #0.8595
246 255.7073 #0.7915
247 257.0324 #0.9904
248 #257.6532 n/a
249 257.2410 #0.7925
250 258.1480 #0.8535
725 378.2538 #0.8652
726 377.4793 #0.8362
727 #376.6836 n/a
728 376.2060 #0.9164
729 379.7462 #0.7682
730 379.1068 #0.8727
731 379.0109 #0.9805
732 377.9754 #0.8285
733 #379.6958 n/a
734 #377.6403 n/a
735 376.0204 #0.8094
As you can see, the new functions are particularly effective, when there
are many factors "3" in the size...
Integrating this in the current mod_bnm1 is easy, but the effect would
be to accelerate the operations only for some sizes. This may add
"noise" to the thresholds system. Some more experiments are needed.
In the meanwhile, the base code will be tested by the great testing
system of GMP :-)
Ĝis,
m
More information about the gmp-devel
mailing list