New mulmod_bknp1

Marco Bodrato bodrato at mail.dm.unipi.it
Fri Feb 18 20:10:45 CET 2022


Ciao,

Il 2022-02-15 11:48 Marco Bodrato ha scritto:
> I pushed some new functions to compute the product (and square) modulo
> B^nn+1, for small values of the exponent nn. For large values the
> already available fft_mul should be used.

A possible use is replacing the last-level point-wise multiplication of 
mul_fft.
I tried to do that.

> The new functions actually require nn=k*n, where k can be 3 or in {5,
> 7, 13, 17}.

This means that it can be used only when mul_fft falls into such a size.
The results is that for some sizes there is an effect,
for other sizes there is none.
Not optimal, but an improvement anyway...

I attach a graph (fft_with_bknp1.png) obtained plotting the output of
tune/speed -s 256-131072 -t128 mpn_mul_fft
before (_old) and after (_new) the patch.


Unfortunately the current tuning criteria does not interact well with 
the added code.
If the library is re-tuned, then there are sizes where a worst 
combination is chosen and the effect is a slowdown. I attach another 
graph (fft_retuned.png) with the timings obtained as above, the 
"_retuned" line was measured after re-tuning (make -C tune tune).

I don't know how the generation of FFT_TABLE works, should we change it 
somehow?

Ĝis,
m
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fft_with_bknp1.png
Type: image/png
Size: 3731 bytes
Desc: not available
URL: <https://gmplib.org/list-archives/gmp-devel/attachments/20220218/300c0466/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fft_retuned.png
Type: image/png
Size: 4634 bytes
Desc: not available
URL: <https://gmplib.org/list-archives/gmp-devel/attachments/20220218/300c0466/attachment-0001.png>


More information about the gmp-devel mailing list