sqrt algorithm

Marco Bodrato bodrato at mail.dm.unipi.it
Thu Aug 6 23:00:56 UTC 2015


Ciao,

On Mon, August 3, 2015 8:48 pm, Marco Bodrato wrote:
> Do we need a sqrlo_basecase? The D&C version of sqrlo would use a full
> squaring and a single mullo, so that the base_cases for sqrlo_dc are
> sqr_basecase, mul_basecase and mullo_basecase.
>
> Writing a working sqrlo based on the current mullo code should be easy.

I pushed an implementation for mpn_sqrlo, it is based on mpn_mullo.
I pushed also a primitive mpn_sqrlo_basecase, based on sqr_basecase.

The range where it is faster than sqrmod is narrow:

tune/speed-cs 1-9000 -f2 mpn_sqr mpn_sqrlo mpn_sqrmod_bnm1
overhead 5.84 cycles, units of 2.86e-10 secs, CPU freq 3500.08 MHz
              mpn_sqr     mpn_sqrlo mpn_sqrmod_bnm1
1              #15.58         19.88         36.04
2              #19.48         20.54         39.94
4              #51.43         56.01         81.91
8              139.29       #114.17        174.01
16             453.79       #308.36        469.08
32            1495.95       #974.61       1146.67
64            4805.27       3421.11      #3096.75
128          14815.55      10885.63      #8784.71
256          44312.53      34502.41     #24808.02
512         119390.17     109101.40     #72427.37
1024        310496.06     284290.99    #209454.00
2048        868589.83     784437.11    #467387.69
4096       2268668.51    2162067.98   #1014982.18
8192       4887540.14    4951380.77   #2237012.00

Anyway sqrlo should be improved, at least by tuning its internal threshold.

The current powlo needs sqrlo...

Regards,
m

-- 
http://bodrato.it/papers/



More information about the gmp-devel mailing list