Anomaly in mpn_sqrtrem and mpn_rootrem

bodrato at mail.dm.unipi.it bodrato at mail.dm.unipi.it
Tue Jun 23 22:06:32 UTC 2015


Hi,

On Sat, 13 June 2015 11:14 am, Torbjörn Granlund wrote:
> bodrato at mail.dm.unipi.it writes:
>   I wrote a simple patch (it touches very few lines) that allows skipping
>   the final squaring when mpn_sqrtrem is called with a NULL argument and

> Nice speedup!
>
> I suppose we really ought to add a limb also for even sizes (at least

I pushed a patch working for both even and odd sizes.
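
To illustrate why skipping the remainder can save work, here is a minimal Python sketch. It is not GMP code and does not mirror the mpn_sqrtrem internals; the function names sqrtrem and sqrt_only are made up for this example. It only shows that producing the remainder of a floor square root costs one extra squaring, which a caller wanting only the root does not need.

```python
import math


def sqrtrem(n):
    """Return (s, r) with s = floor(sqrt(n)) and r = n - s*s."""
    s = math.isqrt(n)
    r = n - s * s  # the extra squaring needed only for the remainder
    return s, r


def sqrt_only(n):
    """Return floor(sqrt(n)) without computing the remainder."""
    return math.isqrt(n)


if __name__ == "__main__":
    n = 10**40 + 12345
    s, r = sqrtrem(n)
    # Sanity checks: s is the floor square root and r completes the identity.
    assert s * s <= n < (s + 1) * (s + 1)
    assert s * s + r == n
    assert sqrt_only(n) == s
    print(s, r)
```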

The timings before the patch:

$ tune/speed -c -p100000000 -s 7-100000 -f2.8 mpn_sqrt mpn_sqrtrem
overhead 5.84 cycles, precision 100000000 units of 2.86e-10 secs, CPU freq 3500.08 MHz
             mpn_sqrt   mpn_sqrtrem
7             #497.76        498.36
19            1415.46      #1409.75
53           #3598.58       3599.66
148         #12554.66      12557.10
414          63269.61     #62597.68
1159       #329710.92     329812.60
3245       1632052.49   #1631323.93
9086       7404016.67   #7368246.67
25440    #28848619.50   28911118.50
71232    100379740.00 #100338201.00

And after pushing it:

$ tune/speed -c -p100000000 -s 7-100000 -f2.8 mpn_sqrt mpn_sqrtrem
overhead 5.85 cycles, precision 100000000 units of 2.86e-10 secs, CPU freq 3500.08 MHz
             mpn_sqrt   mpn_sqrtrem
7             #440.62        502.41
19           #1175.96       1396.98
53           #2990.28       3585.21
148         #10800.62      12505.17
414         #50732.77      62712.75
1159       #271406.46     329632.92
3245      #1411601.87    1634032.88
9086      #6373140.24    7373087.00
25440    #25487553.00   28958413.00
71232    #90226456.00  100523451.00
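
For a quick reading of the two tables, the following small Python sketch computes the relative saving in the mpn_sqrt column. The numbers are copied from the two runs above and are assumed to be cycle counts (the runs use the -c option); the helper name saving_percent is made up for this example.

```python
# mpn_sqrt timings copied from the two tune/speed runs above.
sizes = [7, 19, 53, 148, 414, 1159, 3245, 9086, 25440, 71232]
before = [497.76, 1415.46, 3598.58, 12554.66, 63269.61,
          329710.92, 1632052.49, 7404016.67, 28848619.50, 100379740.00]
after = [440.62, 1175.96, 2990.28, 10800.62, 50732.77,
         271406.46, 1411601.87, 6373140.24, 25487553.00, 90226456.00]


def saving_percent(old, new):
    """Relative reduction from old to new, in percent."""
    return 100.0 * (old - new) / old


savings = [saving_percent(b, a) for b, a in zip(before, after)]

if __name__ == "__main__":
    for n, s in zip(sizes, savings):
        print(f"{n:>6} limbs: {s:5.1f}% less time")
```

On these figures the saving for mpn_sqrt falls roughly between 10% and 20%, depending on the size.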

To be honest, it is possible to speed up mpn_sqrt further by replacing the
final divrem with a div_q. Expected timings follow:

$ tune/speed -c -p100000000 -s 9086-100000 -f2.8 mpn_sqrt mpn_sqrtrem
overhead 5.84 cycles, precision 100000000 units of 2.86e-10 secs, CPU freq 3500.08 MHz
             mpn_sqrt   mpn_sqrtrem
9086      #5603297.21    7359423.00
25440    #22752715.40   28995848.50
71232    #81207275.50  100360485.00

But the code I'm experimenting with is not ready yet; it currently supports
only even sizes.

Regards,
m

-- 
http://bodrato.it/papers/



More information about the gmp-devel mailing list