Anomaly in mpn_sqrtrem and mpn_rootrem
bodrato at mail.dm.unipi.it
Tue Jun 23 22:06:32 UTC 2015
Ciao,
On Sat, 13 June 2015 11:14 am, Torbjörn Granlund wrote:
> bodrato at mail.dm.unipi.it writes:
> I wrote a simple patch (it touches very few lines) that allows skipping
> the final squaring when mpn_sqrtrem is called with a NULL argument and
> Nice speedup!
>
> I suppose we really ought to add a limb also for even sizes (at least
I pushed a patch working for both even and odd sizes.
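For readers less familiar with the two cases being timed, here is a minimal sketch of what each one returns, written with Python integers instead of limb arrays. It only illustrates the definition (root s and remainder r with n = s^2 + r). It is not how GMP computes the root, and the function names sqrtrem and sqrt_only are hypothetical.

```python
import math

def sqrtrem(n):
    """Return (s, r) with s = floor(sqrt(n)) and r = n - s*s."""
    s = math.isqrt(n)
    r = n - s * s  # extra squaring and subtraction, needed only for the remainder
    return s, r

def sqrt_only(n):
    """Return only s = floor(sqrt(n)); no remainder is computed."""
    return math.isqrt(n)

n = 10**40 + 12345
s, r = sqrtrem(n)
# The remainder always satisfies 0 <= r <= 2*s.
print(s * s + r == n, 0 <= r <= 2 * s, sqrt_only(n) == s)
```

In this toy version the remainder costs one extra squaring, which is the kind of work a caller that only wants the root would like to avoid.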
The timings before the patch:
$ tune/speed -c -p100000000 -s 7-100000 -f2.8 mpn_sqrt mpn_sqrtrem
overhead 5.84 cycles, precision 100000000 units of 2.86e-10 secs, CPU freq 3500.08 MHz
            mpn_sqrt    mpn_sqrtrem
7            #497.76         498.36
19           1415.46       #1409.75
53          #3598.58        3599.66
148        #12554.66       12557.10
414         63269.61      #62597.68
1159      #329710.92      329812.60
3245      1632052.49    #1631323.93
9086      7404016.67    #7368246.67
25440   #28848619.50    28911118.50
71232   100379740.00  #100338201.00
And after pushing it:
$ tune/speed -c -p100000000 -s 7-100000 -f2.8 mpn_sqrt mpn_sqrtrem
overhead 5.85 cycles, precision 100000000 units of 2.86e-10 secs, CPU freq 3500.08 MHz
            mpn_sqrt    mpn_sqrtrem
7            #440.62         502.41
19          #1175.96        1396.98
53          #2990.28        3585.21
148        #10800.62       12505.17
414        #50732.77       62712.75
1159      #271406.46      329632.92
3245     #1411601.87     1634032.88
9086     #6373140.24     7373087.00
25440   #25487553.00    28958413.00
71232   #90226456.00   100523451.00
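To read the gain off the table more easily, the short sketch below computes the ratio of the mpn_sqrt column to the mpn_sqrtrem column for each size. The figures are copied from the run above with the '#' markers dropped. The dictionary layout and the helper name are only illustrative choices.

```python
# Cycle counts from the run after the patch.
# Each entry: size in limbs -> (mpn_sqrt, mpn_sqrtrem).
after = {
    7: (440.62, 502.41),
    19: (1175.96, 1396.98),
    53: (2990.28, 3585.21),
    148: (10800.62, 12505.17),
    414: (50732.77, 62712.75),
    1159: (271406.46, 329632.92),
    3245: (1411601.87, 1634032.88),
    9086: (6373140.24, 7373087.00),
    25440: (25487553.00, 28958413.00),
    71232: (90226456.00, 100523451.00),
}

def sqrt_to_sqrtrem_ratio(timings):
    """Map each size to (mpn_sqrt cycles) / (mpn_sqrtrem cycles)."""
    return {n: t_sqrt / t_sqrtrem for n, (t_sqrt, t_sqrtrem) in timings.items()}

ratios = sqrt_to_sqrtrem_ratio(after)
for n, ratio in ratios.items():
    saved = (1.0 - ratio) * 100.0
    print(f"{n:>6} limbs: ratio {ratio:.3f}, about {saved:.1f}% fewer cycles")
```

On these numbers mpn_sqrt needs roughly 81% to 90% of the cycles of mpn_sqrtrem, with the largest gain at 414 limbs.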
To be honest, mpn_sqrt can be sped up further by replacing the final
divrem with a div_q. Expected timings follow:
$ tune/speed -c -p100000000 -s 9086-100000 -f2.8 mpn_sqrt mpn_sqrtrem
overhead 5.84 cycles, precision 100000000 units of 2.86e-10 secs, CPU freq 3500.08 MHz
            mpn_sqrt    mpn_sqrtrem
9086     #5603297.21     7359423.00
25440   #22752715.40    28995848.50
71232   #81207275.50   100360485.00
But the code I'm experimenting with is not ready yet; it currently
supports only even sizes.
Regards,
m
--
http://bodrato.it/papers/