Anomaly in mpn_sqrtrem and mpn_rootrem
bodrato at mail.dm.unipi.it
Fri Jul 3 05:06:41 UTC 2015
Hi,
On Wed, 24 June 2015 9:53 am, Torbjörn Granlund wrote:
> bodrato at mail.dm.unipi.it writes:
> But the code I'm experimenting with is not ready yet, it supports only
> even sizes currently.
>
> That's another 10% speedup.
Yes... unfortunately I did not have the time to adapt it to odd sizes, but I
decided to push it anyway, so that anyone can look into it and comment on or
improve it.
The speed of revision 16730:
$ tune/speed -cp100000000 -s4093-4098 mpn_sqrt mpn_root.2 mpn_sqrtrem
overhead 5.90 cycles, precision 100000000 units of 2.86e-10 secs, CPU freq 3500.08 MHz
          mpn_sqrt    mpn_root.2   mpn_sqrtrem
4093  #2120836.00    2286161.23    2450569.02
4094  #2107173.02    2358206.40    2507010.17
4095  #2126777.63    2392313.42    2458527.34
4096  #2172714.17    2373645.18    2470476.63
4097  #2176120.28    2358870.39    2542117.51
4098  #2193557.39    2374669.59    2513122.24
The speed with the new code (current repo):
$ tune/speed -cp100000000 -s4093-4098 mpn_sqrt mpn_root.2 mpn_sqrtrem
overhead 5.86 cycles, precision 100000000 units of 2.86e-10 secs, CPU freq 3500.08 MHz
          mpn_sqrt    mpn_root.2   mpn_sqrtrem
4093  #2103034.46    2307752.77    2442488.63
4094  #1831188.02    2367552.04    2469309.74
4095  #2112866.69    2407388.57    2453779.52
4096  #1849300.02    2377781.13    2416991.81
4097  #2144198.04    2376456.38    2492709.77
4098  #1853563.19    2379253.13    2469931.56
... yes, I know, we really need to improve odd sizes too...
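To put numbers on the mpn_sqrt column, here is a small sketch that compares the two runs above. The figures are copied from the tables; the names old, new and reduction are only illustrative. On these timings the even sizes come out roughly 13-15% faster, while the odd sizes change by less than 2%.

```python
# mpn_sqrt timings copied from the two tune/speed runs above
# (size in limbs -> reported time for revision 16730 and for the new code).
old = {4093: 2120836.00, 4094: 2107173.02, 4095: 2126777.63,
       4096: 2172714.17, 4097: 2176120.28, 4098: 2193557.39}
new = {4093: 2103034.46, 4094: 1831188.02, 4095: 2112866.69,
       4096: 1849300.02, 4097: 2144198.04, 4098: 1853563.19}


def reduction(size):
    """Fraction of the old time saved by the new code for this size."""
    return 1.0 - new[size] / old[size]


for size in sorted(old):
    parity = "even" if size % 2 == 0 else "odd"
    print(f"{size} ({parity}): {100 * reduction(size):5.2f}% less time")
```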
> My guess is that a division-free iteration would give another 10%, and
> then using David's mulmid in that code would improve things by...10%.
Maybe...
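For readers unfamiliar with the term, a "division-free iteration" usually means a Newton iteration on the reciprocal square root, which needs only multiplications. The following is a floating-point sketch of that general idea, not the GMP code and not a multi-limb implementation; the function name, the starting value and the restriction to 1 <= a < 4 are assumptions made for the illustration.

```python
def sqrt_division_free(a, iterations=8):
    """Approximate sqrt(a) for 1 <= a < 4 without any division.

    Newton's iteration for y = 1/sqrt(a) is x <- x * (3 - a*x*x) / 2.
    The halving is written as a multiplication by 0.5, so the loop
    contains only multiplications and one subtraction.
    """
    if not 1.0 <= a < 4.0:
        raise ValueError("this sketch assumes a normalized input, 1 <= a < 4")
    x = 0.5  # assumed starting guess; inside the convergence region for 1 <= a < 4
    for _ in range(iterations):
        x = x * (1.5 - 0.5 * a * x * x)
    # sqrt(a) = a * (1/sqrt(a)), again a multiplication
    return a * x


print(sqrt_division_free(2.0))
```

Each step roughly doubles the number of correct digits, which is why a handful of iterations is enough in double precision.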
Best regards,
m
--
http://bodrato.it/
More information about the gmp-devel mailing list