Anomaly in mpn_sqrtrem and mpn_rootrem

bodrato at mail.dm.unipi.it
Fri Jul 3 05:06:41 UTC 2015


Hi,

On Wed, 24 June 2015 9:53 am, Torbjörn Granlund wrote:
> bodrato at mail.dm.unipi.it writes:
>   But the code I'm experimenting with is not ready yet, it supports only
>   even sizes currently.
>
> That's another 10% speedup.

Yes... unfortunately I did not have the time to adapt it to odd sizes, but
I decided to push it anyway, so that anyone can look into it and comment on
or improve it.

The speed of revision 16730:
$ tune/speed -cp100000000 -s4093-4098 mpn_sqrt mpn_root.2 mpn_sqrtrem
overhead 5.90 cycles, precision 100000000 units of 2.86e-10 secs, CPU freq 3500.08 MHz
             mpn_sqrt    mpn_root.2   mpn_sqrtrem
4093      #2120836.00    2286161.23    2450569.02
4094      #2107173.02    2358206.40    2507010.17
4095      #2126777.63    2392313.42    2458527.34
4096      #2172714.17    2373645.18    2470476.63
4097      #2176120.28    2358870.39    2542117.51
4098      #2193557.39    2374669.59    2513122.24

The speed with the new code (current repo):
$ tune/speed -cp100000000 -s4093-4098 mpn_sqrt mpn_root.2 mpn_sqrtrem
overhead 5.86 cycles, precision 100000000 units of 2.86e-10 secs, CPU freq 3500.08 MHz
             mpn_sqrt    mpn_root.2   mpn_sqrtrem
4093      #2103034.46    2307752.77    2442488.63
4094      #1831188.02    2367552.04    2469309.74
4095      #2112866.69    2407388.57    2453779.52
4096      #1849300.02    2377781.13    2416991.81
4097      #2144198.04    2376456.38    2492709.77
4098      #1853563.19    2379253.13    2469931.56

... yes, I know, we really need to improve odd sizes as well...
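
To make the even/odd difference easier to see, here is a small sketch that
computes the relative gain of mpn_sqrt per size from the two tables above.
The numbers are copied from the tune/speed output; the helper name
speedup_percent and the dictionary layout are only illustrative and are not
part of GMP. With these figures the even sizes come out roughly 13% to 16%
faster, while the odd sizes change by less than 2%.

```python
# mpn_sqrt timings copied from the two tune/speed runs above.
# Layout (illustrative only): size -> (revision 16730, current repo)
MPN_SQRT_TIMINGS = {
    4093: (2120836.00, 2103034.46),
    4094: (2107173.02, 1831188.02),
    4095: (2126777.63, 2112866.69),
    4096: (2172714.17, 1849300.02),
    4097: (2176120.28, 2144198.04),
    4098: (2193557.39, 1853563.19),
}


def speedup_percent(old, new):
    """Return the percentage of time saved when going from old to new."""
    return 100.0 * (old - new) / old


# Print one line per size, tagged with its parity.
for size, (old, new) in sorted(MPN_SQRT_TIMINGS.items()):
    parity = "even" if size % 2 == 0 else "odd"
    print(f"{size} ({parity}): {speedup_percent(old, new):5.2f}% faster")
```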

> My guess is that a division-free iteration would give another 10%, and
> then using David's mulmid in that code would improve things by...10%.

Maybe...

Best regards,
m

-- 
http://bodrato.it/
