gcd_22

Sun Aug 25 14:33:18 UTC 2019

Ciao,

Il Dom, 25 Agosto 2019 9:34 am, Niels Möller ha scritto:
> tg at gmplib.org (Torbjörn Granlund) writes:

>> Should we have gcd_33, gcd_44, and gcd_55 also?

> I think it would be more beneficial to optimize hgcd2.

> E.g., 3-limb gcd should typically boil down to one hgcd2, a couple of
> 3:1 mul_1/addmul_1, and one call to gcd_22. It's not obvious that a
> gcd_33 using the binary algorithm will be faster.

On shell I get:
$ tune/speed -CD -s16-64 -t48 mpn_gcd_11 mpn_gcd_22
overhead 6.84 cycles, precision 10000 units of 2.86e-10 secs, CPU freq
3500.09 MHz
           mpn_gcd_11    mpn_gcd_22
16         (#78.3223)    (134.4004)
64            #4.0877        7.8639

$ tune/speed -CD -s1-4 mpn_gcd
overhead 6.83 cycles, precision 10000 units of 2.86e-10 secs, CPU freq
3500.09 MHz
              mpn_gcd
1          (350.7188)
2            519.4961
3           3465.1028
4           2168.9167

If I'm not reading this timings wrongly, this means that with the current
code (disregarding the overhead, for those 64-bits limbs)
the bits in the limb 1 require 4 cycles each;
the bits in the limb 2 require 8 cycles each;
the bits in the limb 3 require 54 cycles each;
the bits in the limb 4 require 33 cycles each...

On the same machine, with ABI=32, gcd_22.c is used:

$ tune/speed -CD -s8-32 -t24 mpn_gcd_11 mpn_gcd_22
overhead 6.86 cycles, precision 10000 units of 2.86e-10 secs, CPU freq
3500.09 MHz
           mpn_gcd_11    mpn_gcd_22
8          (#45.9492)    (126.3672)
32            #4.5485       17.2290

$ tune/speed -CD -s1-4 mpn_gcd
overhead 6.85 cycles, precision 10000 units of 2.86e-10 secs, CPU freq
3500.09 MHz
              mpn_gcd
1          (270.6172)
2            538.6328
3           1797.6500
4           1258.4594

the bits in the limb 1 require 4.5 cycles each;
the bits in the limb 2 require 17 cycles each;
the bits in the limb 3 require 56 cycles each;
the bits in the limb 4 require 39 cycles each...

It really seem that the _33 case need some work.

Ĝis,
m

-- 
http://bodrato.it/papers/