gcd_22

Niels Möller nisse at lysator.liu.se
Fri Aug 16 06:32:24 UTC 2019


tg at gmplib.org (Torbjörn Granlund) writes:

> nisse at lysator.liu.se (Niels Möller) writes:
>
>   Attached is a patch to move the current (slow) gcd_2 to a separate file, rename
>   it mpn_gcd_22, change argument and return type, and add a basic test.
>
> Nice!

Pushed now.

> Speed support would be very welcome!

Done! To get to measure only the gcd_22 loop, I reused the speed macros
for gcd_11, but call gcd_22 (al, al, bl, bl), in effect premultiplying
both inputs with a common factor (B+1). On my machine,

$ ./tune/speed -s 1-64 -t 3 -C mpn_gcd_11 mpn_gcd_22
overhead 4.05 cycles, precision 10000 units of 5.87e-10 secs, CPU freq 1703.77 MHz
           mpn_gcd_11    mpn_gcd_22
1             #9.9349       11.4688
4             #3.7217        6.5098
7             #3.6897        6.7366
10            #3.8477        7.9195
13            #4.0583        9.3407
16            #4.1460        9.9268
19            #4.4428       10.8306
22            #4.3512       10.9613
25            #4.3475       10.9706
28            #4.3407       11.3027
31            #4.3135       10.9393
34            #4.2116       11.6335
37            #4.3125       11.5507
40            #4.1918       11.9219
43            #4.2716       10.2703
46            #3.9071       10.2429
49            #3.9190       10.2698
52            #4.0637       10.3419
55            #4.0631       11.6783
58            #4.1228       11.8280
61            #4.0512       10.3701
64            #3.8345       10.3486

So the gcd_22 code with all the branches needs about 11 cycles per input
bit. gcd_11 is coreihwl/gcd_11.asm in this build.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


More information about the gmp-devel mailing list