gcd_22

Torbjörn Granlund tg at gmplib.org
Tue Aug 27 18:26:06 UTC 2019


Some cleanups and tweaks later.  The gcd_33 based on this, compiled with
gcc 8.3, runs at 30 cycles per iteration.  (Note, not cycles per bit!)

My best gcd_33 in assembly runs at 10 cycles per iteration.

The former uses memory-based operands.  The latter keeps everything in
registers.

If we wrote an assembly variant of this and inlined sub_3 and rshift_3,
I would expect it to run at about 15 cycles per iteration.

(Timings are for AMD Ryzen.)
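
For readers without the attachment, the following is a minimal sketch of
what a 3-limb binary gcd built on sub_3 and rshift_3 helpers might look
like.  It is not the code from gcd-mpn.c.  The limb type, the helper
signatures, the requirement that both inputs be odd, and the use of the
GCC/Clang builtin __builtin_ctzll are all assumptions made for
illustration.  One "iteration" here is one pass through the loop body: a
subtraction followed by a shift that strips the trailing zero bits.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* assumed limb type; least significant limb first */

/* r = a - b over 3 limbs; returns the borrow out (1 if a < b). */
static inline limb_t sub_3(limb_t r[3], const limb_t a[3], const limb_t b[3])
{
    limb_t borrow = 0;
    for (int i = 0; i < 3; i++) {
        limb_t d = a[i] - b[i];
        limb_t b1 = a[i] < b[i];       /* borrow from the limb subtraction */
        r[i] = d - borrow;
        borrow = b1 | (d < borrow);    /* borrow from applying the carry-in */
    }
    return borrow;
}

/* r = a >> cnt over 3 limbs, for 1 <= cnt <= 63.  r may alias a. */
static inline void rshift_3(limb_t r[3], const limb_t a[3], unsigned cnt)
{
    r[0] = (a[0] >> cnt) | (a[1] << (64 - cnt));
    r[1] = (a[1] >> cnt) | (a[2] << (64 - cnt));
    r[2] = a[2] >> cnt;
}

/* g = gcd(a, b) for odd 3-limb a and b (hypothetical interface). */
void gcd_33(limb_t g[3], const limb_t a[3], const limb_t b[3])
{
    limb_t u[3] = { a[0], a[1], a[2] };
    limb_t v[3] = { b[0], b[1], b[2] };
    limb_t t[3];

    assert((u[0] & 1) && (v[0] & 1));  /* sketch handles odd inputs only */

    for (;;) {
        if (sub_3(t, u, v)) {
            /* u < v: use t = v - u instead, and keep the smaller value in v */
            sub_3(t, v, u);
            v[0] = u[0]; v[1] = u[1]; v[2] = u[2];
        }
        if ((t[0] | t[1] | t[2]) == 0)
            break;                     /* u == v, so v is the gcd */

        /* t is nonzero and even; strip whole zero limbs, then zero bits */
        while (t[0] == 0) {
            t[0] = t[1]; t[1] = t[2]; t[2] = 0;
        }
        unsigned cnt = (unsigned) __builtin_ctzll(t[0]);
        if (cnt > 0)                   /* a shift by 64 - 0 would be undefined */
            rshift_3(t, t, cnt);

        u[0] = t[0]; u[1] = t[1]; u[2] = t[2];
    }
    g[0] = v[0]; g[1] = v[1]; g[2] = v[2];
}
```

In this sketch u, v and t are arrays, so a compiler may keep them in
memory; an assembly version with the two helpers inlined could hold all
nine limbs in registers, which is the difference the timings above are
about.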

-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcd-mpn.c
Type: application/octet-stream
Size: 3438 bytes
Desc: not available
URL: <https://gmplib.org/list-archives/gmp-devel/attachments/20190827/fc1f8c23/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: x64-mpn_N.asm
Type: application/octet-stream
Size: 765 bytes
Desc: not available
URL: <https://gmplib.org/list-archives/gmp-devel/attachments/20190827/fc1f8c23/attachment-0001.obj>
-------------- next part --------------

-- 
Torbjörn
Please encrypt, key id 0xC8601622

