Alternative div_qr_1

Torbjorn Granlund tg at gmplib.org
Wed Jun 16 14:53:14 CEST 2010


nisse at lysator.liu.se (Niels Möller) writes:

  Comments appreciated.
  
I made some changes, including replacing the 'cps' function.  Please
look at them.  See attached.

About the inner loop, couldn't 'sub r2,r2' + 'and B2modb,r2' be replaced
by 'mov $0,r2' + 'cmovc B2modb,r2' for a slightly shallower dependency
path (though perhaps not the critical path)?

It seems r2 and t0 could share a register if 'mov t0, r0' were moved
somewhat earlier.  But perhaps that would mess up the pipeline?

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mod_1_1-nisse.asm
Type: application/octet-stream
Size: 4284 bytes
Desc: not available
URL: <http://gmplib.org/list-archives/gmp-devel/attachments/20100616/f999e229/attachment.obj>
-------------- next part --------------

-- 
Torbjörn

