Alternative div_qr_1

Torbjorn Granlund tg at gmplib.org
Mon Jun 21 19:23:18 CEST 2010


nisse at lysator.liu.se (Niels Möller) writes:

  nisse at lysator.liu.se (Niels Möller) writes:
  
  > seems to be compiled to a branch rather than a cmov by gcc-4.3.2. Maybe
  > gcc-4.4.4 or gcc-4.5.0 is smarter.
  
  Now I've tested it on k7, using gcc-4.4.4.
  
    1500           8.1747       #8.1527
  
  I don't understand why the normalized case seems to have more expensive
  precomputation (but I haven't looked at the code).

So, it seems like the k7 assembly code needs to be rewritten, to improve
the speed for small operands.

Did you use the recent C and assembly code which I pushed just a week
ago for this comparison?  I think the cps function is much better there
than in previous files.

-- 
Torbjörn

