Alternative div_qr_1
Torbjorn Granlund
tg at gmplib.org
Mon Jun 21 19:23:18 CEST 2010
nisse at lysator.liu.se (Niels Möller) writes:
> seems to be compiled to a branch rather than a cmov by gcc-4.3.2. Maybe
> gcc-4.4.4 or gcc-4.5.0 is smarter.
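For readers unfamiliar with the branch-versus-cmov issue, here is a minimal sketch in C. It is not the code under discussion; the function names `adjust_ternary` and `adjust_mask` are hypothetical and chosen only for illustration. The first form leaves the choice between a conditional jump and a conditional move to the compiler, while the second expresses the same adjustment with a mask so that the source contains no conditional at all.

```c
#include <stdint.h>

/* Hypothetical helper: conditional adjustment written with a ternary.
   The compiler is free to emit either a branch or a cmov for this. */
static uint64_t adjust_ternary(uint64_t r, uint64_t d)
{
    return (r >= d) ? r - d : r;
}

/* Hypothetical helper: the same adjustment written with a mask.
   mask is all ones when r >= d and zero otherwise, so d is
   subtracted only in the first case. */
static uint64_t adjust_mask(uint64_t r, uint64_t d)
{
    uint64_t mask = -(uint64_t)(r >= d);
    return r - (d & mask);
}
```

Both functions return the same value for every input; whether either one actually ends up as a cmov depends on the compiler version and flags, which is the point being raised in the quoted text.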
Now I've tested it on k7, using gcc-4.4.4.
1500 8.1747 #8.1527
I don't understand why the normalized case seems to have more expensive
precomputation (but I haven't looked at the code).
So, it seems like the k7 assembly code needs to be rewritten, to improve
the speed for small operands.
Did you use the recent C and assembly code which I pushed just a week
ago for this comparison? I think the cps function is much better there
than in previous files.
--
Torbjörn