gcd_11 without abs
Niels Möller
nisse at lysator.liu.se
Thu Nov 19 15:20:17 UTC 2020
Hi,
I had one idea which I wonder if it's useful on some architecture. I
think all current variants of the binary gcd used for gcd_11 try to
replace (a,b) with min(a,b), abs(a-b), and then take out powers of two.
One could get rid of the absolute value by letting one of the working
values be negative. Something like this, in pseudo code
b = - b
while ( (sum = a + b) != 0)
{
if (sum > 0) a = sum;
else b = sum;
}
Note that we always have a > 0 and b < 0, so one doesn't need to
interpret high bit as a sign bit. I had a look at the current code for
arm (v6t2) and x86_64 (core2), and there I don't see any clear
improvement. For ARM, it would be something like
neg v0
...
top:
rbit t, s
clz t, t
lsr s, s, t
movcs u0, s
movcc v0, s
adds s, u0, v0
bne L(top)
I don't expect that to be faster than the current code (but I haven't
tried it out). The current code does the absolute value neatly using
rsbcc r3, r3, #0.
For x86_64:
neg v0
...
top:
cmovc s, v0
cmovnc s, v0
lea (u0, v0), s
bsf s, %rcx
shr R8(%rcx), s
mov u0, tmp
add v0, tmp C For carry only
jnz top
That's one instruction less than the current code, but looks quite
clumsy. It's annoying that bsf clobbers the carry flag. Since we can't
use carry from the first addition anyway, we can use lea for that and
save a mov. But then we still need a mov + add to get the carry later
on; here negation works against us, since that makes the compare
instruction useless.
Maybe there's some other architecture where negation is a clear win?
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel
mailing list