div_qr_1 interface
Torbjorn Granlund
tg at gmplib.org
Mon Oct 21 23:03:50 CEST 2013
nisse at lysator.liu.se (Niels Möller) writes:
> The problem is the final use, where Q2 is added, with carry, to a
> different register. It's tempting to replace
>
>   adc Q1I, Q2
>
> with
>
>   sbb Q2, Q1I
>
> and negated Q2, but I'm afraid that will get the sense of the carry
> wrong. Do you see any trick to get that right without negating Q2
> somewhere along the way?
Well, no.
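To make the carry-sense problem concrete, here is a small C model of the two instructions. It is only a sketch, not the div_qr_1 code: the function names are made up for illustration, operands are modelled as 64-bit words, and the carry flag is passed in as 0 or 1. With AT&T operand order, adc adds the incoming carry, while sbb with a negated Q2 subtracts it, so the two agree only when the carry is clear.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "adc Q1I, Q2" (AT&T order): Q2 <- Q2 + Q1I + CF.
   Hypothetical helper, for illustration only. */
uint64_t adc_model(uint64_t q2, uint64_t q1i, unsigned cf)
{
    return q2 + q1i + cf;
}

/* Model of "sbb Q2, Q1I" where the register holds nq2 = -Q2:
   Q1I <- Q1I - nq2 - CF = Q1I + Q2 - CF (mod 2^64). */
uint64_t sbb_negated_model(uint64_t nq2, uint64_t q1i, unsigned cf)
{
    return q1i - nq2 - cf;
}

/* The two forms agree when CF = 0 and differ by 2 when CF = 1. */
void check_carry_sense(void)
{
    uint64_t q2 = 5, q1i = 7;
    uint64_t nq2 = (uint64_t)0 - q2;   /* two's complement negation */

    assert(adc_model(q2, q1i, 0) == sbb_negated_model(nq2, q1i, 0));
    assert(adc_model(q2, q1i, 1) - sbb_negated_model(nq2, q1i, 1) == 2);
}
```

Note that the sbb form also leaves the result in the other register, which the model ignores.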
> It might also be possible to replace the early loop "and" stuff by
> cmov.
Maybe, but the simple way to do conditional addition with lea + cmov
won't do, since we also need carry out.
Does it matter if we do
  mov B2, r
  and mask, r
or
  mov $0, r
  cmovc B2, r
?
The latter tends to be faster on AMD CPUs. Not sure about Intel.
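As a rough C model of the two alternatives (a sketch under assumptions, not the actual loop): I assume mask is either 0 or all ones and corresponds to the same condition that cmovc would test, and all function names are invented for illustration. Both ways select either B2 or 0; the following add still has to produce a carry out, which is what rules out the lea-based conditional add.

```c
#include <assert.h>
#include <stdint.h>

/* "mov B2, r ; and mask, r" with mask assumed to be 0 or ~0. */
uint64_t select_and(uint64_t b2, uint64_t mask)
{
    return b2 & mask;
}

/* "mov $0, r ; cmovc B2, r" with cf the carry flag (0 or 1).
   mov leaves the flags alone, unlike xor, which would clear CF
   before cmovc gets to test it. */
uint64_t select_cmov(uint64_t b2, unsigned cf)
{
    return cf ? b2 : 0;
}

/* Carry out of the 64-bit addition x + addend (1 on wraparound). */
unsigned add_carry_out(uint64_t x, uint64_t addend)
{
    return (uint64_t)(x + addend) < x;
}

/* Both selections give the same addend for matching mask and cf. */
void check_select_equivalence(void)
{
    uint64_t b2 = 0x123456789abcdef0u;

    assert(select_and(b2, ~(uint64_t)0) == select_cmov(b2, 1));
    assert(select_and(b2, 0) == select_cmov(b2, 0));
}
```

Which of the two is faster is a property of the CPU, not of this model; the C compiler is free to pick either form for both functions.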
--
Torbjörn