tg at gmplib.org
Thu Oct 31 16:00:46 UTC 2019
nisse at lysator.liu.se (Niels Möller) writes:
Updated version below, without input_v and corresponding divisions. Also
one of the binary variants use binv. While debugging, I found that the
accumulated shift may well exceed GMP_LIMB_BITS (shouldn't be so
surprising), but I think it must be < 2 * GMP_LIMB_BITS. And there are a
few corner cases.
I guess it's possible to do all cofactor shifts as part of the loop, by
shifting only one bit per iteration, and making the subtraction
conditional on both nubmers being odd. That would be like a single-limb
variant of mpn_sec_invert (but with a stop condition to exit early).
Unclear if that can be a winner, with less progress per iteration in the
main loop. And it may produce a cofactor that's unnecessarily large in
the g > 1 case.
BTW, the binv/redc part uses an umulhi operation. Maybe we should add
that to longlong.h?
I believe umul_ppmm with unused low result is adequate. When there are
separate instructions for low and high parts, we have made sure to
implement umul_ppmm in such a way that the compiler will eliminate the
How far did you come with gcdext_1? Do you have any timing improvements
to report? :-)
Please encrypt, key id 0xC8601622
More information about the gmp-devel