Improved gcd_1 code
Niels Möller
nisse at lysator.liu.se
Tue Mar 13 13:05:02 CET 2012
Torbjorn Granlund <tg at gmplib.org> writes:
> I think a 16 byte zerotab is too small, and should be expanded. (But if
> a tiny tab is to be used, I suppose extraction from a magic limb
> constant is better.) Well, zerotab is unconditionally disabled now and
> a loop is used.
I don't quite remember, but I think I only tried that small zerotab,
and found it wasn't an improvement.
But I'm not sure which loop you're referring to. When zerotab is
disabled, the code uses the plain count_trailing_zeros macro.
Another alternative for a branch-free count_trailing_zeros would be to
base it on an unrolled popcount (of (x-1) & ~x).
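A minimal sketch of that idea, assuming a 64-bit word and a standard
SWAR popcount. The names popcount64 and ctz_popcount are made up for
illustration and are not taken from the GMP sources:

```c
#include <stdint.h>

/* Branch-free population count of a 64-bit word (SWAR reduction).
   Hypothetical helper, not a GMP function. */
static unsigned popcount64(uint64_t v)
{
    v = v - ((v >> 1) & 0x5555555555555555ULL);              /* 2-bit sums */
    v = (v & 0x3333333333333333ULL)
        + ((v >> 2) & 0x3333333333333333ULL);                /* 4-bit sums */
    v = (v + (v >> 4)) & 0x0f0f0f0f0f0f0f0fULL;              /* 8-bit sums */
    return (unsigned)((v * 0x0101010101010101ULL) >> 56);    /* add bytes */
}

/* (x - 1) & ~x has ones exactly at the trailing zero positions of x,
   so its popcount is the trailing zero count.  Returns 64 for x == 0. */
static unsigned ctz_popcount(uint64_t x)
{
    return popcount64((x - 1) & ~x);
}
```

For example, x = 12 (binary 1100) gives (x - 1) & ~x = 3 (binary 0011),
and the popcount of that is 2.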
> a count_trailing_zeros_gcd should be added which is to be optimised
> for small counts.
Makes sense. I wonder if that should loop, or return MIN(SOME_LIMIT,
correct ctz), leaving it to the caller to loop.
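To make the second variant concrete, here is a rough sketch, assuming
a limit of 4 bits and a 16-entry lookup table. CTZ_LIMIT, ctz_capped
and strip_trailing_zeros are hypothetical names chosen for this
example only:

```c
#include <stdint.h>

#define CTZ_LIMIT 4   /* assumed cap, picked only for illustration */

/* Entry i holds MIN(CTZ_LIMIT, ctz(i)); entry 0 means "at least
   CTZ_LIMIT trailing zeros". */
static const unsigned char ctz_small_tab[1 << CTZ_LIMIT] = {
    4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0
};

/* Returns MIN(CTZ_LIMIT, ctz(x)) with no branches. */
static unsigned ctz_capped(uint64_t x)
{
    return ctz_small_tab[x & ((1u << CTZ_LIMIT) - 1)];
}

/* Caller-side loop: strip all trailing zeros from x.
   Requires x != 0, otherwise the loop never terminates. */
static uint64_t strip_trailing_zeros(uint64_t x)
{
    unsigned c;
    do {
        c = ctz_capped(x);
        x >>= c;
    } while (c == CTZ_LIMIT);   /* hit the cap, so there may be more */
    return x;
}
```

With small counts being the common case, the loop in the caller
normally runs just once.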
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.