Improved gcd_1 code
tg at gmplib.org
Tue Mar 13 13:19:18 CET 2012
nisse at lysator.liu.se (Niels Möller) writes:
But I'm not sure which loop you're referring to, when zerotab is
disabled, it's using plain count_trailing_zeros macro.
I misread the strip_u_maybe beautyness...
Another alternative for a branch-free count_trailing_zeros would be to
base it on an unrolled popcount (of (x-1) & ~x).
A properly sized table will beat any count_trailing_zeros which takes
more than about 5 cycles. I am not sure what an "unrolled popcount"
might mean, but I suspect it is a lot more than 5 cycles.
> a count_trailing_zeros_gcd should be added which is to be optimised
> for small counts.
Makes sense. I wonder if that should loop, or return MIN(SOME_LIMIT,
correct ctz) leaving to the caller to loop.
I think it should always finish the job, following the common style of
the USE_ZEROTAB code. It should use a separate table, parallel to
More information about the gmp-devel