Improved gcd_1 code
Niels Möller
nisse at lysator.liu.se
Tue Mar 13 13:38:44 CET 2012
Torbjorn Granlund <tg at gmplib.org> writes:
> A properly sized table will beat any count_trailing_zeros which takes
> more than about 5 cycles.
Ok. I take it this is still in in the gcd context, where the count is
expected to be small?
> I am not sure what an "unrolled popcount"
> might mean, but I suspect it is a lot more than 5 cycles.
I just meant something like
#define count_trailing_zeros (count, x) do { \
mp_limb_t = _t = (x); \
_t = (_t-1) & ~_t; \
_t = ((_t >> 1 ) & 0x55..55) + (_t & 0x55..55); \
_t = ((_t >> 2 ) & 0x33..33) + (_t & 0x33..33); \
... \
(count) = _t; \
} while (0)
Not useful in the gcd context (where the count is expected to be small),
possibly never.
> > a count_trailing_zeros_gcd should be added which is to be optimised
> > for small counts.
By the way, are there any uses of count_trailing_zeros where the result
is *not* expected to be small, for random inputs to the calling
function? Maybe gcd is not so special?
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel
mailing list