Improved gcd_1 code

Niels Möller nisse at lysator.liu.se
Tue Mar 13 13:38:44 CET 2012


Torbjorn Granlund <tg at gmplib.org> writes:

> A properly sized table will beat any count_trailing_zeros which takes
> more than about 5 cycles.

Ok. I take it this is still in in the gcd context, where the count is
expected to be small?

> I am not sure what an "unrolled popcount"
> might mean, but I suspect it is a lot more than 5 cycles.

I just meant something like

#define count_trailing_zeros (count, x) do { \
  mp_limb_t = _t = (x); \
  _t = (_t-1) & ~_t; \
  _t = ((_t >> 1 ) & 0x55..55) + (_t & 0x55..55); \
  _t = ((_t >> 2 ) & 0x33..33) + (_t & 0x33..33); \
  ... \
  (count) = _t; \
} while (0)

Not useful in the gcd context (where the count is expected to be small),
possibly never.

>   > a count_trailing_zeros_gcd should be added which is to be optimised
>   > for small counts.

By the way, are there any uses of count_trailing_zeros where the result
is *not* expected to be small, for random inputs to the calling
function? Maybe gcd is not so special?

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.


More information about the gmp-devel mailing list