Likely GMP bug

Niels Möller nisse at
Wed May 30 08:20:53 UTC 2018

"Marco Bodrato" <bodrato at> writes:

> ... the effect is that in many cases (if U don't need reduction modulo V)
> the trailing zeros of U are removed twice.

But you may be right that it's important for performance to avoid a
redundant count_trailing_zeros on u.

It seems tricky to avoid that, without code duplication or complications
in the code flow. For my personal preference, tradeoff threshold between
goto and duplicated code is around 5-10 duplicated code lines.

Thinking about micro optimizations... Consider

>       count_trailing_zeros (c, ulimb);
>       ulimb = (ulimb >> 1) >> c;


>       count_trailing_zeros (c, ulimb);
>       ulimb >>= (c + 1);

I wonder if maybe it's preferable to always use the first variant? Both
should compile to three instructions (assuming we have a ctz
instruction), ctz + shift + shift, or ctz + add + shift. But with the
first variant, the first shift can be issued in parallel with ctz,
reducing the critical path of the gcd loop.

We'd need some benchmarking to see if it makes a difference.


Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.

More information about the gmp-bugs mailing list