Small operands gcd improvements

Torbjörn Granlund tg at
Fri Jul 26 13:37:03 UTC 2019

GMP's gcd performance for two-limb operands is quite poor.

The gcd_2 function of mpn/generic/gcd.c uses a loop with an randomly
taken branch, and therefore is slow.  This function is also not
overridable in assembly code.  It drops into mpn_gcd_1 except in
unlikely cases, but mpn_gcd_1 is overkill and incurs unnecessary
overhead as it accepts an n-limb operand.

We have mpn_gcd_1 in assembly for many machines, but no "mpn_gcd_11"
which accepts just two 1-limb operands.

What can we do?

1. Define mpn_gcd_11.
2. Add a new entry point mpn_gcd_11 to existing assembly mpn_gcd_1.
3. Allow for overrriding gcd_2 in assembly (and probably rename it
4. Implement gcd_22 in assembly for main CPU families.

This is not much work.

Please encrypt, key id 0xC8601622

More information about the gmp-devel mailing list