Small operands gcd improvements
Torbjörn Granlund
tg at gmplib.org
Fri Jul 26 13:37:03 UTC 2019
GMP's gcd performance for two-limb operands is quite poor.
The gcd_2 function of mpn/generic/gcd.c uses a loop with an randomly
taken branch, and therefore is slow. This function is also not
overridable in assembly code. It drops into mpn_gcd_1 except in
unlikely cases, but mpn_gcd_1 is overkill and incurs unnecessary
overhead as it accepts an n-limb operand.
We have mpn_gcd_1 in assembly for many machines, but no "mpn_gcd_11"
which accepts just two 1-limb operands.
What can we do?
1. Define mpn_gcd_11.
2. Add a new entry point mpn_gcd_11 to existing assembly mpn_gcd_1.
3. Allow for overrriding gcd_2 in assembly (and probably rename it
gcd_22).
4. Implement gcd_22 in assembly for main CPU families.
This is not much work.
--
Torbjörn
Please encrypt, key id 0xC8601622
More information about the gmp-devel
mailing list