Small operands gcd improvements
Torbjörn Granlund
tg at gmplib.org
Wed Aug 7 09:28:13 UTC 2019
nisse at lysator.liu.se (Niels Möller) writes:
Just not sure what order to do things. That patch just adds the gcd_11
entrypoint for that arch, with nothing but speed using it. (So I realize
it's not as tested as I thought).
At least it is fast!
If it works out to replace one foo/gcd_1.asm with foo/gcd_11.asm, one by
one, that's faster progress (and the HAVE_NATIVE_mpn_gcd_11 test in
gcd_11.c is unnecessary if we go that way).
I have made the trivial conversion of all gcd_1.asm code but x86-32. I
might check it in soon. Removing gcd_1.asm can be done as a separate
check-in.
We will see a slight slowdown with all calls which go through gcd_1, but
we should regain some of that when callers avoid gcd_1 for things which
actually need gcd_11. (There will be some slowdown for things like
3-limb with 1-limb gcd, which will need another call.)
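
To make the layering concrete, here is a rough sketch of how a gcd_1-style
routine could sit on top of a gcd_11-style one. This is not GMP's code: the
names gcd_11_sketch and gcd_1_sketch, the 32-bit limb type, and the plain
division loop are all assumptions chosen for illustration. It assumes the
gcd_11 primitive takes two odd single-limb operands, so the wrapper first
reduces the multi-limb operand modulo the single limb and strips powers of
two. That reduction followed by the inner call is the "another call" cost in
the 3-limb with 1-limb case.

```c
#include <stddef.h>
#include <stdint.h>

/* Count trailing zero bits of a nonzero limb (portable loop). */
static unsigned ctz_limb(uint32_t x)
{
    unsigned c = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        c++;
    }
    return c;
}

/* Hypothetical stand-in for a gcd_11 primitive: both operands must be
   odd (and hence nonzero). Plain binary gcd on single limbs. */
static uint32_t gcd_11_sketch(uint32_t u, uint32_t v)
{
    while (u != v) {
        if (u > v) {
            u -= v;               /* odd - odd is even and nonzero */
            u >>= ctz_limb(u);    /* make it odd again */
        } else {
            v -= u;
            v >>= ctz_limb(v);
        }
    }
    return u;
}

/* Hypothetical gcd_1-style wrapper: gcd of the n-limb number {up, n}
   (least significant limb first, n >= 1) with a nonzero single limb v. */
static uint32_t gcd_1_sketch(const uint32_t *up, size_t n, uint32_t v)
{
    uint64_t r = 0;
    size_t i;
    unsigned shift;
    uint32_t a;

    /* Step 1: reduce the multi-limb operand modulo v. */
    for (i = n; i-- > 0; )
        r = ((r << 32) | up[i]) % v;

    if (r == 0)
        return v;                 /* v divides the multi-limb operand */

    /* Step 2: pull out the common power of two, make both odd. */
    a = (uint32_t) r;
    shift = ctz_limb(a | v);
    a >>= ctz_limb(a);
    v >>= ctz_limb(v);

    /* Step 3: the odd-operand primitive does the real work. */
    return gcd_11_sketch(a, v) << shift;
}
```

A caller that already knows it has two odd single-limb values can skip
steps 1 and 2 entirely and call the gcd_11-style routine directly, which is
where the regained speed would come from.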
--
Torbjörn
Please encrypt, key id 0xC8601622