hgcd1/2
Torbjörn Granlund
tg at gmplib.org
Tue Sep 3 14:33:26 UTC 2019
nisse at lysator.liu.se (Niels Möller) writes:
Here's a div2 function that does that. Together with the deleted div1,
it gives a 25% (!) speed up of gcd in the range 3-10 limbs, benchmarked
on my broadwell machine. And should do well on all machines with decent
small-quotient division.
Cool!
And broadwell (while not on my list) presumably have the same slowish
division as haswell and skylake, as it came between those chips.
Amd and ARM processors should get an even greater boost.
I am curious how these GCD implementations compare:
1. The gcd_33 attached for its only use case n = 3. It runs on Haswell
and newer and on Ryzen (but also on bd4 aka Excavator)
2. The present GNP code with your improved div1 and div2.
3. Marco's mpn_gcd_basecase.
It would be premature to decide if there is room for a mpn_gcd_basecase,
but it would be interesting to know in what ballpark performance is.
Only mpn_gcd_33 is close to optimal, perhaps a few percent could be
shaven off by swapping some instructions around. A table based
mpn_gcd_33 could be perhaps 50% faster, but that remains to be tested.
Marco's mpn_gcd_basecase code could also probably be improved, but I
have no estimate here. If nothing else, we could certainly use larger
tables therein.
If a fixed-size entity like mpn_gcd_33 outperforms anything else we can
come with, I see no reason not to keep going to mpn_gcd_44 etc. Before
you ask, we should not provide mpn_mul_33 or mpn_mul_44; gcd is a
special case as it runs for such a long time, so saving on bookkeeping
has a much bigger impact than for other elementary operations.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gcd_33.asm
Type: application/octet-stream
Size: 4211 bytes
Desc: not available
URL: <https://gmplib.org/list-archives/gmp-devel/attachments/20190903/ad261361/attachment.obj>
-------------- next part --------------
--
Torbjörn
Please encrypt, key id 0xC8601622
More information about the gmp-devel
mailing list