New thresholds in table

Niels Möller nisse at
Sun Nov 13 15:58:09 CET 2011

Torbjorn Granlund <tg at> writes:

> <>.

Speaking of thresholds: It seems HGCD_THRESHOLD is pretty low now, in
the range 50-150 limbs for most listed machines. And HGCD_APPR_THRESHOLD
a bit smaller still, on several machines. I remember HGCD_THRESHOLD as
typically being around 200, but I may be wrong.

Now, hgcd_appr seems to be some 10% faster than hgcd in the range from a
few dozen limbs up to 1000. I had expected the threshold for hgcd_reduce
to be below 1000 limbs.

The hgcd_reduce threshold compares:

 1. hgcd(n/2) + 4 mul(n/2, n/4)

 2. hgcd_appr(n/2) + 4 mulmod_bnm1(3n/4, n/4)

If, for low sizes, one tries mullo instead of mpn_mulmod_bnm1, then one
would get instead

 2. hgcd_appr(n/2) + 4 mul(n/2, n/4) + 4 mullo(n/4)

so if hgcd_appr saves 10%, then method 2 wins when 4 mullo(n/4) < 0.1
hgcd(n/2). Also, it seems that in this range one mul(n) is roughly 10%
of hgcd, so method 2 should win when 4 mullo(n/4) < mul(n/2). Which
makes me think it could make sense in school book range and very early
Karatsuba/Mulders range only, which isn't very interesting.

So I wonder if it's possible to get the threshold down. Improvements to
hgcd_appr would help, naturally. Using mulmid for the first recursive
call, where we have cancellation *and* can discard low limbs, might give
some speedup for small sizes.


Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.

More information about the gmp-devel mailing list