New thresholds in table
nisse at lysator.liu.se
Sun Nov 13 15:58:09 CET 2011
Torbjorn Granlund <tg at gmplib.org> writes:
> I have added HGCD_APPR_THRESHOLD and HGCD_REDUCE_THRESHOLD to
Speaking of thresholds: It seems HGCD_THRESHOLD is pretty low now, in
the range 50-150 limbs for most listed machines. And HGCD_APPR_THRESHOLD
a bit smaller still, on several machines. I remember HGCD_THRESHOLD as
typically being around 200, but I may be wrong.
Now, hgcd_appr seems to be some 10% faster than hgcd in the range from a
few dozen limbs up to 1000. I had expected the threshold for hgcd_reduce
to be below 1000 limbs.
The hgcd_reduce threshold compares:
1. hgcd(n/2) + 4 mul(n/2, n/4)
2. hgcd_appr(n/2) + 4 mulmod_bnm1(3n/4, n/4)
If, for low sizes, one tries mullo instead of mpn_mulmod_bnm1, then one
would get instead
2. hgcd_appr(n/2) + 4 mul(n/2, n/4) + 4 mullo(n/4)
so if hgcd_appr saves 10%, then method 2 wins when 4 mullo(n/4) < 0.1
hgcd(n/2). Also, it seems that in this range one mul(n) is roughly 10%
of hgcd, so method 2 should win when 4 mullo(n/4) < mul(n/2). Which
makes me think it could make sense in school book range and very early
Karatsuba/Mulders range only, which isn't very interesting.
So I wonder if it's possible to get the threshold down. Improvements to
hgcd_appr would help, naturally. Using mulmid for the first recursive
call, where we have cancellation *and* can discard low limbs, might give
some speedup for small sizes.
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel