Possible gcd/hgcd/gcdext improvements
Niels Möller
nisse at lysator.liu.se
Thu Oct 6 12:22:13 CEST 2011
nisse at lysator.liu.se (Niels Möller) writes:
> Seems to work fine, even if I don't yet fully understand the pattern in
> which n (current size) and s (current target size for the stop
> condition) are updated, though.
Now I've done a little benchmarking (on intel x86_64). For very large
inputs (which are not very relevant), the speedup over hgcd_lehmer is
29%. For comparison, the speedup of
mpn_sbpi1_divappr_q over mpn_sbpi1_div_qr is roughly 100%.
Hmm. I guess it works like this: For hgcd_lehmer, roughly one half of
the work is updating the matrix elements, and the remaining half of the
work is the reductions of the current remainders. If hgcd_appr reduces
the latter part to half, while not affecting the first part, the net
saving is 1/4 of the work, or 33% speedup.
In the more interesting range of 100-500 limbs, speedup is 10%-20% (I
think the linear term, from all the calls to hgcd2, is significantly
larger than for sbpi1_div*).
As you may remember, hgcd_lehmer and hgcd_appr don't compute exactly the
same thing; hgcd_lehmer returns a matrix *and* the corresponding reduced
numbers, while hgcd_appr returns the matrix only.
So to do a fair comparison to hgcd_lehmer, one should compare
hgcd_lehmer to hgcd_appr + the multiplications needed to recover the
reduced numbers. The naive way to do this needs 4 multiplications of
size n x n/4, and since the time needed for one n x n multiplication is
about 10% or less of the time for hgcd_lehmer in this range, I think
this could give a small speedup.
And then one there are a couple of different tricks which could be used
to recover the reduced numbers more efficiently: Using wraparound
arithmetic to take advantqage of the known cancellation. And idea that
has been mentioned is to split numbers in pieces to get a product of two
2x2 matrices, which can then be computed with Strassen multiplication
(that trick might not mix well with wraparound arithmetic, though).
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel
mailing list