Division call in mpn_gcd

Fri Feb 24 14:51:00 CET 2012

nisse at lysator.liu.se (Niels Möller) writes:

  Torbjorn Granlund <tg at gmplib.org> writes:

  > Except that redc does not accept independent sizes.  Therefore we need
  > to use bdiv_qr.

  I think it should be easy to generalize at least redc_1 and redc_2 to
  accept independent sizes. Might require some more work for redc_n, which
  would need some kind of block-wise processing.

For some reason I have forgotten, we decided to stick to their current
definitions.  I suppose we'd basically just need to modify the outer
loop count, and perhaps handle carry-out from addmul_N differently.

  > 4. Your point 3.  Which thresholds are you talking about?

    REDC_1_TO_REDC_2_THRESHOLD
    REDC_2_TO_REDC_N_THRESHOLD

I'd say REDC_1_TO_REDC_2_THRESHOLD is more-or-less a plain comparison if
the speed of addmul_1 vs addmul_2, and the inversion costs.  They are
measured for one size operand, using two size operands ought to mean
that sqrt(un*vn) should be compared to REDC_1_TO_REDC_2_THRESHOLD.
(I.e., un*vn should be compared o REDC_1_TO_REDC_2_THRESHOLD^2.)

This reasoning disregards the constant term of the speed of an addmul_1
or addmul_2 invocation.

  Using bdiv_qr is surely an improvement, but I think we really ought to
  have some division function which doesn't require storage for the
  unwanted quotient. Should that function be bdiv_r, or redc, or even
  div_r?

I suppose the gcd functions are quite tolerant, at least for an initial
reduction like this one.

  IIRC, bdiv and redc have slightly different notions on what the
  "remainder" is, but I imagine either variant would fine for the gcd
  reduction.

Perhaps this is the reason for keeping redc separate?

  When un >> vn, gcd(u, v) shouldn't need O(un) scratch space (btw, this
  calls for an initial reduction also in mpz_gcd).

Makes sense.

  There are a couple of other divisions in the gcd code where the quotient
  is similarly unwanted: The initial division in mpn_gcdext,

Really?  Doesn't that quotient affect the cofactors?

I don't think we need to finalise redc vs (b)div_r before we modify the
gcd code to use hensel division; If tdiv_qr is good enough now, bdiv_qr
ought to be good enough to be worth as a separate improvement.  Sure, it
is cleaner to keep the quotient out-of-the-way (and it will very
slightly simplify the scratch space allocations).

-- 
Torbjörn