udiv_qr_3by2 vs divappr

Thu Sep 20 11:57:11 UTC 2018

tg at gmplib.org (Torbjörn Granlund) writes:

  I suppose a "naive" implementation scales the full dividend and the full
  divisor by a factor which fits in a limb.  One could in theory save some
  time by scaling just their O(log(Q)) most significant limb.

That paragraph was supposed to read:

I suppose a "naive" Svoboda implementation scales the full dividend and
the full divisor by a factor which fits in a limb.  One could in theory
save some time by scaling just their O(log(Q)) most significant limbs.

  How likely does Svoboda overesimate a quotient limb?

Well, one could of course adjust down it in O(1) time by involving
another set of high limbs.

Here is a somewhat less broken version of my algorithm:

udiv_qr_3by2 (qnext, n1, n0, np[0], np[-1], np[-2], d1, d0, dinv);
for (...)
  {
    // Apply q to most significant partial remainder limbs.
    // Think: Is size = 2 enough or do we need size = 3?
    // Optimisation: Results from last divappr should already have done most of
    // the computation here; we presumably need just one more mul+subtract.
    cy = submul_1 (np - 2, dp + dn - 2, qnext, 2);
    ASSERT (np[0] >= cy);	/* FIXME: Possibly handle spillage */
    np[0] -= cy;

    q = qnext;
    // Now, with the most significant few limbs of the next partial remainder
    // computed to some accuracy, compute q for the next iteration.  This will
    // overestimate q somewhat more often than with the old algorithms.
    udiv_qr_3by2 (qnext, n1, n0, np[0], np[-1], np[-2], d1, d0, dinv);

    // Apply q to all but the most significant partial remainder limbs.
    cy = submul_1 (np - dn, dp, q, dn - 2);

    if (UNLIKELY (cy > np[-2]))
      {
        np[-2] -= cy;
	mpn_sub_1 (np - 1, np - 1, 2, 1);
	// q was too large, qnext is really off.  Recompute it!
	// (We could perhaps recompute it more cleverly.)
	ASSERT_CARRY(mpn_add_n (np - dn, np - dn, dp, dn));
	udiv_qr_3by2 (qnext, n1, n0, np[0], np[-1], np[-2], d1, d0, dinv);
      }
    else
      np[-2] -= cy;

    np--;
  }

-- 
Torbjörn
Please encrypt, key id 0xC8601622