# udiv_qr_3by2 vs divappr

Torbjörn Granlund tg at gmplib.org
Thu Sep 20 11:57:11 UTC 2018

```tg at gmplib.org (Torbjörn Granlund) writes:

I suppose a "naive" implementation scales the full dividend and the full
divisor by a factor which fits in a limb.  One could in theory save some
time by scaling just their O(log(Q)) most significant limb.

That paragraph was supposed to read:

I suppose a "naive" Svoboda implementation scales the full dividend and
the full divisor by a factor which fits in a limb.  One could in theory
save some time by scaling just their O(log(Q)) most significant limbs.

How likely does Svoboda overesimate a quotient limb?

Well, one could of course adjust down it in O(1) time by involving
another set of high limbs.

Here is a somewhat less broken version of my algorithm:

udiv_qr_3by2 (qnext, n1, n0, np, np[-1], np[-2], d1, d0, dinv);
for (...)
{
// Apply q to most significant partial remainder limbs.
// Think: Is size = 2 enough or do we need size = 3?
// Optimisation: Results from last divappr should already have done most of
// the computation here; we presumably need just one more mul+subtract.
cy = submul_1 (np - 2, dp + dn - 2, qnext, 2);
ASSERT (np >= cy);	/* FIXME: Possibly handle spillage */
np -= cy;

q = qnext;
// Now, with the most significant few limbs of the next partial remainder
// computed to some accuracy, compute q for the next iteration.  This will
// overestimate q somewhat more often than with the old algorithms.
udiv_qr_3by2 (qnext, n1, n0, np, np[-1], np[-2], d1, d0, dinv);

// Apply q to all but the most significant partial remainder limbs.
cy = submul_1 (np - dn, dp, q, dn - 2);

if (UNLIKELY (cy > np[-2]))
{
np[-2] -= cy;
mpn_sub_1 (np - 1, np - 1, 2, 1);
// q was too large, qnext is really off.  Recompute it!
// (We could perhaps recompute it more cleverly.)
ASSERT_CARRY(mpn_add_n (np - dn, np - dn, dp, dn));
udiv_qr_3by2 (qnext, n1, n0, np, np[-1], np[-2], d1, d0, dinv);
}
else
np[-2] -= cy;

np--;
}

--
Torbjörn