udiv_qr_3by2 vs divappr

Niels Möller nisse at lysator.liu.se
Sat Apr 28 11:27:36 UTC 2018

```It's fun looking at clever things you did a decade ago, and try to
improve...

I just realized that our 3/2 division, udiv_qr_3by2, work horse for the
linear work of schoolbok division, has one more input than it needs.

We would get a useful quotient if we always passed GMP_NUMB_MAX for the
n0 argument. Which wouldn't provide any useful gain by itself (and
require adjustments where the remainder is used).

But it shows that we could have a function with inputs n2, n1, d1, d0,
computing a suitable quotient approximation. Fixing n0 as above would
mean that we always return

q = f(n2, n1, d1, d0) = floor ({n2, n1, B-1} / {d1, d0})

but we could allow slightly more freedom in what we return.

Inputs are a two limb normlized divisor, {d1, d0}, d1 >= B/2, and two
numerator words, {n2, n1}, which must be less than {d1, d0}. I think
we'd want the same precomputed inverse as for 3/2.

It computes an single word approximate quotient q, roughly B {n2,n1} / {d1,
d0}, which, when applied to larger numbers as part of schoolbook
division, is never too small, at most one too large, and highly likely to
be correct.

If convenient, the function could also return a single word remainder

r = {n1,n0} - q d1 - mulhi(q, d0),

I think such an r would have to be in the range 0 <= r <= d1, with some
subteties on when r == d1 is permissible. We might also want to return
mullo(q, d0), if that's easily available.

Potential I see:

* It might be possible to implement this cheaper than udiv_qr_3by2,
omitting most of the book-keeping and carry propagation involving the
low remainder word r0.

* One limb less to extract and handle specially in the schoolbook
division loops. It would be particularly nice if we try to write a
schoolbook division that doesn't require arguments to be normalized up
front, but instead just normalizes top limbs on the fly.

Regards,
/Niels

--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.

```