udiv_qr_3by2 vs divappr

Wed May 9 13:43:59 UTC 2018

paul zimmermann <Paul.Zimmermann at inria.fr> writes:

> if you use udiv_qr_3by2 to divide (say) a 6-limb number by a 3-limb number,
> then in the schoolbook loop you might have {n2, n1} = {d1, d0}, in the case
> where n0 is smaller than the next divisor limb d[-1].
>
> If it is the responsibility of the caller to ensure {n2, n1} < {d1, d0},
> then every caller of udiv_qr_3by2 must deal with that special case.

I think the reason udiv_qr_3by2 and udiv_qrnnd_preinv requires n < d is
that returning GMP_NUMB_MAX for n == d requires one additional check,
and some callers (in particular mpn_div_qr_1 and mpn_div_qr_2) don't
need that.

Tradeoffs may be different for divappr_2.

> In my opinion, it would be simpler to allow {n2, n1} = {d1, d0}, and
> return q=GMP_NUMB_MAX in that case.

I've tried to analyse the problem of divappr_2 sometimes overflowing q
and trying to return GMP_NUMB_MAX + 1. To avoid that, it turns out we
need a check

  if ({n_2, n_1} >= {d_1, d_0} - 2)
    q = GMP_NUMB_MAX;

(and that quotient is still approximate, so we can't omit the schoolbook
division adjustment step). May still want to leave that check outside of
divappr_2, to take the computation of {d_1, d_0} - 2 out of the loop. My
current version of the schoolbook division code does something like

  d1m2 = d_1 - (d0 < 2);

  for (...) 
    {
       if (UNLIKELY (n2 >= d1m2) && (n2 > d1m2 || (n1 >= d0 - 2)))
         q = GMP_NUMB_MAX;
       else
         q = divappr_2(...);
       submul_1 + adjustment
    }

Left to do is strict analysis of the case d0 == 0, the code seems to
work fine without any special case for that, even the computation of r1
(based on the assumption that d0 > 0) is off by one in this case.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.