udiv_qr_3by2 vs divappr
Niels Möller
nisse at lysator.liu.se
Wed May 9 13:43:59 UTC 2018
paul zimmermann <Paul.Zimmermann at inria.fr> writes:
> if you use udiv_qr_3by2 to divide (say) a 6-limb number by a 3-limb number,
> then in the schoolbook loop you might have {n2, n1} = {d1, d0}, in the case
> where n0 is smaller than the next divisor limb d[-1].
>
> If it is the responsibility of the caller to ensure {n2, n1} < {d1, d0},
> then every caller of udiv_qr_3by2 must deal with that special case.
I think the reason udiv_qr_3by2 and udiv_qrnnd_preinv requires n < d is
that returning GMP_NUMB_MAX for n == d requires one additional check,
and some callers (in particular mpn_div_qr_1 and mpn_div_qr_2) don't
need that.
Tradeoffs may be different for divappr_2.
> In my opinion, it would be simpler to allow {n2, n1} = {d1, d0}, and
> return q=GMP_NUMB_MAX in that case.
I've tried to analyse the problem of divappr_2 sometimes overflowing q
and trying to return GMP_NUMB_MAX + 1. To avoid that, it turns out we
need a check
if ({n_2, n_1} >= {d_1, d_0} - 2)
q = GMP_NUMB_MAX;
(and that quotient is still approximate, so we can't omit the schoolbook
division adjustment step). May still want to leave that check outside of
divappr_2, to take the computation of {d_1, d_0} - 2 out of the loop. My
current version of the schoolbook division code does something like
d1m2 = d_1 - (d0 < 2);
for (...)
{
if (UNLIKELY (n2 >= d1m2) && (n2 > d1m2 || (n1 >= d0 - 2)))
q = GMP_NUMB_MAX;
else
q = divappr_2(...);
submul_1 + adjustment
}
Left to do is strict analysis of the case d0 == 0, the code seems to
work fine without any special case for that, even the computation of r1
(based on the assumption that d0 > 0) is off by one in this case.
Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel
mailing list