udiv_qr_3by2 vs divappr

Sat Nov 17 10:06:40 UTC 2018

(Commenting a a few months old message).

paul zimmermann <Paul.Zimmermann at inria.fr> writes:

> scaling the dividend is not mandatory. You can first reduce the upper n-1
> limbs of the original dividend using the scaled divisor, then reduce the
> last limb using the original divisor. This is especially interesting if only
> the remainder is needed.

It would be nice to avoid having to clobber all of the dividend. Not
scaling should help a bit.

> In fact, the multiplier has one limb + 1 bit, if the divisor is normalized,
> so k*B = \beta^{n+1} + B' with B' < \beta^n (see Algorithm 1.7 in [1]).

And there are some other choices for B'. The given algorithm gives
with 0 <= B' < B (possible with a off-by-one error on my side).

One could also choose B < 0: or -B <= B' < 0. That would leet us use
addmul_1 rather than submul_1 in the loop, and k would be the same thing
as a (n+1)/n reciprocal.

I guess one could also arrange |B'| <= B/2 to get one bit more of
precision (or |B'| < B with a one bit smaller k).

> [1] https://members.loria.fr/PZimmermann/mca/mca-cup-0.5.9.pdf

To summarize my understanding of this algorithm: We need one O(n)
multiply at the start and one at the end (steps 2 and 11 in algorithm
1.7).

Then we have to compute one reciprocal of roughly one limb (step 1), and
roughly one quotients limb (step 10). These will be two limbs in the
un-normalized case. The rest is a plain loop with submul_1 and an
unlikely adjustment.

In the other schoolbook division algorithms, we do quotient
approximation in each step basically like

  U - (U k / \beta^n) D
       `candidate q'

Svoboda rearranges this as

  U - (U/\beta^n) (k D)

Which might be a win, because the needed adjustment steps, omitted in
the first equation above, makes the n quotient approximations more
expensive than the upfront multiplication (k D).

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.