bdiv vs redc
nisse at lysator.liu.se
Thu Jun 28 16:06:38 CEST 2012
Torbjorn Granlund <tg at gmplib.org> writes:
> 2011-12-10 Torbjorn Granlund <tege at gmplib.org>
> * mpn/generic/sbpi1_bdiv_q.c: Delay quotient limb stores in order to
> allow quotient and dividend to completely overlap.
> * mpn/generic/sbpi1_bdiv_qr.c: Likewise.
> Perhaps it doesn't work, or does not always work (depending on relative
> operand sizes).
I seriously doubt that it works for sbpi1_bdiv_qr. I think it does work
for sbpi1_bdiv_q. That function doesn't use the trick of delaying the
addition of the carry limbs, it just applies the carry limbs one at a
time with mpn_add_1, like
cy = mpn_addmul_1 (np, dp, dn, q);
mpn_add_1 (np + dn, np + dn, i, cy);
> I cannot remember what usage needed this change, it
> might have been the new binomomial code.
I did a quick search, without finding any users.
> Don't exaggerate...! And what is "natural" is less important what makes
> things smooth, we're talking mpn after all!
Say we start with n = 20 limbs,
(most significant limb to the left) and start with reducing by k = 4 < n
limbs, at the low end. If the recursive bdiv operation puts the
remainder in the high end, we get to
which is the complete partial remainder and some trailing zeros.
While if it is put at the low end, we get
where the partial remainder is split into two disjoint areas. Next step
would be to apply the quotient to the untouched "u"-part of the partial
remainder.Say we have a quotient of k = 4 limbs and the high part of d
is also 4 limbs, then the product v = q * dh should be added in as
We can split the add into mpn_add_n and mpn_add_nc (if mpn_add_nc is
available!), and move the upper part down in the process, to get
I imagine it could work without much additional overhead, but it seems a
bit messy and "non-smooth".
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel