bdiv vs redc
tg at gmplib.org
Thu Jun 28 14:26:51 CEST 2012
nisse at lysator.liu.se (Niels Möller) writes:
> A comment says:
> /* FIXME: Add ASSERTs for allowable overlapping; i.e., that qp = np is OK,
> but some over N/Q overlaps will not work. */
> I think we changed this during last year. Putting the quotient in that
> place is a workaround for callers in the absence lf an _r function variant.
> Are we talking about different functions?
I'm looking at mpn_sbpi1_bdiv_qr. I suspect that comment doesn't take
the current implementation into account. The typical loop (for the case
qn <= dn) uses these stores:
np[i] = mpn_addmul_1 (np + i, dp, dn, q);
qp[i] = ~q;
If q were to be stored at the low end of the n input, np == qp, those
two stores collide.
Maybe, but we talked about this and after the discussion I checked in a
change described like this:
2011-12-10 Torbjorn Granlund <tege at gmplib.org>
* mpn/generic/sbpi1_bdiv_q.c: Delay quotient limb stores in order to
allow quotient and dividend to completely overlap.
* mpn/generic/sbpi1_bdiv_qr.c: Likewise.
Perhaps it doesn't work, or does not always work (depending on relative
operand sizes). I cannot remember what usage needed this change, it
might have been the new binomomial code.
For redc, tradition mandates the low end. But maybe callers could
accept the high end without additional overhead? That needs some
investigation; if that can work out it will simplify things.
High end seems more "natural" for this operation, and for the calls from
within dc_bdiv and mu_bdiv, I would be surprised if using anything but
the high end doesn't cause trouble. Using the low end as alien in the
same way as having euclidean division place the remainder at the high
Don't exaggerate...! And what is "natural" is less important what makes
things smooth, we're talking mpn after all!
More information about the gmp-devel