bdiv vs redc

Thu Jun 28 14:26:51 CEST 2012

nisse at lysator.liu.se (Niels Möller) writes:

  > A comment says:
  >   /* FIXME: Add ASSERTs for allowable overlapping; i.e., that qp = np is OK,
  >      but some over N/Q overlaps will not work.  */
  > I think we changed this during last year.  Putting the quotient in that
  > place is a workaround for callers in the absence lf an _r function variant.
  >
  > Are we talking about different functions?

  I'm looking at mpn_sbpi1_bdiv_qr. I suspect that comment doesn't take
  the current implementation into account. The typical loop (for the case
  qn <= dn) uses these stores:

        np[i] = mpn_addmul_1 (np + i, dp, dn, q);
        qp[i] = ~q;

  If q were to be stored at the low end of the n input, np == qp, those
  two stores collide.

Maybe, but we talked about this and after the discussion I checked in a
change described like this:

2011-12-10  Torbjorn Granlund  <tege at gmplib.org>

	* mpn/generic/sbpi1_bdiv_q.c: Delay quotient limb stores in order to
	allow quotient and dividend to completely overlap.
	* mpn/generic/sbpi1_bdiv_qr.c: Likewise.

Perhaps it doesn't work, or does not always work (depending on relative
operand sizes).  I cannot remember what usage needed this change, it
might have been the new binomomial code.

  For redc, tradition mandates the low end. But maybe callers could
  accept the high end without additional overhead? That needs some
  investigation; if that can work out it will simplify things.

Agreed.

  High end seems more "natural" for this operation, and for the calls from
  within dc_bdiv and mu_bdiv, I would be surprised if using anything but
  the high end doesn't cause trouble. Using the low end as alien in the
  same way as having euclidean division place the remainder at the high
  end.

Don't exaggerate...!  And what is "natural" is less important what makes
things smooth, we're talking mpn after all!

-- 
Torbjörn