div_qr_1 interface

Sat Oct 26 18:51:31 CEST 2013

nisse at lysator.liu.se (Niels Möller) writes:

  A long time ago, we choose an interface for sbpi1_div_qr which does
  *not* store the most significant limb; instead it returns it. I think it
  was the intention that a new top-level mpn_div_qr should follow that
  convention, and not store the top limb. I don't remember if that was
  purely for esthetic reasons, or if it also simplified the divide&conquer
  division code.

  Having some of the special division case functions, like mpn_div_qr_1,
  storing the top quotient limb leads to unnecessary complications when
  writing the general mpn_div_qr; it forces mpn_div_qr for n/1 to do the
  top limb itself and call mpn_div_qr_1 with (n-1)/1, which seems like a
  case of bad impedance mismatch.

  But then, requirements for div_qr_1 don't necessarily translate to
  requirements for the primitive division loops we're discussing now.

  For the recent mpn_div_qr_1 and mpn_div_qr_1n_pi1, it turned out to work
  fine to have mpn_div_qr_1 for n/1 do the top quotient limb up front, and
  let mpn_div_qr_1n_pi1 generate n-1 quotient limbs (but n-1 *full* limbs,
  since we pass in a high u limb, reduced to uh < d).

  It's unclear to me if the best design for the unnormalized case is to
  follow mpn_div_qr_1n_pi1 or not. It get be clearer after writing a
  decent mpn_div_qr_1u_pi1.

Thanks, this all make sense to me now.  :-)

I think it will not be too hard to adapt the new pi2 code; moving the
current after-loop code to before the loop will probably make the
structure best.  At least the 1n variant.

The 1u case is a bit trickier.  Here, we will always have a final
quotient limb which depends on a remainder limb and a partial dividend
limb.

-- 
Torbjörn