Division interfaces

Tue Mar 15 14:35:15 CET 2011

Torbjorn Granlund <tg at gmplib.org> writes:

> I looked briefly at your code.  Wouldn't it make sense to stay with 3/2
> division, using the 2 x GMP_NUMB_BITS most significant bits from the
> divisor, and the 3 x GMP_NUMB_BITS bits from the partial remainder?

> Thus, feed udiv_qr_3by2 with *exactly* the sequence of parameters it
> would get if we instead had pre-shifted the divisor and dividend.

It makes some sense. I tried that first, and I won't say that it's a bad
way to do it, but it seemed so complicated that I decided to try
something different to get it to work.

The problem is what to do when we have computed (using udiv_qr_3by2)
<n2,n1,n0> = q <d1, d0> + <r1, r0>, and next need to do

  cy = mpn_submul_1(np + ..., dp, dn-2, q);
  <r1, r0> -= some_shift_or_other(cy);

One of the limbs of the partial remainder was already included in part
in the subtraction leading to r0. Should that limb be included in the
submul_1? Then it gets messy since we end up doing some bits of the
partial remainder twice, and need to make sense of the carry limb from
mpn_submul_1.

This is the difficulty I tried to explain earlier, and maybe I'm still
not explaining it clearly. One solution might be to zero the partial
limb of n and d which are extracted into n2, n1, n0, d1, d0, so that
submul_1 doesn't see them. (But to modify d this way would be a gross
interface change)

Or one could exclude the straddling limb from the submul_1 call and
handle it manually. And I think that is more or less equivalent to the
current approach with udiv_qr_4by3.

BTW, if we stick to udiv_qr_4by3, it can most likely be optimized a bit
by using a very sligtly different inverse, taking the additional
(GMP_LIMB_BITS - shift count) bits of d into account.

Regards,
/Niels
-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.