Sat, 9 Nov 2002 03:06:50 +0000
On Thursday 07 Nov 2002 1:01 pm, Karl Hasselstr=F6m wrote:
> On 2002-11-07 09:43:28 +0000, Jason Moxham wrote:
> > I see at
> > http://www.swox.com/gmp/projects.html
> > under Quotient only division that gmp doesn't yet have it. I can't
> > see how that method described would work , the quoitient is
> > uncertain in the lowest couple of bits ( on average) , so on average
> > the remainder needs to be calculated all except lowest couple of
> > bits, not much of saving.
What I should of said , is that I understand how and why it is correct , =
don't understand why it's faster.
> Given an approximate quotient q to floor(N/D), calculate an
> approximate remainder r =3D N - D*q. If 0 <=3D r < D, q and r are the
> correct quotient and remainder. Otherwise, make corrections in one of
> two ways:
> * While r < 0, let r =3D r - D and q =3D q + 1.
> * While r >=3D D, let r =3D r + D and q =3D q - 1.
> The result is correct, since r =3D N - D*q and 0 <=3D r < D, and if the
> approximate quotient was not too far off then the number of correction
> steps won't be high.
> Now, that algorithm gives quotient and remainder using two
One of the muls is D*q , and the other mul is ? ( the effective time to=20
calculate the quotient ?)
> But, if we don't need the remainder, we can calculate
> just the most significant limb of r. Almost always, this will be
> sufficient for the above algorithm, and in the cases where it's not
> enough -- when the most significant limb of r is the same as for D,
> for example, so that we can't tell for sure, if r < D or not, we can
> calculate all the limbs of r.
Yes , but how do you calculate this highest limb of r ? surely on average=
involves a lot of muls ?
> So in k cases of 2^32 (assuming 32-bit limbs), we still need two
> multiplications, otherwise we need just one, so on average we need 1 +
> k/2^32 multiplications. k is a rather small number -- if the
> approximate quotient has an error of at most 2, my guess is that k is
> something like 5 or 10. So the average time is reduced to about
> 1.000000001 multiplications (for some suitable number of zeros).