bodrato at mail.dm.unipi.it bodrato at mail.dm.unipi.it
Mon Dec 7 11:54:59 CET 2009

```Thanks Niels,

> I haven't looked closely neither at your code or at left-to-right
> inversion in general. But maybe you can get some ideas from Alg 3.5
> ApproximateReciprocal (page 121 in version 0.4) in Brent's and
> Zimmermann's MCA book?

Well... I used exactly that book for inspiration :-) but I had a glance to
4.2.2 Newton's Method for Reciprocals (page 146 of the same version),
where the word "approximate" does not appear :-)

Anyway, I'll try to adhere strictly to the Algo 3.5, to see if the
"possibly off by one approximation" can be removed.

> 1.  T = X_k * A_k
> 2.  E = 1 - T
> 3.  Small number of O(n) adjustments in the case E < 0

Now, I handle the two cases E = 1-T or E=T-1, to have E >=0 always...

> 4.  X_{k+1} = X_k + X_k * E

... so that some time I have X_k - X_k * E ... I should check if results
are correct when E=0.

> Here A_k is the most significant part of the input, of size appropriate
> for the iteration. A is assumed to be "strictly" normalized with high
> limb >= B/2, to relax that I guess one would need one more limb in A.

One more limb in the result, yes... because now a virtual limb containing
one is assumed, but inverses of numbers smaller than one half needs
inverses bigger than one-point-<whatever> (the function actually computes
<whatever>).

> The first multiplication (line 1) is an obvious target for wraparound
> logic. The second multiplication (line 4) would be some kind of mulhi,
> if I understand this correctly, where some of the lower limbs of the

Yes, but I did not try this kind of optimisations, yet.

> This algorithm looks quite clean and simple to me, and for large
> inverses I don't think it's a serious drawback to have to do a cheap
> adjustment step for each Newton iteration.

The "adjustment step" I'm using now... is needed only for the last
iteration (the intermediate ones are adjusted by the next iteration), but
it is _not_ cheap.

Regards,
Marco

--
http://bodrato.it/software/

```