# Computing powers and their inverses at the same time

Niels Möller nisse at lysator.liu.se
Fri Sep 8 16:04:24 CEST 2006

```Hi all,

this may be wellknown, but I'd like to write it down before I forget
it.

In conversion from binary representation to base 10 (say), the basic
divide-and-conquer method computes powers 10^{2^k}, divides the input
number by the power corresponding to a suitably chosen k, and then
recurses on the quotient (giving the most significant digits) and the
remainder (giving the least significant digits).

In the division by 10^{2^k}, the (approximate) inverse of this power
is computed, to an accuracy of about the same size as the power
itself. The trick is that those inverses can be computed together with
the powers:

Assume that we have computed x_k = 10^{2^k}, of size n bits, and it's
inverse i_k ~ 1/x_k, also with an accuracy of n bits. I.e.,

i_k = (1+e) / x_k

where the relative error e is on the order of 2^{-n}.

Next, x_{k+1} = x_k^2 is computed by a squaring operation. Then the
square of the inverse, i'_{k+1} = i_k^2, is an approximation of
1/{x_{k+1}}, with a relative error of about the 2 e. This value can be
improved by one Newton iteration,

i_{k+1} = i'_{k+1} + i'_{k+1} (1 - x_{k+1} i'_{k+1})

which results in a relative error of about 4 e^2, i.e. an accuracy of
2n - 2 bits.

Do you think this is a useful optimization? As far as I can see, one
can save all newton iterations but the last (and admittedly the most
costly one) for the inversions at each size, which ought to give a
speedup of at least one or a few tens percent.

I imagine it's a little tricky to make it work with the fixpoint
representation of inverses, and keeping the accuracy under control,
but I'm not very familiar with the inversion code.

Regards,
/Niels
```