# A perfect power, and then?

Torbjorn Granlund tg at gmplib.org
Tue Oct 30 11:30:09 CET 2012

```nisse at lysator.liu.se (Niels Möller) writes:

By the way, I wonder if there are some circumstanses when

r <-- a^{1/k}  (mod 2^n)

is more effiently computed as

k' <-- k^{-1} (mod 2^{n-1})  [binvert]
r <-- a^{k'} (mod 2^n)       [powlo]

We get a typically cheaper newton iteration (for binvert), but a
typically much larger exponent for powlo. I doubt it's useful for large
n, since then k' is *much* larger than k. But maybe for some types of
small operands.

This is going to help for large enough k and small enough n...  Will
that occur?

> I am afraid I don't appreciate the difference.

Say that, at some point, we have n limbs of precision. Then one
iteration of kth root (k odd) gives us precisely 2n limbs. While one
iteration of bsqrt gives us 2n limbs minus one bit (or is it minus 2
bits? I don't remember now). To make use of the valid bits in that top
limb (rather than just discarding them and saying that we now have 2n-1
valid limbs, which is particularly bad for n == 1...), we need some
book-keeping in units of bits.

I suppose that in practice for both bsqrt and broot, we'll get a
progression of limb sizes that is not optimial (except for root when
n=2^t and for sqrt when n=2^t+1).

But for root is might be the case that iterations are much more
expensive since they are something like O(f(n)log(k)) for computing
a^(1/k) mod n.  Even improving an initial 4-bit approximation to say 20
bits will be significant work.  We don't want to go to 64 bits
internally just because we don't know the actually needed precision.  It
might make more sense to have a coarser size in bsqrt, since its
iteration to b bits of precision will be faster.

Does this make sense?

--
Torbjörn
```