A perfect power, and then?

Tue Oct 30 11:30:09 CET 2012

nisse at lysator.liu.se (Niels Möller) writes:

  By the way, I wonder if there are some circumstanses when

    r <-- a^{1/k}  (mod 2^n)

  is more effiently computed as

    k' <-- k^{-1} (mod 2^{n-1})  [binvert]
    r <-- a^{k'} (mod 2^n)       [powlo]

  We get a typically cheaper newton iteration (for binvert), but a
  typically much larger exponent for powlo. I doubt it's useful for large
  n, since then k' is *much* larger than k. But maybe for some types of
  small operands.

This is going to help for large enough k and small enough n...  Will
that occur?

  > I am afraid I don't appreciate the difference.

  Say that, at some point, we have n limbs of precision. Then one
  iteration of kth root (k odd) gives us precisely 2n limbs. While one
  iteration of bsqrt gives us 2n limbs minus one bit (or is it minus 2
  bits? I don't remember now). To make use of the valid bits in that top
  limb (rather than just discarding them and saying that we now have 2n-1
  valid limbs, which is particularly bad for n == 1...), we need some
  book-keeping in units of bits.

I suppose that in practice for both bsqrt and broot, we'll get a
progression of limb sizes that is not optimial (except for root when
n=2^t and for sqrt when n=2^t+1).

But for root is might be the case that iterations are much more
expensive since they are something like O(f(n)log(k)) for computing
a^(1/k) mod n.  Even improving an initial 4-bit approximation to say 20
bits will be significant work.  We don't want to go to 64 bits
internally just because we don't know the actually needed precision.  It
might make more sense to have a coarser size in bsqrt, since its
iteration to b bits of precision will be faster.

Does this make sense?

-- 
Torbjörn