Small-base powm

Tue Mar 25 12:58:12 UTC 2014

Torbjorn Granlund <tg at gmplib.org> writes:

> These tricks are valuable only for limited exponents; as soon as the
> exponent is large, the fraction of optimisable initil operatons will
> become negligible.

The cost of all squarings are about 2*M(n) * exponent size, and in the
case of a small base, there are some savings for the initial squarings.

I think using plain mul and euclidean reduction for the modular
multiplications by the small base will be a win for most exponent sizes.
It makes these operations O(n) each, or O(n * exponent size) total.

Compare to window exponentiation with (large) montgomery-representation,
there the total cost of the modular multiplications are 2*M(n) *
exponent size / k.

So for large n and small base, we have

   esize*(2*M(n) + O(n)), with small euclidean reductions 

   esize*2*M(n) (1+1/k) + precomputation time, for k-bits window

>   * One could handle powers of 2 specially, with shifts instead of
>     multiplies.
>   
> Perhaps then with a separate mpn function.

Makes sense.

>   * Maybe one could extend this to take advantage of addmul_2, if present.
>   
> Not sure to follow you here.  You mean to use mul_2 for the multply by b
> (or b^x for some x) and addmul_2 for some clever (non-principal)
> reduction?

Something like that: Allow multiplications by b or b powers of size two
limbs, and for the reduction, compute a two-limb "quotient" and apply
with mpn_addmul_2.

> I think our overhead for modexp ops with few-limb modulo is currently
> terrible.  But it is not going to be easy to fix this, without resorting
> to full assembly functions.

I think optimization of the small modulo case is mostly independent of
the small base case.

/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.