Better b^e mod m

Torbjorn Granlund tg at gmplib.org
Tue Oct 16 11:43:12 CEST 2012

Dear fellow GMP developers!

Sorry for the long post.  It brings up an important issue, so please
take some time to read it.

We have long had a good implementation of mpz_powm, where the base,
exponent, and modulus arguments are all mpz_t's (at least since April
2000 when Paul Zimmermann rewrote it).

The function mpz_powm_ui, which takes an 'unsigned long' exponent
argument, is not as great.  Will Galway recently reminded us about its
shortcomings in a post to gmp-bugs.  Now I added some text and a diagram
to gmplib.org/devel/.

Summary: mpz_powm_ui is slower than mpz_powm when the exponent > 20
(approximately).  Furthermore, mpz_powm performs horribly for tiny
exponents like 2 and 3, but one may argue that such exponents will
rarely be used there.

The reason for mpz_powm_ui's slowness is that it doesn't use any of
the fancy techniques of mpz_powm (mainly by means of mpn_powm).

Disregarding the specific functions, I see a few operand combinations
that we might want to address:

(1) Tiny exponents such as 2, 3.
(2) Bases that are much smaller than the mod argument.
(3) Exponents that are huge compared to the mod argument.
(4) Plain same-size arguments.

Any more?

I will discuss the 4 cases in turn.

For e = 2 we should just multiply and perform an Euclidean
division.

For e = 3 we will probably do (b^2 mod m) * b mod m, which means we have
multiply invariance for b and division invariance for m.  We could
"transform" b once, and do whatever precomputation is useful for m.
Again, I doubt Montgomery repr will help; we should stay Euclidean.

For e = 4, we actually have less invariance, since (b^2 mod m)^2 mod m
has no multiply invariance, only division invariance.

For some e, Montgomery will start to help, but it will not be
perfectly linear in e.  We might want to determine this per e under
some threshold.

Similarly, from some e tabling powers of b will help.  This too will
depend on individual e values; e = 2^k will never want tabling since
we will compute these powers with squarings only.

There will be e values when neither Montgomery nor tabling will help
(e.g. e = 2), there will be e values where Montgomery will help by
tabling will not (e.g. e = 2^8), there will be e values where
Montgomery and tables will help (e.g. any random large e).  I doubt
there will be any e where Montgomery will not help but tabling will.

Lets go to case (2).  I sketched an algorithm at
https://gmplib.org/devel/ for this.  The (obvious) idea is to start
doing exponentiation without any mod operations, and thus certainly
without Montgomery repr.  This saves not only due to the
non-reductions, but also since the multiplies by small b in plain repr
are cheap.  When the intermediate result operand becomes >= m, we
decide whether to, for the remainder of the exponentiation, go into
Montgomery repr to allow for Hensel division or stay Euclidean.  We
should do the latter only if we're basically done.

For (3) we could reduce e mod |m*|, the order of the underlying
multiplicative group...  We should not walk down that path, I think,
even if we could of course factor small m quickly.

Case (4) is a case mpz_powm handles reasonably well today.  But we
could speed it up somewhat by saving "transforms" if powers of b.

--
Torbjörn