Small-base modexp

Wed Oct 2 13:29:38 UTC 2019

"Marco Bodrato" <bodrato at mail.dm.unipi.it> writes:

  I tried... the attempt is attached.

  I wrote the doubling-modulo step with 3 possible operations:
   if it can not overflow, simply lshift;
   otherwise choose between (x*2-m) and ((x-m)*2)

Do you choose that at each step?  That might be a cause for delay due to
branch miss.

  I had to write special code for size=1, because my first try was slower
  than the generic code. The attached code is faster than the current (when
  base=2)  for all sizes on my laptop.
  But on shell it is not!

Let me make sure I understand what you're comparing.

Are you comparing mpz_powm with base = 2 with your mew special code for
base = 2 for size = 1?

If your code does not win, I'd suggest that there is room for some
improvement.  :-)

  overhead 0.000000002 secs, precision 10000 units of 2.86e-10 secs, CPU
  freq 3500.08 MHz
               mpz_powm    mpz_powm.2    mpz_powm.3
  1         0.000004322       #0.8082        0.9884
  2         0.000007201        1.0723       #0.9836
  3         0.000013274       #0.9465        0.9928
  4         0.000020143       #0.9408        0.9822
  6         0.000033094       #0.8662        0.9964
  9         0.000062521       #0.8396        0.9964
  14        0.000137914       #0.8000        0.9996
  22        0.000317164       #0.7911        1.0061
  35        0.000779100       #0.7662        0.9771
  56        0.001738323       #0.7941        1.0011

  So, let's say that, asymptotically, «2^e mod m is "obviously" more
  efficient than general modular exponentiation». For small sizes it is not
  so easy to write specialised functions :-)

I'd expect to see significant speedup for slower functions like gcd,
powm and jacobi from specialisation.

Checking for particular operand values in general adds more overhead
than is possible to get back.  For for slower functions it is worth it.

-- 
Torbjörn
Please encrypt, key id 0xC8601622