# sec_invert performance

Niels Möller nisse at lysator.liu.se
Tue Feb 4 09:27:58 UTC 2014

```I'm confused by the pretty bad performance of sec_invert. Try e.g.,

./tune/speed -s 1-200 -P invert mpn_gcdext mpn_sec_invert mpz_powm_sec

Say we have inputs of n bits. Then my understadning of these algorithms
is:

* mpn_gcdext, for sizes of interest, is Lehmer's algorithm, O(n^2), with
a reasonably small constant since it reduces roughly one limb per
iteration.

* mpn_sec_powm does n squarings + redc (and then some n/window_size
multiplies and table lookups). And the squarings and redc are O(n^2),
so total running time should be O(n^3).

* mpn_sec_invert has a loop of 2n iterations. Each iterations makes 9
calls to various O(n) functions, like mpn_cnd_add_n, mpn_rshift, etc.
So it ought to be O(n^2), and a constant factor slower than
mpn_gcdext.

But when I run speed, *both* ratios mpn_invert / mpn_gcdext and
mpz_sec_powm / mpn_gcdext keep growing with n. So mpn_invert is not a
constant factor slower than mpn_gcdext, as I would have expected, and
it's significantly slower than mpn_sec_powm for largish sizes like 100
limbs. So it looks like it's O(n^3), not O(n^2).

Any idea what's going on?

And then, there's also a sharp step at n = 113 or 114 (maybe that's an
artifact of the testmachine, but it's reproducible here).

\$ ./speed -s 100-120  -c -r mpn_gcdext mpn_sec_invert mpz_powm_sec
overhead 5.42 cycles, precision 10000 units of 6.25e-10 secs, CPU freq 1600.00 MHz
mpn_gcdext mpn_sec_invert  mpz_powm_sec
100        #286753.00       97.9985       68.7822
101        #290981.00       96.3812       69.3512
102        #297710.00       94.5395       68.9482
103        #301772.00       94.9747       69.6444
104        #308297.00       94.8706       69.0619
105        #310131.00       96.1709       70.7960
106        #312842.00       96.6907       70.8039
107        #321814.00       96.2753       70.4176
108        #313583.00      100.2901       73.2001
109        #321223.00      100.3469       72.9247
110        #328283.00       99.1086       72.5408
111        #333812.00       99.4312       72.9408
112        #335619.00      100.7185       73.4388
113        #343734.00       46.7804       73.1944
114        #352574.00       46.8327       72.9897
115        #356299.00       46.9179       73.3826
116        #366523.00       46.0239       72.2455
117        #371422.00       46.5541       72.7283
118        #367404.00       50.3468       74.7026
119        #374872.00       48.6011       74.7945
120        #376131.00       48.3491       75.4658

Regards,
/Niels

--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.

```