update on mulmid-based inversion
David Harvey
dmharvey at cims.nyu.edu
Sat Jan 9 05:43:33 CET 2010
Hi,
Congratulations on the GMP 5 release. I had only been expecting GMP
4.4, this is a nice new year present :-)
I updated my mulmid-based inversion code for GMP 5, using the new
mpn_invertappr for the base case. Results are still looking good.
Table below shows cycles for inversion for 6 <= n <= 5000 on K8.
Summary: new code starts to win at around n = 20, it's roughly 10-15%
faster than mpn_invertappr up to about n = 2000, then falls behind
again. I'll post the code on my website soon.
| invert_appr | invert_mp | ratio
6 | 6.290e+02 | 7.910e+02 | 1.258
7 | 7.670e+02 | 8.020e+02 | 1.046
8 | 8.080e+02 | 9.050e+02 | 1.120
9 | 9.230e+02 | 9.600e+02 | 1.040
10 | 9.620e+02 | 1.101e+03 | 1.144
11 | 1.126e+03 | 1.141e+03 | 1.013
12 | 1.141e+03 | 1.285e+03 | 1.126
13 | 1.317e+03 | 1.339e+03 | 1.017
14 | 1.394e+03 | 1.494e+03 | 1.072
15 | 1.581e+03 | 1.530e+03 | 0.968
16 | 1.653e+03 | 1.698e+03 | 1.027
17 | 1.716e+03 | 1.730e+03 | 1.008
18 | 1.922e+03 | 1.893e+03 | 0.985
19 | 2.135e+03 | 1.973e+03 | 0.924
20 | 2.219e+03 | 2.138e+03 | 0.963
21 | 2.217e+03 | 2.200e+03 | 0.992
22 | 2.367e+03 | 2.372e+03 | 1.002
23 | 2.533e+03 | 2.475e+03 | 0.977
24 | 2.789e+03 | 2.650e+03 | 0.950
25 | 3.175e+03 | 2.742e+03 | 0.864
27 | 3.290e+03 | 2.960e+03 | 0.900
28 | 3.352e+03 | 3.207e+03 | 0.957
30 | 3.637e+03 | 3.502e+03 | 0.963
31 | 3.772e+03 | 3.613e+03 | 0.958
33 | 4.078e+03 | 3.855e+03 | 0.945
34 | 4.399e+03 | 4.069e+03 | 0.925
36 | 4.957e+03 | 4.451e+03 | 0.898
38 | 5.123e+03 | 4.752e+03 | 0.928
40 | 5.508e+03 | 5.100e+03 | 0.926
42 | 5.884e+03 | 5.320e+03 | 0.904
44 | 6.260e+03 | 5.710e+03 | 0.912
46 | 6.650e+03 | 6.081e+03 | 0.914
48 | 7.053e+03 | 6.566e+03 | 0.931
51 | 7.686e+03 | 6.996e+03 | 0.910
53 | 7.959e+03 | 7.508e+03 | 0.943
56 | 8.786e+03 | 8.150e+03 | 0.928
59 | 9.481e+03 | 8.852e+03 | 0.934
62 | 1.025e+04 | 9.343e+03 | 0.912
65 | 1.088e+04 | 1.029e+04 | 0.946
68 | 1.141e+04 | 1.092e+04 | 0.957
72 | 1.258e+04 | 1.207e+04 | 0.960
75 | 1.350e+04 | 1.324e+04 | 0.981
79 | 1.460e+04 | 1.457e+04 | 0.998
83 | 1.546e+04 | 1.560e+04 | 1.009
87 | 1.666e+04 | 1.672e+04 | 1.003
92 | 1.842e+04 | 1.772e+04 | 0.962
96 | 1.938e+04 | 1.930e+04 | 0.996
101 | 2.125e+04 | 2.068e+04 | 0.973
106 | 2.305e+04 | 2.164e+04 | 0.939
112 | 2.484e+04 | 2.415e+04 | 0.972
117 | 2.705e+04 | 2.569e+04 | 0.950
123 | 2.936e+04 | 2.791e+04 | 0.950
129 | 3.174e+04 | 3.058e+04 | 0.963
136 | 3.424e+04 | 3.264e+04 | 0.953
143 | 3.768e+04 | 3.578e+04 | 0.949
150 | 4.080e+04 | 3.852e+04 | 0.944
157 | 4.407e+04 | 4.239e+04 | 0.962
165 | 4.745e+04 | 4.588e+04 | 0.967
173 | 5.152e+04 | 4.916e+04 | 0.954
182 | 5.616e+04 | 5.231e+04 | 0.931
191 | 6.131e+04 | 5.777e+04 | 0.942
201 | 6.676e+04 | 6.196e+04 | 0.928
211 | 7.273e+04 | 6.518e+04 | 0.896
221 | 7.890e+04 | 7.157e+04 | 0.907
232 | 8.683e+04 | 7.495e+04 | 0.863
244 | 9.409e+04 | 8.118e+04 | 0.863
256 | 1.058e+05 | 8.790e+04 | 0.831
269 | 1.113e+05 | 9.594e+04 | 0.862
283 | 1.223e+05 | 1.031e+05 | 0.843
297 | 1.310e+05 | 1.123e+05 | 0.857
312 | 1.438e+05 | 1.219e+05 | 0.847
327 | 1.520e+05 | 1.307e+05 | 0.860
344 | 1.677e+05 | 1.408e+05 | 0.839
361 | 1.770e+05 | 1.502e+05 | 0.849
379 | 1.940e+05 | 1.637e+05 | 0.844
398 | 2.066e+05 | 1.768e+05 | 0.856
418 | 2.225e+05 | 1.829e+05 | 0.822
439 | 2.376e+05 | 2.022e+05 | 0.851
461 | 2.579e+05 | 2.229e+05 | 0.864
484 | 2.857e+05 | 2.349e+05 | 0.822
508 | 3.003e+05 | 2.563e+05 | 0.853
534 | 3.279e+05 | 2.725e+05 | 0.831
560 | 3.518e+05 | 2.939e+05 | 0.836
588 | 3.714e+05 | 3.192e+05 | 0.859
618 | 4.060e+05 | 3.463e+05 | 0.853
649 | 4.364e+05 | 3.764e+05 | 0.862
681 | 4.749e+05 | 4.025e+05 | 0.847
715 | 5.051e+05 | 4.299e+05 | 0.851
751 | 5.454e+05 | 4.676e+05 | 0.857
789 | 5.957e+05 | 5.082e+05 | 0.853
828 | 6.200e+05 | 5.443e+05 | 0.878
869 | 6.684e+05 | 5.729e+05 | 0.857
913 | 7.233e+05 | 6.263e+05 | 0.866
959 | 7.839e+05 | 6.921e+05 | 0.883
1006 | 8.351e+05 | 7.267e+05 | 0.870
1057 | 9.022e+05 | 7.913e+05 | 0.877
1110 | 9.536e+05 | 8.405e+05 | 0.881
1165 | 1.021e+06 | 9.029e+05 | 0.884
1224 | 1.105e+06 | 9.705e+05 | 0.878
1285 | 1.203e+06 | 1.052e+06 | 0.874
1349 | 1.287e+06 | 1.129e+06 | 0.877
1416 | 1.364e+06 | 1.193e+06 | 0.874
1487 | 1.476e+06 | 1.318e+06 | 0.893
1562 | 1.586e+06 | 1.404e+06 | 0.885
1640 | 1.648e+06 | 1.515e+06 | 0.919
1722 | 1.773e+06 | 1.627e+06 | 0.918
1808 | 1.914e+06 | 1.738e+06 | 0.908
1898 | 2.018e+06 | 1.887e+06 | 0.935
1993 | 2.177e+06 | 2.020e+06 | 0.928
2093 | 2.314e+06 | 2.204e+06 | 0.952
2198 | 2.444e+06 | 2.335e+06 | 0.955
2308 | 2.626e+06 | 2.516e+06 | 0.958
2423 | 2.709e+06 | 2.812e+06 | 1.038
2544 | 2.978e+06 | 3.015e+06 | 1.012
2671 | 3.158e+06 | 3.269e+06 | 1.035
2805 | 3.331e+06 | 3.478e+06 | 1.044
2945 | 3.706e+06 | 3.715e+06 | 1.002
3093 | 3.948e+06 | 3.948e+06 | 1.000
3247 | 4.130e+06 | 4.280e+06 | 1.036
3410 | 4.487e+06 | 4.557e+06 | 1.016
3580 | 4.603e+06 | 4.973e+06 | 1.080
3759 | 5.056e+06 | 5.364e+06 | 1.061
3947 | 5.463e+06 | 5.793e+06 | 1.061
4144 | 5.857e+06 | 6.256e+06 | 1.068
4352 | 6.183e+06 | 6.766e+06 | 1.094
4569 | 6.396e+06 | 7.288e+06 | 1.140
4798 | 6.850e+06 | 7.983e+06 | 1.165
david
More information about the gmp-devel
mailing list