update on mulmid-based inversion

David Harvey dmharvey at cims.nyu.edu
Sat Jan 9 05:43:33 CET 2010


Hi,

Congratulations on the GMP 5 release. I had only been expecting GMP  
4.4, this is a nice new year present :-)

I updated my mulmid-based inversion code for GMP 5, using the new  
mpn_invertappr for the base case. Results are still looking good.  
Table below shows cycles for inversion for 6 <= n <= 5000 on K8.  
Summary: new code starts to win at around n = 20, it's roughly 10-15%  
faster than mpn_invertappr up to about n = 2000, then falls behind  
again. I'll post the code on my website soon.

       | invert_appr | invert_mp | ratio
     6 | 6.290e+02   | 7.910e+02 | 1.258
     7 | 7.670e+02   | 8.020e+02 | 1.046
     8 | 8.080e+02   | 9.050e+02 | 1.120
     9 | 9.230e+02   | 9.600e+02 | 1.040
    10 | 9.620e+02   | 1.101e+03 | 1.144
    11 | 1.126e+03   | 1.141e+03 | 1.013
    12 | 1.141e+03   | 1.285e+03 | 1.126
    13 | 1.317e+03   | 1.339e+03 | 1.017
    14 | 1.394e+03   | 1.494e+03 | 1.072
    15 | 1.581e+03   | 1.530e+03 | 0.968
    16 | 1.653e+03   | 1.698e+03 | 1.027
    17 | 1.716e+03   | 1.730e+03 | 1.008
    18 | 1.922e+03   | 1.893e+03 | 0.985
    19 | 2.135e+03   | 1.973e+03 | 0.924
    20 | 2.219e+03   | 2.138e+03 | 0.963
    21 | 2.217e+03   | 2.200e+03 | 0.992
    22 | 2.367e+03   | 2.372e+03 | 1.002
    23 | 2.533e+03   | 2.475e+03 | 0.977
    24 | 2.789e+03   | 2.650e+03 | 0.950
    25 | 3.175e+03   | 2.742e+03 | 0.864
    27 | 3.290e+03   | 2.960e+03 | 0.900
    28 | 3.352e+03   | 3.207e+03 | 0.957
    30 | 3.637e+03   | 3.502e+03 | 0.963
    31 | 3.772e+03   | 3.613e+03 | 0.958
    33 | 4.078e+03   | 3.855e+03 | 0.945
    34 | 4.399e+03   | 4.069e+03 | 0.925
    36 | 4.957e+03   | 4.451e+03 | 0.898
    38 | 5.123e+03   | 4.752e+03 | 0.928
    40 | 5.508e+03   | 5.100e+03 | 0.926
    42 | 5.884e+03   | 5.320e+03 | 0.904
    44 | 6.260e+03   | 5.710e+03 | 0.912
    46 | 6.650e+03   | 6.081e+03 | 0.914
    48 | 7.053e+03   | 6.566e+03 | 0.931
    51 | 7.686e+03   | 6.996e+03 | 0.910
    53 | 7.959e+03   | 7.508e+03 | 0.943
    56 | 8.786e+03   | 8.150e+03 | 0.928
    59 | 9.481e+03   | 8.852e+03 | 0.934
    62 | 1.025e+04   | 9.343e+03 | 0.912
    65 | 1.088e+04   | 1.029e+04 | 0.946
    68 | 1.141e+04   | 1.092e+04 | 0.957
    72 | 1.258e+04   | 1.207e+04 | 0.960
    75 | 1.350e+04   | 1.324e+04 | 0.981
    79 | 1.460e+04   | 1.457e+04 | 0.998
    83 | 1.546e+04   | 1.560e+04 | 1.009
    87 | 1.666e+04   | 1.672e+04 | 1.003
    92 | 1.842e+04   | 1.772e+04 | 0.962
    96 | 1.938e+04   | 1.930e+04 | 0.996
   101 | 2.125e+04   | 2.068e+04 | 0.973
   106 | 2.305e+04   | 2.164e+04 | 0.939
   112 | 2.484e+04   | 2.415e+04 | 0.972
   117 | 2.705e+04   | 2.569e+04 | 0.950
   123 | 2.936e+04   | 2.791e+04 | 0.950
   129 | 3.174e+04   | 3.058e+04 | 0.963
   136 | 3.424e+04   | 3.264e+04 | 0.953
   143 | 3.768e+04   | 3.578e+04 | 0.949
   150 | 4.080e+04   | 3.852e+04 | 0.944
   157 | 4.407e+04   | 4.239e+04 | 0.962
   165 | 4.745e+04   | 4.588e+04 | 0.967
   173 | 5.152e+04   | 4.916e+04 | 0.954
   182 | 5.616e+04   | 5.231e+04 | 0.931
   191 | 6.131e+04   | 5.777e+04 | 0.942
   201 | 6.676e+04   | 6.196e+04 | 0.928
   211 | 7.273e+04   | 6.518e+04 | 0.896
   221 | 7.890e+04   | 7.157e+04 | 0.907
   232 | 8.683e+04   | 7.495e+04 | 0.863
   244 | 9.409e+04   | 8.118e+04 | 0.863
   256 | 1.058e+05   | 8.790e+04 | 0.831
   269 | 1.113e+05   | 9.594e+04 | 0.862
   283 | 1.223e+05   | 1.031e+05 | 0.843
   297 | 1.310e+05   | 1.123e+05 | 0.857
   312 | 1.438e+05   | 1.219e+05 | 0.847
   327 | 1.520e+05   | 1.307e+05 | 0.860
   344 | 1.677e+05   | 1.408e+05 | 0.839
   361 | 1.770e+05   | 1.502e+05 | 0.849
   379 | 1.940e+05   | 1.637e+05 | 0.844
   398 | 2.066e+05   | 1.768e+05 | 0.856
   418 | 2.225e+05   | 1.829e+05 | 0.822
   439 | 2.376e+05   | 2.022e+05 | 0.851
   461 | 2.579e+05   | 2.229e+05 | 0.864
   484 | 2.857e+05   | 2.349e+05 | 0.822
   508 | 3.003e+05   | 2.563e+05 | 0.853
   534 | 3.279e+05   | 2.725e+05 | 0.831
   560 | 3.518e+05   | 2.939e+05 | 0.836
   588 | 3.714e+05   | 3.192e+05 | 0.859
   618 | 4.060e+05   | 3.463e+05 | 0.853
   649 | 4.364e+05   | 3.764e+05 | 0.862
   681 | 4.749e+05   | 4.025e+05 | 0.847
   715 | 5.051e+05   | 4.299e+05 | 0.851
   751 | 5.454e+05   | 4.676e+05 | 0.857
   789 | 5.957e+05   | 5.082e+05 | 0.853
   828 | 6.200e+05   | 5.443e+05 | 0.878
   869 | 6.684e+05   | 5.729e+05 | 0.857
   913 | 7.233e+05   | 6.263e+05 | 0.866
   959 | 7.839e+05   | 6.921e+05 | 0.883
  1006 | 8.351e+05   | 7.267e+05 | 0.870
  1057 | 9.022e+05   | 7.913e+05 | 0.877
  1110 | 9.536e+05   | 8.405e+05 | 0.881
  1165 | 1.021e+06   | 9.029e+05 | 0.884
  1224 | 1.105e+06   | 9.705e+05 | 0.878
  1285 | 1.203e+06   | 1.052e+06 | 0.874
  1349 | 1.287e+06   | 1.129e+06 | 0.877
  1416 | 1.364e+06   | 1.193e+06 | 0.874
  1487 | 1.476e+06   | 1.318e+06 | 0.893
  1562 | 1.586e+06   | 1.404e+06 | 0.885
  1640 | 1.648e+06   | 1.515e+06 | 0.919
  1722 | 1.773e+06   | 1.627e+06 | 0.918
  1808 | 1.914e+06   | 1.738e+06 | 0.908
  1898 | 2.018e+06   | 1.887e+06 | 0.935
  1993 | 2.177e+06   | 2.020e+06 | 0.928
  2093 | 2.314e+06   | 2.204e+06 | 0.952
  2198 | 2.444e+06   | 2.335e+06 | 0.955
  2308 | 2.626e+06   | 2.516e+06 | 0.958
  2423 | 2.709e+06   | 2.812e+06 | 1.038
  2544 | 2.978e+06   | 3.015e+06 | 1.012
  2671 | 3.158e+06   | 3.269e+06 | 1.035
  2805 | 3.331e+06   | 3.478e+06 | 1.044
  2945 | 3.706e+06   | 3.715e+06 | 1.002
  3093 | 3.948e+06   | 3.948e+06 | 1.000
  3247 | 4.130e+06   | 4.280e+06 | 1.036
  3410 | 4.487e+06   | 4.557e+06 | 1.016
  3580 | 4.603e+06   | 4.973e+06 | 1.080
  3759 | 5.056e+06   | 5.364e+06 | 1.061
  3947 | 5.463e+06   | 5.793e+06 | 1.061
  4144 | 5.857e+06   | 6.256e+06 | 1.068
  4352 | 6.183e+06   | 6.766e+06 | 1.094
  4569 | 6.396e+06   | 7.288e+06 | 1.140
  4798 | 6.850e+06   | 7.983e+06 | 1.165

david



More information about the gmp-devel mailing list