mpn_invert implementing ApproximateReciprocal.

Torbjorn Granlund tg at gmplib.org
Fri Dec 11 23:48:51 CET 2009


Torbjorn Granlund <tg at gmplib.org> writes:

  Things look really great with it disabled:
  
            mpn_binvert    mpn_invert
  1        #0.000000003   0.000000009
  2        #0.000000004   0.000000013
  4        #0.000000007   0.000000022
  8        #0.000000017   0.000000043
  16       #0.000000056   0.000000113
  32       #0.000000192   0.000000362
  64       #0.000000672   0.000001076
  128      #0.000002453   0.000003193
  256      #0.000007628   0.000009004
  512      #0.000023650   0.000025329
  1024     #0.000068015   0.000068232
  2048     #0.000172617   0.000172637
  4096      0.000425522  #0.000422483
  8192      0.001026843  #0.001011517
  16384     0.015231000  #0.015171000
  32768     0.037683000  #0.036840000
  65536     0.083932000  #0.081554000
  131072    0.200767000  #0.180834000
  262144    0.435655000  #0.391770000
  524288    0.978450000  #0.934971000
  
  Here we see the power of mulmod_bnm1; binvert still uses "old-fashioned"
  FFT wrapping and is therefore beaten for larger operands.
  
I rewrote binvert to use mulmod_bnm1 too.  The old FFT wrapping code was
really nonsense.  I think the code is right now.

New timing:

1        #0.000000002   0.000000009
2        #0.000000004   0.000000013
4        #0.000000007   0.000000022
8        #0.000000017   0.000000043
16       #0.000000056   0.000000113
32       #0.000000192   0.000000368
64       #0.000000675   0.000001079
128      #0.000002474   0.000003239
256      #0.000007652   0.000009034
512      #0.000023715   0.000025460
1024     #0.000062779   0.000068647
2048     #0.000157147   0.000173785
4096     #0.000390786   0.000424830
8192     #0.000953684   0.001019611
16384    #0.015278000   0.015881000
32768    #0.034643000   0.035382000
65536    #0.070699000   0.080126000
131072   #0.177707000   0.181276000
262144   #0.383334000   0.399649000
524288   #0.905740000   0.935362000

It makes sense that binvert is epsilon faster than invert, except that
it is *much* faster for small sizes.

-- 
Torbjörn


More information about the gmp-devel mailing list