fast inversion

Mon Apr 27 14:45:48 UTC 2015

bodrato at mail.dm.unipi.it writes:

  After a first glance at the code, two lines surprise me:
        mpn_com_n (tp, tp, n);
        mpn_add_1 (tp, tp, n, ONE);
  I wondered why you didn't use
        mpn_neg_n (tp, tp, n);
  Then I tested (on shell at gmplib) and...

  @shell ~/gmp-repo$ tune/speed -s 1-1030 -f 2 -c mpn_neg mpn_com mpn_add_1_inplace.1
  overhead 6.78 cycles, precision 10000 units of 2.86e-10 secs, CPU freq 3500.08 MHz
                mpn_neg       mpn_com mpn_add_1_inplace.1
  1               #5.68         12.54          6.80
  2                9.40         13.65         #8.19
  4               16.25         11.40         #8.22
  8               31.56         16.01         #6.84
  16              61.86         25.10         #8.16
  32             139.01         44.79         #6.80
  64             248.18         85.51         #8.20
  128            472.77        206.21         #8.38
  256            918.75        372.29         #8.21
  512           1915.83        731.53         #6.87
  1024          3689.67       1472.14         #8.29

  ...of course you are right. On many architectures we HAVE_NATIVE_mpn_com,
  which is faster than the C loop when sizes grow.

Perhaps mpn_neg should use a native mpn_com...
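
For reference, the identity behind the two quoted lines is the usual two's complement one: -x = ~x + 1 modulo B^n. Below is a minimal sketch in plain C, not GMP's actual code. The names limb_t, limbs_com, limbs_add_1, limbs_neg and limbs_neg_via_com are hypothetical stand-ins chosen for illustration, and limbs are assumed to be stored least significant first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* hypothetical stand-in for a limb type */

/* rp[i] = ~sp[i]: one's complement of an n-limb number. */
static void limbs_com(limb_t *rp, const limb_t *sp, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        rp[i] = ~sp[i];
}

/* {rp,n} = {sp,n} + b; returns the carry out of the top limb. */
static limb_t limbs_add_1(limb_t *rp, const limb_t *sp, size_t n, limb_t b)
{
    assert(n > 0);
    limb_t carry = b;
    for (size_t i = 0; i < n; i++) {
        limb_t s = sp[i] + carry;
        carry = (s < carry);   /* unsigned wraparound means a carry */
        rp[i] = s;
    }
    return carry;
}

/* Direct negation in one pass: the borrow becomes 1 at the first
   nonzero limb and stays 1 for all higher limbs. */
static void limbs_neg(limb_t *rp, const limb_t *sp, size_t n)
{
    assert(n > 0);
    limb_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t s = sp[i];
        rp[i] = (limb_t)0 - s - borrow;
        borrow |= (s != 0);
    }
}

/* Negation as complement followed by adding one, i.e. the shape of
   the two quoted lines. Works in place (rp == sp). */
static void limbs_neg_via_com(limb_t *rp, const limb_t *sp, size_t n)
{
    limbs_com(rp, sp, n);
    limbs_add_1(rp, rp, n, 1);
}
```

Both routines give the same limbs. The second one makes two passes over the data, but when the complement pass is a fast native loop and the add-one pass usually stops mattering after the first limb, it can still win, which is consistent with the timings above.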

-- 
Torbjörn
Please encrypt, key id 0xC8601622