fast inversion

Mon May 18 09:33:24 UTC 2015

> Now we have:
> 
> @shell ~/gmp-repo$ tune/speed -s 1-1030 -f 2 -c mpn_neg mpn_com
> overhead 6.77 cycles, precision 10000 units of 2.86e-10 secs, CPU freq
> 3500.08 MHz
>               mpn_neg       mpn_com
> 1               #3.41         12.53
> 2               20.38        #13.64
> 4               20.39        #11.38
> 8               23.83        #16.02
> 16              30.63        #25.00
> 32              48.08        #44.81
> 64              88.81        #85.34
> 128           #170.27        208.85
> 256            382.19       #374.52
> 512            747.29       #735.20
> 1024          1480.86      #1472.57
> 
> The new code is faster for n==1, slower for 2 <= n <= 4, and faster (more
> than twice) for n >= 16.

Great to see that mpn_neg has improved!

> > After a first glance to the code, two lines surprise me:
> >       mpn_com_n (tp, tp, n);
> >       mpn_add_1 (tp, tp, n, ONE);
> > I wondered why you didn't use
> >       mpn_neg_n (tp, tp, n);

Shouldn't that be mpn_neg rather than mpn_neg_n? I have put this in
http://www.loria.fr/~zimmerma/papers/invert.c
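
For reference, a minimal sketch of why the two forms agree: complementing
every limb and then adding one gives the same result as a single negation
pass. It uses plain uint64_t arrays as stand-ins for limbs, and the helper
names (limbs_com, limbs_add_1, limbs_neg) are hypothetical, not GMP's API.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-ins for limb routines; these are NOT GMP functions. */

/* rp[i] = ~sp[i] for all n limbs (one's complement). */
static void limbs_com(uint64_t *rp, const uint64_t *sp, size_t n)
{
    for (size_t i = 0; i < n; i++)
        rp[i] = ~sp[i];
}

/* rp = sp + b over n limbs; returns the carry out of the top limb. */
static uint64_t limbs_add_1(uint64_t *rp, const uint64_t *sp, size_t n,
                            uint64_t b)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t s = sp[i] + b;   /* wraps modulo 2^64 */
        b = (s < b);              /* carry into the next limb */
        rp[i] = s;
    }
    return b;
}

/* rp = -sp modulo 2^(64n) in one pass; returns 1 iff sp is nonzero. */
static uint64_t limbs_neg(uint64_t *rp, const uint64_t *sp, size_t n)
{
    uint64_t nonzero_seen = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s = sp[i];
        /* Limbs up to and including the first nonzero one are negated;
           every limb above it is complemented. */
        rp[i] = nonzero_seen ? ~s : (uint64_t)0 - s;
        nonzero_seen |= (s != 0);
    }
    return nonzero_seen;
}
```

The single pass avoids the second traversal that the add-one step needs
whenever the low limbs are zero and the carry has to ripple upwards.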

> Anyway, in your code you should probably write:
>    mpn_com_n (tp + l, tp + l, h);
>    mpn_add_1 (tp + l, tp + l, h, mpn_zero_p (tp, l));

I don't see mpn_zero_p in the API of the current stable version 6.0.0
(according to gmplib.org). In which version will it be available?
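
In the meantime, a plain loop can stand in for it. A minimal sketch of the
idea behind the suggestion, assuming 64-bit limbs in plain uint64_t arrays:
the high h limbs of the negation are the complement of the high part, plus
one only when the low l limbs are all zero. Both helper names (limbs_zero_p,
neg_high_part) are hypothetical and not part of GMP.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical fallback test: returns 1 if all n limbs are zero, else 0. */
static int limbs_zero_p(const uint64_t *p, size_t n)
{
    while (n > 0)
        if (p[--n] != 0)
            return 0;
    return 1;
}

/* Replace the high h limbs of {tp, l + h} by the high h limbs of its
   negation. The low l limbs are only read: they decide whether a carry
   of 1 enters the high part. */
static void neg_high_part(uint64_t *tp, size_t l, size_t h)
{
    uint64_t carry = (uint64_t) limbs_zero_p(tp, l);
    for (size_t i = 0; i < h; i++) {
        uint64_t s = ~tp[l + i] + carry;  /* complement plus carry in */
        carry = (s < carry);              /* wrapped only if it overflowed */
        tp[l + i] = s;
    }
}
```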

Paul