# mpn_invert implementing ApproximateReciprocal.

Torbjorn Granlund tg at gmplib.org
Mon Dec 14 19:56:03 CET 2009

```bodrato at mail.dm.unipi.it writes:

> Marco, I would commit this code if I were you.

I prefer not to commit, when I'm aware of a test failed by the new code.

I think that's a very wise principle!  :-)

And "try mpn_invert" did not work with the approximated inversion
replacing the current one.

> Perhaps we should have separate "mpn_invertappr" (following the naming
> scheme of divappr) and mpn_invert.  But I'd suggest we just let this be
> mpn_invert for now.

I strongly prefer the "two separate" functions option, and I implemented it.

I prefer that too, in particular as you are doing the work.,,

The code I'm going to commit provides, an "exact" function mpn_invert, and
a faster-but-approximated one: mpn_invertappr (with its sub-functions _ni
and _bc).

I hope the included comment is clean enough in explaining the difference.

/*
All the three functions mpn{,_bc,_ni}_invertappr (ip, dp, n, scratch),
take the strictly normalised value {dp,n} (i.e. most significant bit
must be set) as an input, and compute {ip,n}: the approximate
reciprocal of {dp,n}.

Let e = mpn*_invertappr (ip, dp, n, scratch) be the returned value; the
following conditions are satisfied by the output:
0 <= e <= 1;
{dp,n}*(B^n+{ip,n}) < B^{2n} <= {dp,n}*(B^n+{ip,n}+1+e) .
I.e. e=0 means that the result {ip,n} equals the one given by mpn_invert.
e=1 means that the result _may_ be one less than expected.

The _bc version returns e=1 most of the times.
The _ni version should return e=0 most of the time; only 1% of
possible random input should give e=1.

When the exact approximation is needed, i.e. e=0 in the relation above:
{dp,n}*(B^n+{ip,n}) < B^{2n} <= {dp,n}*(B^n+{ip,n}+1) ;
the function mpn_invert (ip, dp, n, scratch) should be used instead.
*/

"Exact approximation" is a bit odd.  The value computed is an
approximation (B^{2n}-1)/D, but is exactly floor((B^{2n}-1)/D).
I.e. no approximation compared to the ideal value.

As far as I can see BINV_MULMOD_BNM1_THRESHOLD is always set to zero, this
means that mulmod_bnm1 is always used when Newton is used... Great! But
the problem is that the threshold for Newton was measured with the slower
non-wrapping version of Newton.
The measured BINV_NEWTON_THRESHOLD is probably too high because of this.

You're right.  Swapping the measurement order is also not right, as the
code is written.  I don't see how to do this easily, unless we assume
mpn_mul will never be relevant here.  I think I'll do that for now.

For mpn_binvert's usage if mul vs mulmod_bmn1 I think we could actually
use a more generally aiming parameter MUL_TO_MULMOD_BNM1_2NXN_THRESHOLD.
(The sub_1 used in binvert will not matter.)  I'll leave that for later.

Thanks a lot for the invert.c contribution!!

--
Torbjörn
```