# mpz reuse test takes too much time

Niels Möller nisse at lysator.liu.se
Sun Jan 1 20:03:27 UTC 2017

```nisse at lysator.liu.se (Niels Möller) writes:

> In the case
>
>   mpn_gcdext(G, S, NULL, A, B)
>
> with size(A) < size(B), IIRC we currently handle that by swapping
> arguments, calling mpn_gcdext to compute T (even though the caller
> didn't need that, and produce S with an multiplication using multiply +
> exact division.

I've looked a bit more at this, and it's not at clear to me what
strategy one should use.

The cofactors satisfy

G = S A + T B,

where S is the same order of magnitude as B/G, and T is of the same
order of magnitude as A/G. For simplicity, start with the case of small
G so we don't have to care about its effect on the sizes, and assume
that B is n limbs, A is m limbs, and n > m. Then we'd also expect S to
be n limbs and T to be m limbs.

Now we have two possible strategies:

1. Call mpn_gcdext (B, A), which produces T. And compute S as

S = (G - T B) / A

that's one n x m multiply for T B, and one (n + m) / m exact division.

2. First do a division up front,

B = Q A + R

and call mpn_gcdext(A, R). We then get a cofactor S', of m limbs. But
we also need the other cofactor T', because the wanted cofactor S is

S = S' - Q T'

That's one (n - m) x m multiply. To get T', we have to use

T' = (G - S' A) / R

Which is a m x m multiply and an 2m / m exact division. I had a quick
look at also substituting Q = (B - R)/A into the equations, but that

(And we don't count the cost of the initial division, bacause it will
be needed for strategy (1) too, done by mpn_gcdext).

So the question is, which cost is smallest,

MUL(n, m) + DIVE(n + m, m), or

MUL(m, m) + MUL(n - m, m) + DIVE(2m, m) ?

MUL(n, m) < MUL(m, m) + MUL(n-m, m)

or am I confused? But DIVE(n + m, m) > DIVE(2m, m), since the quotient
is larger.

And a third strategy could be to extend mpn_gcdext to support A < B, and
hence compute the larger cofactor directly. But I'd guess that would be
more work, since gcdext uses a quadratic algorithm for much larger sizes
than for multiplication and division.

Regards,
/Niels

--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
```