# gcdext_1 questions

Niels Möller nisse at lysator.liu.se
Thu Dec 3 14:14:28 CET 2009

```Torbjorn Granlund <tg at gmplib.org> writes:

> If we decide to make public a function "mpn_gcdext_1" than it should
> probably take {p,n} and a limb as input arguments.  If it takes just two
> limbs, the _1 scheme does not apply.
>
> But that's not actually your question, I suppose.

That's one of the question; gcdext_1 seems like a generally useful
function, and then we should have a public interface to it. Or two
interfaces, maybe.

> OK, so two things change, the input type of the u argument and whether
> the 2nd cofactor is generated.

Exactly.

> But the 2nd cofactor arg could be made large, of course?  (I assume it
> would be non-trivial to avoid quadratic complexity if the 2nd cofactor
> where iteratively updated, therefore it will be computed at the end.)

To compute the large factor, one natural way is to store the quotient
for the initial division (which is a large cost given that mod_1 is much
faster then div_qr_1). Then do a gcdext of two single limb numbers,
producing two single limb cofactors. And then these are combined with
the large quotient to give the large cofactor, an O(n) operation.

>   And in case v is even, one could either compute (g - s u) as a double
>   word number, ...
>
> It is not immediately evident that this is a double-word number.

In the general case, u is a full limb and s is an almost full limb. Then
two limbs are needed to represent s * u.

> The divisions are the problem, let's not worry about multiplications by
> q!

Noted.

> Was the question "can the binary algorithm be coerced into computing the
> canonical cofactors?"?

Precisely. One way to arrange the binary algorithm is to work with
matrices such that

(U;V) = M (u;v)   where det M = 2^k

where U, V are the inputs and u,v are reduced numbers. Stop when u = v =
gcd(U,V). Say M = (a,b;c,d). We get more or less for free that

U / g = a + b, V / g = c + d (that's true also for Euclid, I think)

and inverting M we get

g = (d U - b V) / 2^k = (-c U + a V) / 2^k

The cofactor s is given by

d * 2^-k (mod V/g)

which can be computed by k rounds in a loop which sets either d <- d/2
or d <- (d + (V/g) ) / 2 depending on the least significant bit of d.
(this is essentially redc so another way is to compute a partial inverse
of V/g, byt table lookup, and reduce several bits at a time).

There's some more book-keeping needed (including handling common powers
of two up front), but I think it makes some sense, and it could
hopefully be done without any branches inside the loops.

Regards,
/Niels
--
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.
```