# General mpn_gcd_basecase

Niels Möller nisse at lysator.liu.se
Fri Aug 30 07:25:07 UTC 2019

```tg at gmplib.org (Torbjörn Granlund) writes:

> I switched focus to a general mpn_gcd_basecase, using the table-based
> approach Niels and me discussed a week ot two ago here.
>
> I decided to still keep away from data-dependent branches, which may not
> be the best solution as the operands grow.
>
> Ignoring unlikely cases, we have this loop:
>
>  while (n > 2)
>     {
>       CMP_SWAP (up, vp, n);
>
>       a = up[0];
>       b = vp[0];
>       m = tab[((a & mask) << (4 - 1)) + ((b & mask) >> 1)];

If we allow data dependent branches, would it be unreasonable to go all
the way and make the table a jump table, something like

switch (((a & mask) << (4 - 1)) + ((b & mask) >> 1)) { ...}

?

If we can afford a fair amount of different cases, I find the variant
choosing between alternatives  a±b, a±3b, (a, b) <-- (3 a - 5b, 3a +
5b) etc neat. Shifting at least 4 bits per iteration.

It's tempting to use two's complement to deal with negative values.
I see two complications:

1. We need an efficient way to find out which of |u| and |v| is
smallest, for CMP_SWAP. But we can likely get away with something a
bit sloppy, and accept arbitrary results in the case that u and v are
"close", e.g., same bit size.

2. For the variant without branches, we'd need an mpn_addmul_1s working
with two's complement. It could be implemented as mpn_addmul_1 +
mpn_cnd_sub, but not sure if there's some easy way to implement with
a single pass, somehow sign extending on the fly.

>       cy = mpn_addmul_1 (up, vp, n, m);
>       a = up[0];
>       count_trailing_zeros (cnt, a);

One could move the count_trailing_zeros before mpn_addmul_1 by
duplicating the mul, a = up[0] + m * vp[0]. Would make more sense if we
also have an addmul + shift loop.

>       mpn_rshift (up, up, n, cnt);
>       up[n - 1] |= cy << (GMP_LIMB_BITS - cnt);
>
>       if (mpn_cmp (up, vp, n) == 0)
>          we're done

The large gcd exit case belongs inside the if (UNLIKELY (a == 0)) branch
needed for count_trailing_zeros.

Regards,
/Niels

--
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.
```