hgcd1/2

Niels Möller nisse at lysator.liu.se
Sat Sep 7 06:40:41 UTC 2019


tg at gmplib.org (Torbjörn Granlund) writes:

> We have now about half of hosts asking for HGCD_METHOD = 1 as for
> HGCD_METHOD = 3.  Itanic wants HGCD_METHOD 2, and so does beagle (Arm
> a8), but the latter with a preference of just 0.05% (over method 3).

I don't know much aboutitanic performance.

> _METHOD 1 benefits from the special handling of q = 1 before invoking
> DIV1.  _METHOD 2 & 3 are optimised for small q, so implicitly repeats
> the special handling of q = 1.

Would be interesting to try to take that out (except for _METHOD 2).

The current _METHOD 3 code starts with

  if (UNLIKELY ((d0 >> (GMP_LIMB_BITS - 3)) != 0)
      || UNLIKELY (n0 >= (d0 << 3)))

The first condition isn't that unlikely, though, and will get less
unlikely when we change hgcd2 to use div1 in some way also for the
double limb loop. Since the first few iterations of hgcd2 work with
numbers close to filling two limbs.

Coulds do the shifting in the other direction instead:

static inline mp_double_limb_t
div1 (mp_limb_t n0, mp_limb_t d0)
{
  mp_double_limb_t res;
  if (UNLIKELY ((n0 >> 3) >= d0))
    {
      res.d1 = n0 / d0;
      res.d0 = n0 - res.d1 * d0;
    }
  else
    {
      mp_limb_t q, t, mask;

      t = n0 >> 2;
      mask = -(mp_limb_t) (t >= d0);

      q = 4 & mask;
      n0 -= (d0 << 2) & mask;

      t = n0 >> 1;
      mask = -(mp_limb_t) (t >= d0);
      q += 2 & mask;
      n0 -= (d0 << 1) & mask;

      mask = -(mp_limb_t) (n0 >= d0);
      q += 1 & mask;
      n0 -= d0 & mask;

      res.d0 = n0;
      res.d1 = q;
    }
  return res;
}

It has the drawback that it adds dependencies on the n0 path, so not
sure if it can be faster.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


More information about the gmp-devel mailing list