hgcd1/2

Niels Möller nisse at lysator.liu.se
Tue Sep 17 05:13:18 UTC 2019


nisse at lysator.liu.se (Niels Möller) writes:

> My guess is that special q=1 is often beneficial for the double-limb
> loop, and rarely beneficial in the single-limb loop. But needs some
> measurements.

I've made a quick try deleting it from the single-limb loop. See patch
below. Measurements are a bit noisy, but it looks like a slowdown when I
time it. With hgcd2 time increasing from 1220 cycles to 1290 (this time
measured on broadwell), which seems to be an increase of more than one
cycle per iteration of this loop.

Regards,
/Niels

diff -r f044264e2fe9 mpn/generic/hgcd2.c
--- a/mpn/generic/hgcd2.c	Mon Sep 16 22:18:22 2019 +0200
+++ b/mpn/generic/hgcd2.c	Tue Sep 17 06:46:38 2019 +0200
@@ -389,66 +389,44 @@
   /* Single precision loop */
   for (;;)
     {
+      mp_double_limb_t rq;
+      mp_limb_t q;
+
       ASSERT (ah >= bh);
 
-      ah -= bh;
-      if (ah < (CNST_LIMB (1) << (GMP_LIMB_BITS / 2 + 1)))
-	break;
-
-      if (ah <= bh)
-	{
-	  /* Use q = 1 */
-	  u01 += u00;
-	  u11 += u10;
-	}
-      else
-	{
-	  mp_double_limb_t rq = div1 (ah, bh);
-	  mp_limb_t q = rq.d1;
+      rq = div1 (ah, bh);
+      q = rq.d1;
 	  ah = rq.d0;
 
 	  if (ah < (CNST_LIMB(1) << (GMP_LIMB_BITS / 2 + 1)))
 	    {
-	      /* A is too small, but q is correct. */
+	  /* A is too small. */
+	  q--;
 	      u01 += q * u00;
 	      u11 += q * u10;
 	      break;
 	    }
-	  q++;
 	  u01 += q * u00;
 	  u11 += q * u10;
-	}
+
     subtract_a1:
       ASSERT (bh >= ah);
 
-      bh -= ah;
-      if (bh < (CNST_LIMB (1) << (GMP_LIMB_BITS / 2 + 1)))
-	break;
-
-      if (bh <= ah)
-	{
-	  /* Use q = 1 */
-	  u00 += u01;
-	  u10 += u11;
-	}
-      else
-	{
-	  mp_double_limb_t rq = div1 (bh, ah);
-	  mp_limb_t q = rq.d1;
+      rq = div1 (bh, ah);
+      q = rq.d1;
 	  bh = rq.d0;
 
 	  if (bh < (CNST_LIMB(1) << (GMP_LIMB_BITS / 2 + 1)))
 	    {
-	      /* B is too small, but q is correct. */
+	  /* B is too small. */
+	  q--;
 	      u00 += q * u01;
 	      u10 += q * u11;
 	      break;
 	    }
-	  q++;
 	  u00 += q * u01;
 	  u10 += q * u11;
 	}
-    }
 
  done:
   M->u[0][0] = u00; M->u[0][1] = u01;

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid 368C6677.
Internet email is subject to wholesale government surveillance.


More information about the gmp-devel mailing list