Torbjörn Granlund tg at gmplib.org
Sun Sep 8 13:08:06 UTC 2019

  > Should we rename div1 to put it into the GMP name space?  How about
  > mpn_div_11?  (We could encode in its name that it is for use for small
  > q, but I'd suggest that we don't do that.)
  > That would allow for some assembly experiments.

  If you think assembly will make a big difference (which seems likely),
  that makes sense. And define it only when it's faster than a a div
  instruction generated by the compiler?

I believe METHOD 1 could almost always be pointless if METHOD 3's
straight line code is made really efficient.

That should in particular be true if the q = 1 code is removed from
hgcd2, for machines with good multiplication throughput.

  $ ./tune/speed -c -p1000000 -s1 mpn_hgcd2_1 mpn_hgcd2_2 mpn_hgcd2_3
  overhead 6.00 cycles, precision 1000000 units of 8.33e-10 secs, CPU freq 1200.00 MHz
            mpn_hgcd2_1   mpn_hgcd2_2   mpn_hgcd2_3
  1             1492.62       2064.18      #1375.18

  So this is 50% speedup over the old method.

Not bad!  :-)

Comments on the patch and old code:

The normalisation code only works if operands are not already
normalised.  Cannot they be that?

Should be revive the commented-out div2?  It looks good to me, the
comment about that it is slower than the branchy div2 is probably from
the era where branchy div1/div2 was great.  (I suspect that to be when
Vax and m68020 was hot.)

I think there might be a slight bug or inefficiency here (from hgcd2.c):

      ASSERT (ah >= bh);

      ah -= bh;
      if (ah < (CNST_LIMB (1) << (GMP_LIMB_BITS / 2 + 1)))

      if (ah <= bh)
	  /* Use q = 1 */
	  u01 += u00;
	  u11 += u10;
	  mp_double_limb_t rq = div1 (ah, bh);

If, in the the last 'if' above ah = bh, we have as a matter of fact q =
2 (with zero remainder).  If I read the code right that will be
doscovered one iteration later.

If seem (ah < bh) is the correct criterion.

Please encrypt, key id 0xC8601622

More information about the gmp-devel mailing list