hgcd1/2

Torbjörn Granlund tg at gmplib.org
Sat Sep 14 09:36:28 UTC 2019


nisse at lysator.liu.se (Niels Möller) writes:

  > I went ahead and committed that version, replacing the old
  > HGCD2_METHOD=2.  I expect it to be the fastest method on some platforms.

  Will be interesting to see results on thresholds.

Nobody loves the new METHOD 2.  :-(

(Not many machines have reported results yet, but those that did include
some that I expected would prefer METHOD 2.)

We might as well switch the default to METHOD 3 or 1.

  > (We might want to arrange for longlong.h to use lzcnt instead of bsr for
  > modern AMD processors; the initial two count_leading_zeros would
  > then complete in one cycle instead of 8!)

  Looks like you did that too.

Yes, and caused some widespread breakage with the mulx change I also
committed.  (A fix is ready.)

But at least one failure, with ivyfbsd64v12, is related to tuneup.c.
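
As a rough illustration of the count_leading_zeros point quoted above, here
is a minimal sketch.  It is not the longlong.h code; the names clz64_portable
and clz64_builtin are hypothetical.  With GCC or Clang, __builtin_clzll is
typically compiled to a bsr-based sequence by default on x86-64 and to lzcnt
when the build enables it (e.g. -mlzcnt), so which instruction you get is a
build-time choice rather than something visible in the source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: portable reference, one bit per iteration.
   Like bsr, the result is only defined for x != 0. */
static unsigned clz64_portable(uint64_t x)
{
  unsigned n = 0;
  assert(x != 0);
  while (!(x >> 63)) {          /* shift left until the top bit is set */
    x <<= 1;
    n++;
  }
  return n;
}

/* Hypothetical helper: let the compiler pick bsr or lzcnt. */
static unsigned clz64_builtin(uint64_t x)
{
  assert(x != 0);
#if defined(__GNUC__) || defined(__clang__)
  return (unsigned) __builtin_clzll((unsigned long long) x);
#else
  return clz64_portable(x);     /* fallback for other compilers */
#endif
}
```

Both helpers should agree for every nonzero input; any cycle difference
depends on the instruction the compiler selects and on the processor.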

  I've now tried the similar #if:ed out div2 code, and enabling it gives
  an 8% speedup on my laptop.

Nice!

  Next, I think we should go ahead with the rename HGCD2_METHOD to
  DIV11_METHOD or possibly HGCD2_DIV1_METHOD.

Feel free.
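
To make the METHOD discussion concrete, here is a sketch of how a
compile-time macro can select between two single-limb division variants.
This is an assumption-laden illustration, not the code in hgcd2.c: the macro
SKETCH_DIV_METHOD, the type qr_t and the function div1_sketch are
hypothetical, and the two variants shown (hardware divide versus
shift-and-subtract) are only stand-ins for whatever the real methods do.

```c
#include <assert.h>
#include <stdint.h>

#ifndef SKETCH_DIV_METHOD          /* hypothetical tuning knob */
#define SKETCH_DIV_METHOD 2
#endif

typedef struct { uint64_t q, r; } qr_t;   /* quotient and remainder */

static qr_t div1_sketch(uint64_t n, uint64_t d)
{
  qr_t res;
  assert(d != 0);
#if SKETCH_DIV_METHOD == 1
  /* Variant 1: rely on the hardware divide instruction. */
  res.q = n / d;
  res.r = n % d;
#else
  /* Variant 2: shift-and-subtract, cheap when the quotient is small. */
  uint64_t dd = d, q = 0, r = n;
  int shift = 0, i;
  /* Align dd just below or at n without overflowing the top bit. */
  while (!(dd >> 63) && (dd << 1) <= n) {
    dd <<= 1;
    shift++;
  }
  for (i = 0; i <= shift; i++) {
    q <<= 1;
    if (r >= dd) {
      r -= dd;
      q |= 1;
    }
    dd >>= 1;
  }
  res.q = q;
  res.r = r;
#endif
  return res;
}
```

Building with -DSKETCH_DIV_METHOD=1 or =2 selects the variant, which is the
kind of per-platform choice a tuning run would then measure.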

-- 
Torbjörn
Please encrypt, key id 0xC8601622

