GCD project status?

Torbjörn Granlund tg at gmplib.org
Mon Sep 23 22:25:38 UTC 2019


nisse at lysator.liu.se (Niels Möller) writes:

  $ ./tune/speed -c -s1 -p100000 mpn_hgcd2_1 mpn_hgcd2_2 mpn_hgcd2_3 mpn_hgcd2_4 mpn_hgcd2_5 mpn_hgcd2_binary
  overhead 6.02 cycles, precision 100000 units of 8.33e-10 secs, CPU freq 1200.00 MHz
         mpn_hgcd2_1   mpn_hgcd2_2   mpn_hgcd2_3   mpn_hgcd2_4   mpn_hgcd2_5   mpn_hgcd2_binary
  1         #1668.90       1863.72       1670.73       1757.54       1738.50            2044.25

  Had a look at the disassembly for the binary algorithm. The
  double-precision loop needs 20 instructions for just the conditional
  swap logic, 23 for the clz + shift + subtract, and 8 for the shift+add
  updates of the u matrix.

Perhaps keep the to-be-swapped variables in two structs, and instead
conditionally swap pointers to the structs?
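
A rough sketch of what I mean, not taken from the GMP sources.  The
struct layout, the field names and the helpers side_less and
order_sides are all made up for illustration, and 64-bit limbs are
assumed.  Each operand and the matrix entries that travel with it sit
in one struct, and the swap moves only two pointers:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-operand state; field names are illustrative only. */
struct side
{
  uint64_t hi, lo;   /* double-limb value */
  uint64_t u0, u1;   /* matrix entries that travel with the value */
};

/* Return 1 if x < y as double-limb numbers, else 0.  Written with
   bitwise operators so that no short-circuit evaluation is required. */
static int
side_less (const struct side *x, const struct side *y)
{
  return (x->hi < y->hi) | ((x->hi == y->hi) & (x->lo < y->lo));
}

/* Make *pa point at the larger operand and *pb at the smaller one.
   Only the two pointers move; the struct contents stay in place. */
static void
order_sides (struct side **pa, struct side **pb)
{
  struct side *a = *pa, *b = *pb;
  int swap;

  assert (a != b);
  swap = side_less (a, b);
  *pa = swap ? b : a;   /* a compiler may turn this into a conditional move */
  *pb = swap ? a : b;
}
```

Whether this saves anything depends on whether the compiler keeps the
struct fields in registers; with pointers, the fields will likely be
accessed through memory, so it would have to be measured.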

Some measurements with methods 4 and 5 are now in.  Modern Intel CPUs
like method 5, as I had expected.

-- 
Torbjörn
Please encrypt, key id 0xC8601622

