hgcd1/2
Torbjörn Granlund
tg at gmplib.org
Tue Sep 3 13:22:36 UTC 2019
I tested newer Intel systems too (Haswell, Skylake), and they all need
around 25 cycles for a division with n/d = 1.
Intel Goldmont Plus (a current low-end CPU) is better; it needs about 12
cycles. AMD CPUs from the last 10 years all perform OK.
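The post does not say how these figures were measured. As a rough illustration only, a dependent chain of divisions whose quotient is always 1 could be timed as sketched below. The function name `time_unit_quotient_divisions` and the operand values are hypothetical, and the sketch reports CPU time via `clock()` rather than cycles, so converting to cycles would require knowing the clock frequency.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Run `iters` dependent 64-bit divisions whose quotient is always 1.
 * Returns the sum of the quotients (equal to iters) and stores the
 * elapsed CPU time in seconds in *seconds.
 * Hypothetical helper, not the method used for the numbers in the post.
 */
uint64_t time_unit_quotient_divisions(uint64_t iters, double *seconds)
{
    assert(seconds != NULL);

    /* Read the divisor through a volatile so the compiler cannot
       treat the operands as compile-time constants. */
    volatile uint64_t d_src = UINT64_C(0x8000000000000001);
    uint64_t d = d_src;
    uint64_t n = d + 12345;      /* d <= n < 2d, so n / d == 1 */
    uint64_t sum = 0;

    clock_t t0 = clock();
    for (uint64_t i = 0; i < iters; i++) {
        uint64_t q = n / d;      /* the division being timed */
        sum += q;
        n += q - 1;              /* adds 0, but makes the next division
                                    depend on this quotient */
    }
    clock_t t1 = clock();

    *seconds = (double)(t1 - t0) / CLOCKS_PER_SEC;
    return sum;
}
```

Dividing the reported time by the iteration count gives an approximate latency per division. Feeding each quotient back into the next dividend keeps the divisions serialized, so the result reflects latency rather than throughput.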
It is funny that x86 vendors give division so little thought. ARM
clearly got it right. I mean, doing SRT for just the non-zero part of
the quotient cannot be very hard!
(ARM processors before the Cortex-A77 have very poor multiplication, though.)
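The point about handling only the non-zero part of the quotient can be illustrated in software. The sketch below is a plain restoring shift-and-subtract division, not SRT and not a model of any real divider; it only shows that the number of steps can track the quotient's bit length instead of the full word width. The names `bit_length` and `div_skip_zeros` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of significant bits in x (0 for x == 0). */
unsigned bit_length(uint64_t x)
{
    unsigned n = 0;
    while (x != 0) {
        n++;
        x >>= 1;
    }
    return n;
}

/*
 * Restoring shift-and-subtract division, one quotient bit per step.
 * Steps that could only produce leading zero quotient bits are skipped.
 * Requires d != 0. The remainder goes to *rem and the number of
 * steps actually performed goes to *steps.
 */
uint64_t div_skip_zeros(uint64_t n, uint64_t d, uint64_t *rem, unsigned *steps)
{
    assert(d != 0);

    uint64_t q = 0;
    unsigned count = 0;

    if (n >= d) {
        /* The quotient has at most shift + 1 bits, and d << shift
           has the same bit length as n, so it cannot overflow. */
        int shift = (int)(bit_length(n) - bit_length(d));
        for (int i = shift; i >= 0; i--) {
            uint64_t t = d << i;
            q <<= 1;
            if (n >= t) {
                n -= t;
                q |= 1;
            }
            count++;
        }
    }

    *rem = n;
    *steps = count;
    return q;
}
```

With n = 7 and d = 5 (quotient 1) the loop runs once; with n = 100 and d = 7 (quotient 14) it runs five times. A fixed full-width loop would take 64 steps in both cases.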
Vendor  Core  Cycles
AMD     bd1   22
AMD     bd2   15
AMD     bd4   15
AMD     zn1   14
AMD     zn2   14
AMD     bt2   13
Intel   hwl   25
Intel   sky   25
Intel   slm   30
Intel   glm   13
Intel   glm+  12
--
Torbjörn
Please encrypt, key id 0xC8601622