hgcd1/2
Torbjörn Granlund
tg at gmplib.org
Tue Sep 3 08:49:52 UTC 2019
nisse at lysator.liu.se (Niels Möller) writes:
> And this on a laptop with an Intel U4100 (5 years old?), so I'd assume
> it doesn't have a particularly fast div instruction. Should we just
> delete div1 ? On which architectures can we expect it to be beneficial?
> It should be fairly easy to find out, if we define a HGCD_DIV1_METHOD
> known to tuneup, to select between plain division and the div1 function.
Interesting but not too surprising results.
Intel ark doesn't seem to know any processor called "U4100" so I cannot
figure out what generation it belongs to.
IIRC, Intel has not improved plain 64b/64b division since Haswell, which
is older than 5 years.
Again, IIRC, small quotients may result in 16 cycle latency. That's
the lowest possible timing.
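
To make the HGCD_DIV1_METHOD suggestion concrete, here is a rough sketch
of what such a compile-time selector could look like. Only the macro name
comes from the quoted message. The method values 1 and 2, the type qr_t,
and the functions div_plain, div_shift_sub and hgcd_div are hypothetical
names made up for illustration, and the shift-and-subtract loop is a
generic stand-in for a small-quotient routine, not GMP's actual div1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical selector: 1 = plain hardware division,
   2 = shift-and-subtract loop.  A tuning program would pick the value. */
#ifndef HGCD_DIV1_METHOD
#define HGCD_DIV1_METHOD 1
#endif

/* Quotient and remainder of a single-limb division (illustrative type). */
typedef struct { uint64_t q, r; } qr_t;

/* Method 1: let the compiler emit the machine's divide instruction. */
static qr_t div_plain(uint64_t n, uint64_t d)
{
  qr_t res;
  assert(d > 0);
  res.q = n / d;
  res.r = n % d;
  return res;
}

/* Method 2: binary shift-and-subtract.  The loop runs once per quotient
   bit, so it is cheap only when the quotient is small. */
static qr_t div_shift_sub(uint64_t n, uint64_t d)
{
  qr_t res;
  uint64_t q = 0;
  int cnt = 0;

  assert(d > 0);

  /* Align d with n.  d <= n/2 guarantees that 2*d does not overflow. */
  while (d <= (n >> 1)) {
    d <<= 1;
    cnt++;
  }

  /* Produce one quotient bit per iteration, most significant first. */
  for (;;) {
    q <<= 1;
    if (n >= d) {
      n -= d;
      q |= 1;
    }
    if (cnt == 0)
      break;
    d >>= 1;
    cnt--;
  }

  res.q = q;
  res.r = n;
  return res;
}

/* Dispatch on the (hypothetical) tuned method. */
static qr_t hgcd_div(uint64_t n, uint64_t d)
{
#if HGCD_DIV1_METHOD == 1
  return div_plain(n, d);
#else
  return div_shift_sub(n, d);
#endif
}
```

Building the same benchmark once with -DHGCD_DIV1_METHOD=1 and once with
-DHGCD_DIV1_METHOD=2 would then show which variant wins on a given
machine, which is the kind of measurement a tuning program could
automate.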
--
Torbjörn
Please encrypt, key id 0xC8601622