Performance of addlsh_n and sublsh_n
Torbjorn Granlund
tg at gmplib.org
Wed Feb 2 20:14:56 CET 2011
On AMD K8, K9, K10, and Intel Sandy Bridge, addlsh_n and sublsh_n are
slower than addmul_1 and submul_1. The latter pair completely covers
the functionality of the former, except that addmul_1/submul_1 do not
allow a separate lsh source operand and destination operand.
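
To make the overlap concrete, here is a minimal portable sketch, not GMP's
actual code. It assumes 32-bit limbs and uses the hypothetical names
limb_t, ref_addlsh_n and ref_addmul_1. It models addlsh_n as
rp = up + (vp << k) and addmul_1 as rp += vp * m, so that m = 2^k gives
the same result whenever the destination already holds up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type for illustration; real GMP limbs are usually 64 bits. */
typedef uint32_t limb_t;

/* Model of addlsh_n: rp[] = up[] + (vp[] << k), with 0 < k < 32.
   Returns the carry-out limb. */
limb_t ref_addlsh_n(limb_t *rp, const limb_t *up, const limb_t *vp,
                    size_t n, unsigned k)
{
    uint64_t carry = 0;
    assert(k > 0 && k < 32);
    for (size_t i = 0; i < n; i++) {
        /* Shifted limb + addend + incoming carry fits in 64 bits. */
        uint64_t t = ((uint64_t)vp[i] << k) + up[i] + carry;
        rp[i] = (limb_t)t;
        carry = t >> 32;
    }
    return (limb_t)carry;
}

/* Model of addmul_1: rp[] += vp[] * m, accumulating in place.
   Returns the carry-out limb. */
limb_t ref_addmul_1(limb_t *rp, const limb_t *vp, size_t n, limb_t m)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* Product + old destination limb + carry cannot overflow 64 bits. */
        uint64_t t = (uint64_t)vp[i] * m + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = t >> 32;
    }
    return (limb_t)carry;
}
```

In this model, calling ref_addmul_1(rp, vp, n, (limb_t)1 << k) on a
destination that already contains up gives the same limbs and carry as
ref_addlsh_n(rp, up, vp, n, k). The difference is that the addmul_1 form
has to accumulate into its destination, while the addlsh_n form writes
to a separate one.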
Furthermore, on Intel Core2 and Nehalem, addlsh_n is slower than add_n
plus lshift, but the former presumably becomes faster when the operands
are too large to fit in the L1 cache.
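
For reference, the two-pass alternative can be sketched as below. This is
again a portable model under assumptions: 32-bit limbs, a caller-supplied
scratch area tp, and the hypothetical names sketch_lshift, sketch_add_n
and sketch_addlsh_two_pass. The shifted operand is written to tp in one
pass and read back in a second pass, which is the extra memory traffic
that a fused addlsh_n avoids.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type for illustration; real GMP limbs are usually 64 bits. */
typedef uint32_t limb_t;

/* Model of lshift: tp[] = vp[] << k, with 0 < k < 32.
   Returns the bits shifted out of the most significant limb. */
limb_t sketch_lshift(limb_t *tp, const limb_t *vp, size_t n, unsigned k)
{
    limb_t out = 0;
    assert(k > 0 && k < 32);
    for (size_t i = 0; i < n; i++) {
        limb_t v = vp[i];
        tp[i] = (v << k) | out;   /* bring in bits from the limb below */
        out = v >> (32 - k);      /* bits that move into the limb above */
    }
    return out;
}

/* Model of add_n: rp[] = up[] + vp[]. Returns the carry-out (0 or 1). */
limb_t sketch_add_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)up[i] + vp[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}

/* Two-pass rp[] = up[] + (vp[] << k) using scratch space tp[0..n-1]. */
limb_t sketch_addlsh_two_pass(limb_t *rp, const limb_t *up, const limb_t *vp,
                              limb_t *tp, size_t n, unsigned k)
{
    limb_t hi = sketch_lshift(tp, vp, n, k);   /* pass 1: write tp */
    return hi + sketch_add_n(rp, up, tp, n);   /* pass 2: read tp back */
}
```

The sketch shows only the data flow and says nothing about cycle counts.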
We need to speed up addlsh_n and sublsh_n, or disable them for several
processors.
--
Torbjörn