Performance of addlsh_n and sublsh_n
Rick Hodgin
foxmuldrster at yahoo.com
Thu Feb 3 08:10:17 CET 2011
Sandy Bridge has a flaw. No wonder it's sometimes faster. It's taking short-cuts. :-)
http://www.tomshardware.com/news/sandy-bridge-cougar-point-chipset-sandy-bridge-recall,12123.html
- Rick C. Hodgin
--- On Wed, 2/2/11, Torbjorn Granlund <tg at gmplib.org> wrote:
From: Torbjorn Granlund <tg at gmplib.org>
Subject: Performance of addlsh_n and sublsh_n
To: gmp-devel at gmplib.org
Date: Wednesday, February 2, 2011, 2:14 PM
On AMD K8, K9, K10, and Intel Sandy Bridge, addlsh_n and sublsh_n are
slower than addmul_1 and submul_1. The latter functions completely
cover the functionality of the former, except that
addmul_1/submul_1 do not allow separate lsh source operand and
destination operand.
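
To illustrate the overlap described above, here is a minimal sketch in portable C. It is not GMP's code: limb_t, ref_addlsh_n, ref_addmul_1 and addlsh_matches_addmul are hypothetical names, the limb is assumed to be 32 bits so that a 64-bit intermediate can hold every partial result, and the operand order of ref_addlsh_n (rp = up + (vp << k)) is an assumption about the internal function. The point is only that adding a shifted operand in place gives the same result as addmul_1 with the multiplier 2^k.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit limb type, chosen so uint64_t can hold the partial sums. */
typedef uint32_t limb_t;

/* Assumed semantics: rp[] = up[] + (vp[] << k), 0 < k < 32.
   Returns the carry-out limb. rp may alias up. */
limb_t ref_addlsh_n(limb_t *rp, const limb_t *up, const limb_t *vp,
                    size_t n, unsigned k)
{
    uint64_t carry = 0;
    assert(k > 0 && k < 32);
    for (size_t i = 0; i < n; i++) {
        /* Shifting a limb left by k is multiplying it by 2^k. */
        uint64_t t = ((uint64_t)vp[i] << k) + up[i] + carry;
        rp[i] = (limb_t)t;
        carry = t >> 32;
    }
    return (limb_t)carry;
}

/* rp[] += up[] * v. Returns the carry-out limb. */
limb_t ref_addmul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)up[i] * v + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = t >> 32;
    }
    return (limb_t)carry;
}

/* Returns 1 if in-place addlsh_n agrees with addmul_1 by 2^k on sample data. */
int addlsh_matches_addmul(void)
{
    const limb_t v[3] = {0xFFFFFFFFu, 0x80000001u, 0x12345678u};
    limb_t a[3] = {0xDEADBEEFu, 0xFFFFFFFFu, 0xFFFFFFF0u};
    limb_t b[3] = {0xDEADBEEFu, 0xFFFFFFFFu, 0xFFFFFFF0u};
    unsigned k = 5;

    limb_t ca = ref_addlsh_n(a, a, v, 3, k);            /* a += v << k */
    limb_t cb = ref_addmul_1(b, v, 3, (limb_t)1 << k);  /* b += v * 2^k */

    return ca == cb && a[0] == b[0] && a[1] == b[1] && a[2] == b[2];
}
```

Note that the addmul_1 form can only accumulate into its destination, which is the restriction on separate source and destination operands mentioned above.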
Furthermore, on Intel Core2 and Nehalem, addlsh_n is slower than add_n
plus lshift, but the former presumably becomes faster when operands are
too large to fit in L1 cache.
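
The comparison above can be sketched as follows, again in portable C rather than GMP's assembly. All names (limb_t, ref_lshift, ref_add_n, ref_addlsh_n, two_pass_matches_fused) are hypothetical, and a 32-bit limb is assumed. The two-pass variant writes a temporary and reads it back, so it passes over memory twice, while the fused loop touches each limb once; that is one plausible reason for the fused form to win once operands no longer fit in L1, though nothing here measures it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit limb type, chosen so uint64_t can hold the partial sums. */
typedef uint32_t limb_t;

/* rp[] = up[] << k, 0 < k < 32. Returns the bits shifted out of the top limb. */
limb_t ref_lshift(limb_t *rp, const limb_t *up, size_t n, unsigned k)
{
    limb_t in = 0;  /* bits carried in from the limb below */
    assert(k > 0 && k < 32);
    for (size_t i = 0; i < n; i++) {
        limb_t out = up[i] >> (32 - k);
        rp[i] = (limb_t)(up[i] << k) | in;
        in = out;
    }
    return in;
}

/* rp[] = up[] + vp[]. Returns the carry-out (0 or 1). */
limb_t ref_add_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)up[i] + vp[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}

/* Fused single pass: rp[] = up[] + (vp[] << k). Returns the carry-out limb. */
limb_t ref_addlsh_n(limb_t *rp, const limb_t *up, const limb_t *vp,
                    size_t n, unsigned k)
{
    uint64_t carry = 0;
    assert(k > 0 && k < 32);
    for (size_t i = 0; i < n; i++) {
        uint64_t t = ((uint64_t)vp[i] << k) + up[i] + carry;
        rp[i] = (limb_t)t;
        carry = t >> 32;
    }
    return (limb_t)carry;
}

/* Returns 1 if lshift followed by add_n agrees with the fused loop. */
int two_pass_matches_fused(void)
{
    const limb_t u[4] = {0xFFFFFFFFu, 0xFFFFFFFFu, 0x00000001u, 0xF0000000u};
    const limb_t v[4] = {0x89ABCDEFu, 0xFFFFFFFFu, 0x7FFFFFFFu, 0xC0000003u};
    limb_t tmp[4], r2[4], r1[4];
    unsigned k = 3;

    /* Two passes: shift into a temporary, then add. */
    limb_t c2 = ref_lshift(tmp, v, 4, k);
    c2 += ref_add_n(r2, u, tmp, 4);

    /* One pass: shift and add together. */
    limb_t c1 = ref_addlsh_n(r1, u, v, 4, k);

    for (size_t i = 0; i < 4; i++)
        if (r1[i] != r2[i])
            return 0;
    return c1 == c2;
}
```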
We need to speed up addlsh_n and sublsh_n, or disable them for several
processors.
--
Torbjörn
_______________________________________________
gmp-devel mailing list
gmp-devel at gmplib.org
https://gmplib.org/mailman/listinfo/gmp-devel