GMP 4.3 multiplication performance

Torbjorn Granlund tg at
Sat Jun 6 20:20:35 CEST 2009

Peter Cordes <peter at> writes:

  On Wed, Jun 03, 2009 at 12:09:21PM +0200, Torbjorn Granlund wrote:
  > This particular macro is not currently problem free.  Let's compare
  > lshift+sub_n and submul_1 on some machines:
  >              lshift+sub_n            submul_1
  > athlon64        3.87                     2.5
  > core2           3.3                      4.5
  > pentium4/64     7.3                     14.9
  > ultrasparc 3    7.75                    23
  > power4/ppc970   5.0                     10

  Have you incorporated my mpn_rshift improvements for Core2?  (and
  sped up lshift the same way?)

No, I don't think so.  I recall we had a long conversation about it last
year, but the code used now is written by me, I think.  It runs at 1.25

We should be careful with performance for small operand sizes if we
attempt to shave off more cycle fractions from this, since small
operands are very common.
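
For readers following the comparison quoted above, here is a minimal
sketch of why the two code paths are interchangeable: subtracting
s * 2^k from r can be done either as a shift followed by a subtract, or
as a single submul_1 by the limb 2^k.  The ref_* functions and the
32-bit limb_t below are hypothetical portable stand-ins written for
this illustration; they are not GMP's code and say nothing about the
timings in the table.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs, for portability */
enum { MAXN = 16 };        /* illustrative size limit for agree() */

/* rp = sp << cnt for 0 < cnt < 32; returns the bits shifted out. */
static limb_t ref_lshift(limb_t *rp, const limb_t *sp, size_t n, unsigned cnt)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t s = sp[i];
        rp[i] = (s << cnt) | carry;
        carry = s >> (32 - cnt);
    }
    return carry;
}

/* rp = ap - bp; returns the borrow out (0 or 1). */
static limb_t ref_sub_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t d = (uint64_t)ap[i] - bp[i] - borrow;
        rp[i] = (limb_t)d;
        borrow = (limb_t)(d >> 63);   /* top bit set iff the subtract wrapped */
    }
    return borrow;
}

/* rp -= sp * m; returns the limb borrowed from above the top of rp. */
static limb_t ref_submul_1(limb_t *rp, const limb_t *sp, size_t n, limb_t m)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t p = (uint64_t)sp[i] * m + carry;
        limb_t lo = (limb_t)p;
        carry = (limb_t)(p >> 32);
        carry += rp[i] < lo;          /* borrow from this limb's subtract */
        rp[i] -= lo;
    }
    return carry;
}

/* Returns 1 if both routes give the same limbs and the same borrow,
   0 if they differ, -1 if the arguments are out of range. */
static int agree(const limb_t *r, const limb_t *s, size_t n, unsigned k)
{
    limb_t a[MAXN], b[MAXN], t[MAXN];
    if (n == 0 || n > MAXN || k == 0 || k > 31)
        return -1;
    memcpy(a, r, n * sizeof *r);
    memcpy(b, r, n * sizeof *r);

    /* Route 1: shift, then subtract. */
    limb_t c1 = ref_lshift(t, s, n, k);
    c1 += ref_sub_n(a, a, t, n);

    /* Route 2: one multiply-and-subtract pass. */
    limb_t c2 = ref_submul_1(b, s, n, (limb_t)1 << k);

    return c1 == c2 && memcmp(a, b, n * sizeof *a) == 0;
}
```

Route 1 walks the operands twice and needs a temporary; route 2 walks
them once but pays for a multiply per limb.  Which one wins is machine
dependent, as the quoted numbers show.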

