GMP 4.3 multiplication performance

Torbjorn Granlund tg at
Sat Jun 6 20:20:35 CEST 2009

Peter Cordes <peter at> writes:

  On Wed, Jun 03, 2009 at 12:09:21PM +0200, Torbjorn Granlund wrote:
  > This particular macro is not currently problem free.  Let's compare
  > lshift+sub_n and submul_1 on some machines:
  >              lshift+sub_n            submul_1
  > athlon64        3.87                     2.5
  > core2           3.3                      4.5
  > pentium4/64     7.3                     14.9
  > ultrasparc 3    7.75                    23
  > power4/ppc970   5.0                     10

  Have you incorporated my mpn_rshift improvements for Core2?  (and
  sped up lshift the same way?)

No, I don't think so.  I recall we had a long conversation about it last
year, but the code used now is written by me, I think.  It runs at 1.25

We should be careful with performance for small operand sizes if we
attempt to shave off more cycle fractions from this, since small
operands are very common.
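
For readers following the comparison quoted above: as I read it, the two
columns are interchangeable only when the multiplier is a power of two,
where submul_1 by 2^k can be replaced by lshift by k followed by sub_n.
Below is a minimal sketch of that equivalence in C.  It assumes 32-bit
limbs, and the sketch_* helpers are hypothetical stand-ins that only
mimic the semantics of mpn_lshift, mpn_sub_n and mpn_submul_1.  They are
not GMP's code and say nothing about cycle counts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t limb_t;  /* assumption: 32-bit limbs, not GMP's mp_limb_t */

/* rp[] = up[] << cnt for 0 < cnt < 32; returns the bits shifted out. */
static limb_t sketch_lshift(limb_t *rp, const limb_t *up, size_t n, unsigned cnt)
{
    assert(cnt >= 1 && cnt < 32);
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t u = up[i];
        rp[i] = (u << cnt) | carry;
        carry = u >> (32 - cnt);
    }
    return carry;
}

/* rp[] = ap[] - bp[]; returns the borrow out (0 or 1).  rp may alias ap. */
static limb_t sketch_sub_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t d = (uint64_t)ap[i] - bp[i] - borrow;
        rp[i] = (limb_t)d;
        borrow = (limb_t)(d >> 63);  /* top bit is set iff the subtraction wrapped */
    }
    return borrow;
}

/* rp[] -= up[] * v; returns the limb borrowed out of the top. */
static limb_t sketch_submul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t p = (uint64_t)up[i] * v + carry;
        limb_t lo = (limb_t)p;
        carry = (limb_t)(p >> 32);
        carry += rp[i] < lo;  /* borrow from subtracting the low product limb */
        rp[i] -= lo;
    }
    return carry;
}

/* Returns 1 if both paths give the same limbs and the same borrow-out. */
static int sketch_paths_agree(unsigned k)
{
    const limb_t u[3] = {0xdeadbeefu, 0x01234567u, 0xffffffffu};
    limb_t r1[3] = {0x11111111u, 0x00000000u, 0x80000000u};
    limb_t r2[3], tmp[3];
    memcpy(r2, r1, sizeof r1);

    /* Path 1: one submul_1 by 2^k. */
    limb_t out1 = sketch_submul_1(r1, u, 3, (limb_t)1 << k);

    /* Path 2: lshift into a temporary, then sub_n. */
    limb_t out2 = sketch_lshift(tmp, u, 3, k);
    out2 += sketch_sub_n(r2, r2, tmp, 3);

    return out1 == out2 && memcmp(r1, r2, sizeof r1) == 0;
}
```

Calling sketch_paths_agree(k) for each k from 1 to 31 should return 1.
Note that the two-step path needs a temporary area and two passes over
the operands, which is part of why its cost for small operands matters.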


