[PATCH] Optimize 32-bit sparc T1 multiply routines.

Torbjorn Granlund tg at gmplib.org
Fri Jan 18 21:22:00 CET 2013

David Miller <davem at davemloft.net> writes:

  While waiting for the FSF to execute my assignment, I tweaked my
  existing 2-way unrolled mul_1 and addmul_1 loops.  Currently on T4 I'm
      mul_1       3.8 cycles/limb
      addmul_1    5.5 cycles/limb

Nice progress!  I still recommend 4-way unrolling for at least the most
critical functions.  :-)
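
For readers unfamiliar with the term, here is a rough illustration of what
4-way unrolling means for a mul_1-style loop. It is a portable C sketch,
not the SPARC assembly discussed in this thread. The name sketch_mul_1,
the limb_t typedef, and the 32-bit limb size are assumptions made for the
example and are not GMP's actual code. The main loop handles four limbs
per iteration so that loop overhead is paid once per four multiplies, and
a short tail loop handles the remaining n mod 4 limbs.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* assumption: 32-bit limbs */

/* Hypothetical helper: rp[0..n-1] = up[0..n-1] * v, returns the carry-out limb. */
limb_t sketch_mul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    uint64_t acc = 0;      /* holds product plus incoming carry; cannot overflow */
    size_t i = 0;

    /* Main loop, unrolled 4 ways: one loop test per four limbs. */
    for (; i + 4 <= n; i += 4) {
        acc += (uint64_t)up[i]     * v;  rp[i]     = (limb_t)acc;  acc >>= 32;
        acc += (uint64_t)up[i + 1] * v;  rp[i + 1] = (limb_t)acc;  acc >>= 32;
        acc += (uint64_t)up[i + 2] * v;  rp[i + 2] = (limb_t)acc;  acc >>= 32;
        acc += (uint64_t)up[i + 3] * v;  rp[i + 3] = (limb_t)acc;  acc >>= 32;
    }

    /* Tail loop: the remaining 0 to 3 limbs. */
    for (; i < n; i++) {
        acc += (uint64_t)up[i] * v;
        rp[i] = (limb_t)acc;
        acc >>= 32;
    }

    return (limb_t)acc;    /* carry out of the most significant limb */
}
```

In hand-written assembly the benefit usually comes less from the saved
branch than from the extra room to interleave independent multiplies and
loads, which a C compiler may or may not reproduce from this sketch.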

