[PATCH 1/3] Optimize 32-bit sparc T1 multiply routines.

Torbjorn Granlund tg at gmplib.org
Tue Mar 5 13:42:58 CET 2013

David Miller <davem at davemloft.net> writes:

  	* mpn/sparc32/ultrasparct1/mul_1.asm (mpn_mul_1): Unroll main loop
  	one time, align code on 32-byte boundary, add T2/T3/T4 timings.
  	* mpn/sparc32/ultrasparct1/addmul_1.asm (mpn_addmul_1): Likewise.
  	* mpn/sparc32/ultrasparct1/submul_1.asm (mpn_submul_1): Likewise.

Thanks!  I will be pushing this to the public repo in a few moments.

Note that ALIGN between ASM_START and PROLOGUE is ineffective on this
platform.  If stricter alignment is needed for function starts (but not
loop starts?) then we need to override the default PROLOGUE_cpu.

