[PATCH 1/3] Optimize 32-bit sparc T1 multiply routines.
Torbjorn Granlund
tg at gmplib.org
Tue Mar 5 13:42:58 CET 2013
David Miller <davem at davemloft.net> writes:
  * mpn/sparc32/ultrasparct1/mul_1.asm (mpn_mul_1): Unroll main loop
    one time, align code on 32-byte boundary, add T2/T3/T4 timings.
  * mpn/sparc32/ultrasparct1/addmul_1.asm (mpn_addmul_1): Likewise.
  * mpn/sparc32/ultrasparct1/submul_1.asm (mpn_submul_1): Likewise.
Thanks! I will be pushing this to the public repo in a few moments.
Note that ALIGN between ASM_START and PROLOGUE is ineffective on this
platform. If stricter alignment is needed for function starts (but not
loop starts?) then we need to override the default PROLOGUE_cpu.
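
For readers unfamiliar with the "unroll main loop one time" change in the
ChangeLog entry, here is a rough C sketch of the idea. It is not the
sparc32 assembly from the patch. It assumes 32-bit limbs, and the function
names mul_1_ref and mul_1_unrolled are made up for illustration. Both
compute what mpn_mul_1 computes: multiply an n-limb number by a single
limb, store the low n limbs, and return the carry-out limb.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference loop: one limb per iteration (hypothetical name). */
static uint32_t mul_1_ref(uint32_t *rp, const uint32_t *up, size_t n,
                          uint32_t v)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* 32x32 -> 64 bit product plus carry cannot overflow 64 bits. */
        uint64_t p = (uint64_t)up[i] * v + carry;
        rp[i] = (uint32_t)p;
        carry = (uint32_t)(p >> 32);
    }
    return carry;
}

/* Same computation, main loop unrolled once: two limbs per iteration
   (hypothetical name).  An odd limb count is peeled off up front. */
static uint32_t mul_1_unrolled(uint32_t *rp, const uint32_t *up, size_t n,
                               uint32_t v)
{
    uint32_t carry = 0;
    size_t i = 0;

    if (n & 1) {
        uint64_t p = (uint64_t)up[0] * v;
        rp[0] = (uint32_t)p;
        carry = (uint32_t)(p >> 32);
        i = 1;
    }
    for (; i < n; i += 2) {
        uint64_t p0 = (uint64_t)up[i] * v + carry;
        rp[i] = (uint32_t)p0;
        uint64_t p1 = (uint64_t)up[i + 1] * v + (uint32_t)(p0 >> 32);
        rp[i + 1] = (uint32_t)p1;
        carry = (uint32_t)(p1 >> 32);
    }
    return carry;
}
```

The unrolled form halves the number of loop-control instructions per limb.
Any actual speedup depends on the target pipeline, which this sketch does
not model.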
--
Torbjörn
More information about the gmp-devel
mailing list