Possible new T3-T5 mul_1
Torbjorn Granlund
tg at gmplib.org
Tue Apr 2 21:59:17 CEST 2013
David Miller <davem at davemloft.net> writes:
> main:   sethi   %hi(2800000000), %g5
> 1:      mulx    %g1, %g1, %g1
>         mulx    %g1, %g1, %g1
>         mulx    %g1, %g1, %g1
>         mulx    %g1, %g1, %g1
>         brnz    %g5, 1b
>          dec    %g5
>         retl
>          nop
This runs in 47.234 seconds.
I'd like to understand carry flag renaming,
which we will need for fast addmul_k.
Do things like this run well, i.e., ideally at 5 cycles per iteration?
Some chips perform great register renaming, except that
the carry bit is really just one bit.
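        .global main
main:   save    %sp, -176, %sp              ! new register window and stack frame
        sethi   %hi(2800000000), %g5        ! loop counter
1:      addcc   %g7, %g7, %l0               ! start carry chain 1 (sets carry)
        addxccc %g7, %g7, %l1               ! consumes and produces carry
        addxccc %g7, %g7, %l2
        addxccc %g7, %g7, %l3
        addcc   %g7, %g7, %l4               ! start carry chain 2, independent of chain 1
        addxccc %g7, %g7, %l5
        addxccc %g7, %g7, %l6
        addxccc %g7, %g7, %l7
        brnz    %g5, 1b                     ! loop while the counter is nonzero
         dec    %g5                         ! delay slot: decrement counter
        ret
         restore                            ! delay slot: pop register window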
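
To compare a measured run time, such as the 47.234 seconds quoted above,
against a cycles-per-iteration target, the wall-clock time has to be scaled
by the clock frequency and the loop count. The following is only a minimal
sketch of that arithmetic. The clock frequency is not stated in this message,
so the 2.85e9 value is a placeholder assumption, and the function names are
made up for illustration. It also assumes the 5-cycle target is meant per
loop iteration.

```python
# Sketch: convert a measured run time into cycles per loop iteration.
# ASSUMPTION: clock_hz is not given in the thread; 2.85e9 below is a
# placeholder, so substitute the real clock of the machine under test.

ITERATIONS = 2_800_000_000  # loop count loaded by sethi %hi(2800000000)


def cycles_per_iteration(seconds, clock_hz, iterations=ITERATIONS):
    """Average clock cycles spent per loop iteration."""
    return seconds * clock_hz / iterations


def expected_seconds(cycles, clock_hz, iterations=ITERATIONS):
    """Run time implied by a given cycle count per iteration."""
    return cycles * iterations / clock_hz


if __name__ == "__main__":
    clock_hz = 2.85e9  # hypothetical clock frequency
    per_iter = cycles_per_iteration(47.234, clock_hz)
    # The mulx loop has four dependent multiplies per iteration.
    print(f"mulx loop: {per_iter:.2f} cycles/iteration, "
          f"{per_iter / 4:.2f} cycles per dependent mulx")
    # Run time the carry loop would need to hit 5 cycles per iteration.
    print(f"carry loop at 5 cycles/iteration: "
          f"{expected_seconds(5, clock_hz):.2f} s")
```

With the placeholder clock, the mulx loop works out to roughly 48 cycles per
iteration, but that figure is only as good as the assumed frequency. A carry
loop whose run time is well above the value printed for 5 cycles per
iteration would suggest that the two carry chains are being serialized.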
--
Torbjörn