[PATCH] Optimize 32-bit sparc T1 multiply routines.
David Miller
davem at davemloft.net
Sun Jan 6 06:00:24 CET 2013
From: Torbjorn Granlund <tg at gmplib.org>
Date: Sun, 06 Jan 2013 01:25:10 +0100
> For sub_n, I suppose
>
> ldx
> ldx
> xnor (with %g0)
> addxcc
> stx
>
> would be the right mix.
I must be dense, but the implementation below doesn't work:
PROLOGUE(mpn_sub_nc)
        b,a     L(ent)
EPILOGUE()
PROLOGUE(mpn_sub_n)
        mov     0, cy
L(ent): cmp     %g0, cy
L(top): ldx     [up+0], %o4
        add     up, 8, up
        ldx     [vp+0], %o5
        add     vp, 8, vp
        add     rp, 8, rp
        add     n, -1, n
        xnor    %o5, %g0, %o5
        addxccc %o4, %o5, %g3
        brgz    n, L(top)
         stx    %g3, [rp-8]
        retl
         addc   %g0, %g0, %o0
EPILOGUE()
Isn't it the case that this won't generate the correct
carry condition? We need the inverse of the carry
bit that this addxccc generates.
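
To make the concern concrete, here is a minimal C sketch of the arithmetic
(not the SPARC code itself). It models one 64-bit limb two ways: as a plain
subtract-with-borrow, and as u + ~v + carry_in, which is what the
xnor/addxccc pair computes. The function names are made up for
illustration. Under these assumptions the add form only matches the
subtraction when the carry fed in is the inverse of the borrow, and the
carry coming out is likewise the inverse of the borrow out.

```c
#include <stdint.h>

/* Reference: one limb of u - v - borrow_in, reporting borrow-out (0 or 1). */
static uint64_t sub_limb_ref(uint64_t u, uint64_t v, uint64_t borrow_in,
                             uint64_t *borrow_out)
{
    uint64_t d = u - v;
    uint64_t b = (u < v);          /* borrow from u - v */
    b |= (d < borrow_in);          /* borrow from subtracting borrow_in */
    *borrow_out = b;
    return d - borrow_in;
}

/* Same limb as an addition: u + ~v + carry_in, reporting carry-out.
   This mirrors an xnor with zero followed by an add-with-carry. */
static uint64_t sub_limb_via_add(uint64_t u, uint64_t v, uint64_t carry_in,
                                 uint64_t *carry_out)
{
    uint64_t s = u + ~v;
    uint64_t c = (s < u);          /* carry from u + ~v */
    uint64_t r = s + carry_in;
    c |= (r < s);                  /* carry from adding carry_in */
    *carry_out = c;
    return r;
}

/* Returns 1 when the add form, fed the inverted borrow, gives the same
   limb and a carry-out that is the inverse of the reference borrow-out. */
static int carry_is_inverted_borrow(uint64_t u, uint64_t v, uint64_t borrow_in)
{
    uint64_t bo, co;
    uint64_t want = sub_limb_ref(u, v, borrow_in, &bo);
    uint64_t got = sub_limb_via_add(u, v, 1 - borrow_in, &co);
    return got == want && co == 1 - bo;
}
```

If this reading is right, both ends of the posted loop would need an
inversion: the carry flag set up before the first addxccc, and the value
returned after the last one.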