[PATCH] T3/T4 sparc shifts, plus more timings
davem at davemloft.net
Sun Mar 31 19:36:35 CEST 2013
From: Torbjorn Granlund <tg at gmplib.org>
Date: Sun, 31 Mar 2013 05:03:10 +0200
> The lshiftc code runs at 3 c/l on US3, not the claimed 2.5 c/l. I
> suspect also the US1 claim of 2 c/l is invalid.
So I've solved the puzzle of why we get 3 c/l for lshiftc on US3.
If the loop executes one time, it runs in 5 cycles.
But if it executes more than one time, each iteration after the
first executes in 6 cycles.
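
The arithmetic behind those figures can be sketched roughly as below. This is only a toy model, not a measurement. It assumes the loop handles two limbs per iteration (inferred from 5 cycles matching the claimed 2.5 c/l, not stated above), and the function name and parameters are hypothetical.

```python
def cycles_per_limb(iterations, first_iter_cycles=5, later_iter_cycles=6,
                    limbs_per_iter=2):
    """Average cycles per limb when the first iteration is cheaper.

    limbs_per_iter=2 is an assumption inferred from the 2.5 c/l claim.
    """
    if iterations < 1:
        raise ValueError("iterations must be >= 1")
    # First iteration costs first_iter_cycles, every later one later_iter_cycles.
    total_cycles = first_iter_cycles + (iterations - 1) * later_iter_cycles
    return total_cycles / (iterations * limbs_per_iter)


if __name__ == "__main__":
    for n in (1, 2, 10, 1000):
        print(n, cycles_per_limb(n))
```

Under these assumptions a single iteration gives 2.5 c/l, and the average approaches 3 c/l as the iteration count grows, which is consistent with the 3 c/l seen in the timings.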
The reason is that the final cycle:
	srlx	u0, tcnt, %l4
	bge,pt	%xcc, L(top)
	 stx	r1, [n + r1_off]	C WAS: rp - 16
schedules in the first instruction of the loop into that final
cycle:
	not	%l3, %l3
breaking our carefully scheduled code.
I'm going to play around with some things to try and fix this.
Interestingly, UltraSPARC-1 and UltraSPARC-2 would not group the
final cycle of the loop this way, because of their requirement that
integer operations must occur in the first three instructions of