[PATCH] T3/T4 sparc shifts, plus more timings

David Miller davem at davemloft.net
Sun Mar 31 19:36:35 CEST 2013

From: Torbjorn Granlund <tg at gmplib.org>
Date: Sun, 31 Mar 2013 05:03:10 +0200

> The lshiftc code runs at 3 c/l on US3, not the claimed 2.5 c/l.  I
> suspect also the US1 claim of 2 c/l is invalid.

So I've solved the puzzle of why we get 3 c/l for lshiftc on US3.

If the loop executes one time, it runs in 5 cycles.

But if it executes more than one time, each iteration after the
first executes in 6 cycles.
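
To relate those per-iteration cycle counts to the c/l figures quoted
above, here is a minimal sketch of the arithmetic.  It assumes the loop
handles two limbs per iteration, which is only inferred from 5 cycles
corresponding to the claimed 2.5 c/l; the function and parameter names
are made up for illustration.

```python
# Assumption: two limbs per loop iteration, inferred from 5 cycles <-> 2.5 c/l.
LIMBS_PER_ITERATION = 2


def cycles_per_limb(n_iterations, first_cycles=5, later_cycles=6,
                    limbs_per_iteration=LIMBS_PER_ITERATION):
    """Average cycles per limb when the first pass through the loop costs
    first_cycles and every subsequent pass costs later_cycles."""
    if n_iterations < 1:
        raise ValueError("n_iterations must be at least 1")
    total_cycles = first_cycles + (n_iterations - 1) * later_cycles
    return total_cycles / (n_iterations * limbs_per_iteration)
```

Under these assumptions a single pass gives 2.5 c/l, and the average
approaches 3 c/l as the iteration count grows, which matches the
measured number.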

The reason is that the final cycle:

	srlx	u0, tcnt, %l4
	bge,pt	%xcc, L(top)
	 stx	r1, [n + r1_off]	C WAS: rp - 16

has the first instruction of the next loop iteration scheduled into
the same instruction group:

	not	%l3, %l3

breaking our carefully scheduled code.

I'm going to play around with some things to try and fix this.

Interestingly, UltraSPARC-1 and UltraSPARC-2 would not group the
final cycle of the loop this way, because of their requirement that
integer operations must occur in the first three instructions of
a group.
