[PATCH] T3/T4 sparc shifts, plus more timings
davem at davemloft.net
Sun Mar 31 23:49:23 CEST 2013
From: Torbjorn Granlund <tg at gmplib.org>
Date: Sun, 31 Mar 2013 23:33:36 +0200
> I think they messed up "predicted taken" and "predicted non-taken" at
> the gate level. So for enough iterations, the predictor
> considers--correctly--that the branch will be taken. And then the
> misinterprets it.
> The loop branch back is fast only when it is predicted non-taken.
Such an error would show up as a larger latency.
The final branch result is determined in the E stage of the Ultra-III,
which signals mispredict to the A stage, which is 7 cycles earlier in
the pipeline.
This matches the listed 8 cycle latency of a mispredicted branch in
the chip's instruction latency tables.
So a simple recurring mispredict after a certain number of
iterations doesn't appear to be what we're seeing here.
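To make that concrete, here is a rough sketch of what a recurring mispredict would look like in the timings. Only the 8 cycle penalty comes from the latency tables. The function name, the per-iteration base cost, and the iteration at which mispredicts start are all hypothetical, made up purely for illustration.

```c
/* Mispredict penalty in cycles, per the Ultra-III latency tables
   cited above. */
#define MISPREDICT_PENALTY 8UL

/* Hypothetical model, not a measurement: total cycles for `iters`
   loop iterations costing `base` cycles each, assuming the loop-back
   branch mispredicts on every iteration from `start` (0-based) on. */
static unsigned long
loop_cycles (unsigned long iters, unsigned long base, unsigned long start)
{
  /* Iterations before `start` predict correctly and pay no penalty. */
  unsigned long mispredicts = iters > start ? iters - start : 0;

  return iters * base + mispredicts * MISPREDICT_PENALTY;
}
```

Under this model, every iteration past the onset point costs a flat 8 cycles more than the ones before it, so the per-iteration time would step up by exactly the listed penalty. That step is the signature to compare the measured timings against.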