rshift for 64bit Core2

Torbjorn Granlund tg at
Sat Mar 22 15:00:31 CET 2008

nisse at (Niels Möller) writes:

  Peter Cordes <peter at> writes:
  > More unrolling makes the intro loop even worse: it can
  > run up to 7 times, instead of 3.
  An alternative way of organizing the intro is like
    if (n & 1) {
        /* one iteration */
        n--;
    }
    if (n & 2) {
        /* two iterations */
        n -= 2;
    }
    if (n & 4) {
        /* four iterations */
        n -= 4;
    }
    /* Now n is a multiple of 8. */
    for (; n > 0; n -= 8) {
        /* Main loop, 8 iterations at a time */
    }
  Not sure how Torbjörn usually does things.
  But I guess the above might be worse than a plain intro loop due to
  bad branch predictability...
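
For concreteness, the bit-test intro quoted above can be written out as compilable C. This is only a sketch under stated assumptions: `step` and `process` are hypothetical names, and the per-element work (a one-bit right shift of each element, with no carry between elements) is a stand-in chosen for illustration, not the actual GMP rshift code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one iteration of the real loop body. */
static inline void step(uint64_t *dst, const uint64_t *src, size_t i)
{
    dst[i] = src[i] >> 1;
}

/* Process n elements.  The intro handles n mod 8 by testing bits 0, 1
   and 2 of n, so each intro branch is executed exactly once per call. */
void process(uint64_t *dst, const uint64_t *src, size_t n)
{
    size_t i = 0;

    if (n & 1) {                     /* one iteration */
        step(dst, src, i);
        i += 1;
    }
    if (n & 2) {                     /* two iterations */
        step(dst, src, i);
        step(dst, src, i + 1);
        i += 2;
    }
    if (n & 4) {                     /* four iterations */
        step(dst, src, i);
        step(dst, src, i + 1);
        step(dst, src, i + 2);
        step(dst, src, i + 3);
        i += 4;
    }

    /* What remains is a multiple of 8. */
    assert((n - i) % 8 == 0);

    for (; i < n; i += 8) {          /* main loop, 8 iterations at a time */
        step(dst, src, i);
        step(dst, src, i + 1);
        step(dst, src, i + 2);
        step(dst, src, i + 3);
        step(dst, src, i + 4);
        step(dst, src, i + 5);
        step(dst, src, i + 6);
        step(dst, src, i + 7);
    }
}
```

Unlike the pseudocode, this version advances an index instead of decrementing n, so the bit tests all look at the original n; the effect is the same.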

I think the opposite, actually.  With a loop, the predictor has to
predict the loop branch "taken" some number of times, say 5, and then
"not taken" the 6th time; on the next invocation with the same n, it
must immediately predict "taken" again.  This requires a complex branch
predictor that can match a pattern of taken and not-taken outcomes
(taken 5 times, not taken, taken 5 times, not taken, ...).
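
The effect can be illustrated with a toy model. The sketch below assumes a textbook 2-bit saturating counter as the predictor, which is an assumption made for illustration and not a description of the Core 2 hardware; `predict_and_update` and `loop_branch_mispredicts` are hypothetical names. It feeds the counter the outcome pattern of an intro loop branch (taken some number of times, then not taken, repeated per call) and counts the mispredictions.

```c
#include <stdbool.h>

/* 2-bit saturating counter: states 0..3, predict "taken" when state >= 2.
   Updates the state with the real outcome and returns true on a
   misprediction. */
static bool predict_and_update(unsigned *state, bool taken)
{
    bool predicted_taken = (*state >= 2);

    if (taken) {
        if (*state < 3)
            (*state)++;
    } else {
        if (*state > 0)
            (*state)--;
    }
    return predicted_taken != taken;
}

/* Count mispredictions of the loop branch over `calls` invocations, each
   of which sees the branch taken `taken_per_call` times and then not
   taken once. */
unsigned loop_branch_mispredicts(unsigned taken_per_call, unsigned calls)
{
    unsigned state = 2;   /* assumption: counter starts at "weakly taken" */
    unsigned misses = 0;

    for (unsigned c = 0; c < calls; c++) {
        for (unsigned t = 0; t < taken_per_call; t++)
            misses += predict_and_update(&state, true);
        misses += predict_and_update(&state, false);   /* loop exit */
    }
    return misses;
}
```

In this model, with 5 taken branches per call, the loop exit is mispredicted on every call: the counter never sees enough not-taken outcomes to change its mind, and it has no way to learn the repeating pattern.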

With your variant above, one may get 100% correct predictions with any
branch predictor (assuming consecutive invocations use the same n, of

