[PATCH] T3/T4 sparc shifts, plus more timings
davem at davemloft.net
Fri Mar 29 04:35:42 CET 2013
From: Torbjorn Granlund <tg at gmplib.org>
Date: Fri, 29 Mar 2013 04:28:22 +0100
> I noticed nand in "vis". But it looks like it operates on fp
> registers. And there might not be any useful shift insn in vis. (We'd
> want generic code anyway, of course.)
Yes, these are all VIS instructions and operate on the FPU registers,
so they perform terribly on the Niagara chips.
VIS3 (T3 and later) has vectored shift instructions.
But all of the VIS stuff only gives us up to 32-bit vectored
operations (e.g. 2 x 32-bit).
Many moons ago when I was working on an iDCT in VIS, I would load
carefully chosen constants into float registers and use the VIS
vectored multiply instructions to perform shifts.
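
To illustrate the multiply-as-shift trick, here is a rough scalar model in
C. It is only a sketch: it does not use real VIS instructions, and the lane
layout (four 16-bit lanes) and the function names are made up for the
example. The idea is that multiplying by 2^k and keeping the low half of
the product is a left shift by k, while multiplying by 2^(16-k) and keeping
the high half is a logical right shift by k.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4   /* assumed layout: four 16-bit lanes in one 64-bit register */

/* Left shift every lane by k: multiply by the constant 2^k and keep
   only the low 16 bits of each product. */
static void lanes_shl_by_mul(uint16_t v[LANES], unsigned k)
{
    assert(k < 16);
    uint16_t c = (uint16_t)(1u << k);          /* the chosen constant */
    for (int i = 0; i < LANES; i++)
        v[i] = (uint16_t)((uint32_t)v[i] * c);
}

/* Logical right shift every lane by k: multiply by 2^(16-k) and keep
   only the high 16 bits of each 32-bit product. */
static void lanes_shr_by_mul(uint16_t v[LANES], unsigned k)
{
    assert(k >= 1 && k <= 16);
    uint32_t c = 1u << (16 - k);               /* the chosen constant */
    for (int i = 0; i < LANES; i++)
        v[i] = (uint16_t)(((uint32_t)v[i] * c) >> 16);
}
```

A vectored multiply does the per-lane loop in one instruction, so the
constant only has to be loaded into a register once.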
> We don't have prefetch in many asm GMP routines yet.
I'd say that we should just let intelligent cpus take care of it
transparently, they can see the access pattern and many of them
I'm pretty sure that current generation x86 cpus do stuff like
On the SPARC side, most don't. The only cases of hardware
auto-prefetching I know of were UltraSPARC-3 and UltraSPARC-4, which
auto-prefetched but only for FPU loads.
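
For comparison, this is roughly what explicit software prefetch looks like
in a limb loop, for chips that do not prefetch on their own. It is a C
sketch, not GMP code: add_n_prefetch and PREFETCH_AHEAD are hypothetical
names, the prefetch distance is a guess that would need tuning per CPU,
and __builtin_prefetch is the GCC/Clang builtin rather than a SPARC
prefetch instruction.

```c
#include <stddef.h>
#include <stdint.h>

#define PREFETCH_AHEAD 64   /* distance in limbs; assumed, needs tuning */

/* rp[0..n-1] = up[0..n-1] + vp[0..n-1]; returns the carry out (0 or 1). */
static uint64_t add_n_prefetch(uint64_t *rp, const uint64_t *up,
                               const uint64_t *vp, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* Issue one prefetch per 8 limbs (64 bytes), and only for
           addresses that are still inside the operands. */
        if ((i & 7) == 0 && i + PREFETCH_AHEAD < n) {
            __builtin_prefetch(up + i + PREFETCH_AHEAD, 0, 0);
            __builtin_prefetch(vp + i + PREFETCH_AHEAD, 0, 0);
        }
        uint64_t s = up[i] + vp[i];
        uint64_t c1 = s < up[i];     /* carry from up[i] + vp[i] */
        uint64_t r = s + carry;
        uint64_t c2 = r < s;         /* carry from adding carry-in */
        rp[i] = r;
        carry = c1 | c2;             /* at most one of them is set */
    }
    return carry;
}
```

On a CPU with working hardware prefetch the extra instructions are just
overhead, which is the argument for leaving them out where possible.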