I've ported GMP to Mac Pro. GMPbench > 7700
jason.worth.martin at gmail.com
Sun Oct 15 14:52:01 CEST 2006
After everything is in cache and the limb count is high enough, I'm
getting 3 clock cycles/limb on Woodcrest and 3.5 clock cycles/limb on
Conroe. Note, however, that to test my code on my Linux Conroe
box, I had to replace the lahf and sahf instructions with bt and setc,
which seem to be a little slower (at least Agner Fog says so). I've
attached my testing code and timing routines so you can see exactly
what I'm doing.
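
For readers who don't have add_n.asm in their head, here is a rough C
sketch of what the loop computes and why the carry has to survive from
one iteration to the next. This is not the attached assembly. The names
ref_add_n and limb_t are made up for illustration, and limb_t is assumed
to be a 64-bit limb. In the assembly the carry lives in the CPU carry
flag, which is what lahf/sahf (or bt/setc) save and restore around the
loop bookkeeping. Here it is simply kept in a variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit limb type (assumption, not GMP's typedef). */
typedef uint64_t limb_t;

/* Add the n-limb numbers {ap,n} and {bp,n}, least significant limb first,
   store the n-limb sum at rp, and return the carry out (0 or 1). */
limb_t ref_add_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t carry = 0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        limb_t s  = ap[i] + carry;   /* add the carry from the previous limb */
        limb_t c1 = (s < carry);     /* 1 if that addition wrapped around */
        limb_t r  = s + bp[i];       /* add the second operand */
        limb_t c2 = (r < s);         /* 1 if that addition wrapped around */

        rp[i] = r;
        carry = c1 | c2;             /* c1 and c2 can never both be 1 */
    }
    return carry;
}
```

The carry variable is the only state passed between iterations, so
whatever the loop counter update does must not disturb it. That is the
whole reason for the flag-saving instructions in the assembly version.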
On 14 Oct 2006 16:18:48 +0200, Torbjorn Granlund <tege at swox.com> wrote:
> "Jason Worth Martin" <martinjw at jmu.edu> writes:
>
> > I've ported GMP to the Mac Pro (and it passes "make check"). I
> > thought you might be interested since I saw some archived posts on
> > the poor Core 2 results. I re-wrote add_n.asm and sub_n.asm and got a
> > nice speed up by unrolling the loop and getting rid of the "inc"
> > instruction. I've attached a tarball with the relevant files and
>
> I've made some experiments too, using the forgotten instruction jrcxz:
>
>         jrcxz   exitloop
>         lea     -1(%rcx), %rcx
>         jmp     beginloop
>
> > I believe that this code could be ported to Core 2 Linux machines, but
> > the GNU assembler doesn't want to let me use the "sahf" and "lahf"
> > instructions to save the flags between loop iterations. The Apple
> > assembler doesn't have a problem with it. I'll see about replacing
> > those instructions with a combination of "bt" and "test" for a Linux
>
> What sort of performance, as measured by tune/speed, do you get for
> your new functions?
"Ever my heart rises as we draw near the mountains.
There is good rock here." -- Gimli, son of Gloin
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 3920 bytes
Desc: not available
Url : http://gmplib.org/list-archives/gmp-devel/attachments/20061015/37d55b37/attachment.bin