GMP on Pentium 2
deltatrinity at hotmail.com
Sat Nov 8 17:49:52 CET 2003
Hmm, nice work!
I remember a book I bought a few years ago about code optimization. I
think this is just the kind of 'practical demonstration' that proves
the importance of checking every single detail of the processor in tight
loops.
I wonder, though, whether there are other places in the code that could be
optimized the same way, by aligning loop starts to 16-byte boundaries (two
quad words). Who knows, this may give us a 'free' speed-boost (without
modifying the code, just the alignment).
>From: Torbjorn Granlund <tg at swox.com>
>To: Patrick Pelissier <Patrick.Pelissier at loria.fr>,gmp-discuss at swox.com
>Subject: Re: GMP on Pentium 2
>Date: 08 Nov 2003 03:26:56 +0100
>I found the reason for the 3.7 vs 3.2 cycles/limb performance for
>mpn/x86/aors_n.asm on p6. Alignment. If the loop start is at an
>address 8 mod 16, the loop needs 3.7 cycles/limb, but if it is
>aligned 0 mod 16, it needs only 3.2 cycles/limb. Since the code forces
>just 0 mod 8 alignment, both timing results happen depending on
>where the code ends up being put by the linker.