GMP on Pentium 2

Sat Nov 8 17:49:52 CET 2003

Hmm, nice work!

I remember a book I bought a few years ago about code optimization.  I 
think this is just the kind of 'practical demonstration' that proves 
the importance of checking every single detail of the processor in tight 
loops.

I wonder, though, whether there are other places in the code that could be 
optimized the same way by aligning to quad words.  Who knows, this may give 
us a 'free' speed boost (without modifying the code, just the alignment).
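
To make the "8 mod 16" versus "0 mod 16" distinction from the quoted message 
concrete, here is a small sketch in C.  The helper names offset_mod16 and 
align_up_16 are made up for illustration and are not part of GMP; the sketch 
only shows the address arithmetic, not how the assembler or linker actually 
places the loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: offset of an address within its 16-byte block.
   Returns 0 for a "0 mod 16" address and 8 for an "8 mod 16" address. */
unsigned offset_mod16(uintptr_t addr)
{
    return (unsigned)(addr & (uintptr_t)15);
}

/* Hypothetical helper: round an address up to the next 16-byte boundary.
   An address that is already aligned is returned unchanged.
   Assumes addr is not within 15 bytes of the top of the address space. */
uintptr_t align_up_16(uintptr_t addr)
{
    uintptr_t aligned = (addr + 15) & ~(uintptr_t)15;
    assert((aligned & (uintptr_t)15) == 0);  /* result is 0 mod 16 */
    return aligned;
}
```

For example, a loop start at 0x1008 satisfies the 0 mod 8 alignment the code 
currently forces, but offset_mod16 reports 8 for it, and align_up_16 would 
move it to 0x1010.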

Eric.

>From: Torbjorn Granlund <tg at swox.com>
>To: Patrick Pelissier <Patrick.Pelissier at loria.fr>,gmp-discuss at swox.com
>Subject: Re: GMP on Pentium 2
>Date: 08 Nov 2003 03:26:56 +0100
>
>I found the reason for the 3.7 vs 3.2 cycles/limb performance for
>mpn/x86/aors_n.asm on p6.  Alignment.  If the loop start is at an
>address 8 mod 16, the loop needs 3.7 cycles/limb, but if it is
>aligned 0 mod 16, it needs only 3.2 cycles/limb.  Since the code forces
>just 0 mod 8 alignment, both timing results happen depending on
>where the code ends up being put by the linker.
>
>--
>Torbjörn
