thresholds

David Newman david.newman at jesus.ox.ac.uk
Wed Apr 28 16:30:38 CEST 2004


I recently ran some "make tune" tests on an Athlon XP 2600 (Barton core)
and found that the results differ slightly from those in k7/gmp-mparam.h.

Below is a summary. Each entry gives five values, in this order:

- default given by k7/gmp-mparam.h
- gcc 3.3.3 with "-O2 -march=athlon-xp -fomit-frame-pointer"
- gcc 3.4 with "-O2 -march=athlon-xp -fomit-frame-pointer"
- gcc 3.4 with "-O3 -march=athlon-xp -fomit-frame-pointer"
- gcc 3.4 with the quite aggressive "-O3 -march=athlon-xp -funroll-loops 
-fomit-frame-pointer -fprefetch-loop-arrays -ffast-math -fforce-addr 
-falign-functions=4 -maccumulate-outgoing-args -frerun-cse-after-loop 
-ftracer"

The tests suggest that a more recent version of gcc and/or more
aggressive CFLAGS don't necessarily give better thresholds. However,
some results differ considerably from the Athlon defaults,
GCDEXT_THRESHOLD being the most obvious (46 by default, 14 to 30 when
tuned).

The difference is not really surprising, though, since the file
k7/gmp-mparam.h has to cover (I think) at least four different types of
chip, from the early 800 MHz models to the most recent Bartons with
512 KB of L2 cache.

I don't imagine the different thresholds affect the speed of GMP greatly,
but it would seem sensible to have them chosen optimally for the machine
in use. If it is feasible, how much work would it take to change the
build process so that it runs "make tune" at build time?
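
As a rough illustration of what these thresholds control, here is a
minimal Python sketch of size-based algorithm selection. It is not GMP
code: the function name and the returned labels are hypothetical, and it
assumes that each threshold is an operand size in limbs at which the next
algorithm takes over. Only the numeric values come from the summary in
this post.

```python
# Hypothetical illustration only: GMP itself is written in C and its real
# dispatch logic handles more cases than this sketch does.

def pick_mul_algorithm(n_limbs, toom3_threshold, fft_threshold):
    """Return a label for the multiplication method used at this size.

    Assumes sizes are counted in limbs and that an operand strictly
    below a threshold stays with the previous algorithm.
    """
    if n_limbs < toom3_threshold:
        return "below Toom-3 (basecase or Karatsuba)"
    if n_limbs < fft_threshold:
        return "Toom-3"
    return "FFT"

# Values taken from this post: the k7/gmp-mparam.h defaults and the
# gcc 3.3.3 -O2 tuning run.
DEFAULT = {"toom3": 202, "fft": 8448}
TUNED = {"toom3": 174, "fft": 9472}

for n in (150, 180, 9000):
    by_default = pick_mul_algorithm(n, DEFAULT["toom3"], DEFAULT["fft"])
    by_tuned = pick_mul_algorithm(n, TUNED["toom3"], TUNED["fft"])
    print(f"{n:5d} limbs: default -> {by_default}; tuned -> {by_tuned}")
```

Under these assumptions, only operands whose size falls between the
default and the tuned value of a threshold are handled differently, which
fits the guess that the overall speed difference is small.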

David Newman



Values are listed in the order given above:
default / gcc 3.3.3 -O2 / gcc 3.4 -O2 / gcc 3.4 -O3 / gcc 3.4 aggressive

MUL_TOOM3_THRESHOLD     202 / 174 / 177 / 173 / 177
SQR_TOOM3_THRESHOLD     226 / 185 / 186 / 182 / 183

DIV_DC_THRESHOLD        92 / 84 / 88 / 84 / 85
POWM_THRESHOLD          142 / 128 / 142 / 134 / 128

GCDEXT_THRESHOLD        46 / 26 / 30 / 28 / 14

MUL_FFT_TABLE           { 816, 1696, 3456, 7680, 22528, 0 }
                        { 784, 1696, 3456, 7680, 22528, 57344, 0 }
                        { 752, 1696, 3200, 7680, 18432, 57344, 0 }
                        { 752, 1696, 3200, 7680, 18432, 57344, 0 }
                        { 784, 1696, 3200, 8704, 18432, 57344, 0 }
MUL_FFT_MODF_THRESHOLD  832 / 800 / 768 / 768 / 776
MUL_FFT_THRESHOLD       8448 / 9472 / 8448 / 7936 / 8960

SQR_FFT_TABLE           { 784, 1760, 3456, 7680, 18432, 40960, 0 }
                        { 752, 1760, 3200, 7680, 18432, 57344, 0 }
                        { 752, 1632, 3456, 7680, 18432, 57344, 0 }
                        { 752, 1696, 3200, 7680, 18432, 57344, 0 }
                        { 784, 1568, 3456, 8704, 18432, 57344, 0 }
SQR_FFT_MODF_THRESHOLD  800 / 768 / 768 / 768 / 800
SQR_FFT_THRESHOLD       8448 / 8448 / 7936 / 7936 / 7936
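
To put a number on how far each tuned value moves from the default, the
scalar thresholds above can be compared with a short Python sketch. The
figures are copied from the summary; the helper names are hypothetical
and the FFT tables are left out because they are not single values.

```python
# Each tuple is (default, gcc 3.3.3 -O2, gcc 3.4 -O2, gcc 3.4 -O3,
# gcc 3.4 aggressive), copied from the summary in this post.
THRESHOLDS = {
    "MUL_TOOM3_THRESHOLD": (202, 174, 177, 173, 177),
    "SQR_TOOM3_THRESHOLD": (226, 185, 186, 182, 183),
    "DIV_DC_THRESHOLD": (92, 84, 88, 84, 85),
    "POWM_THRESHOLD": (142, 128, 142, 134, 128),
    "GCDEXT_THRESHOLD": (46, 26, 30, 28, 14),
    "MUL_FFT_MODF_THRESHOLD": (832, 800, 768, 768, 776),
    "MUL_FFT_THRESHOLD": (8448, 9472, 8448, 7936, 8960),
    "SQR_FFT_MODF_THRESHOLD": (800, 768, 768, 768, 800),
    "SQR_FFT_THRESHOLD": (8448, 8448, 7936, 7936, 7936),
}


def percent_change(default, tuned):
    """Signed change of a tuned value relative to the default, in percent."""
    return 100.0 * (tuned - default) / default


def worst_deviation(values):
    """Largest-magnitude percent change among the four tuned values."""
    default, *tuned = values
    return max((percent_change(default, t) for t in tuned), key=abs)


deviations = {name: worst_deviation(v) for name, v in THRESHOLDS.items()}
most_affected = max(deviations, key=lambda name: abs(deviations[name]))

# Print the thresholds from most to least affected.
for name, pct in sorted(deviations.items(), key=lambda kv: abs(kv[1]),
                        reverse=True):
    print(f"{name:24s} {pct:+6.1f}%")
```

This only measures how far the tuned numbers move, not how much faster or
slower GMP becomes as a result.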

