thresholds
David Newman
david.newman at jesus.ox.ac.uk
Wed Apr 28 16:30:38 CEST 2004
I did some "make tune" tests on an Athlon-xp 2600 (Barton core) recently
and found the results differ slightly from those in k7/gmp-mparam.h.
Below is a summary; each entry represents, in turn:
- default given by k7/gmp-mparam.h
- gcc 3.3.3 with "-O2 -march=athlon-xp -fomit-frame-pointer"
- gcc 3.4 with "-O2 -march=athlon-xp -fomit-frame-pointer"
- gcc 3.4 with "-O3 -march=athlon-xp -fomit-frame-pointer"
- gcc 3.4 with the quite aggressive "-O3 -march=athlon-xp -funroll-loops
-fomit-frame-pointer -fprefetch-loop-arrays -ffast-math -fforce-addr
-falign-functions=4 -maccumulate-outgoing-args -frerun-cse-after-loop
-ftracer"
The tests seem to show that having a more recent version of gcc and/or
more aggressive CFLAGS doesn't necessarily mean you get better
thresholds. However, some results differ considerably from the Athlon
defaults, GCDEXT_THRESHOLD being the most obvious.
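To put a number on "differ considerably", here is a minimal sketch of the arithmetic. The helper name percent_change is made up for illustration and is not part of GMP or its tune program.

```c
#include <assert.h>

/* Hypothetical helper, not part of GMP: percentage change of a tuned
   threshold relative to the default shipped in gmp-mparam.h.
   A negative result means the tuned crossover is lower than the default. */
static double percent_change(long default_value, long tuned_value)
{
    assert(default_value != 0);   /* a zero default has no relative change */
    return 100.0 * (double)(tuned_value - default_value)
                 / (double)default_value;
}
```

With the figures listed at the end of this message, percent_change(46, 14) for GCDEXT_THRESHOLD under the most aggressive flags comes to roughly -70%, while percent_change(92, 85) for DIV_DC_THRESHOLD is only about -8%.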
The difference is not really too surprising though, as the file
k7/gmp-mparam.h has to cover (I think) at least four different types of
chip, from the early 800 MHz models to the most recent Bartons with 512 KB
of L2 cache.
I don't imagine the different thresholds affect the speed of GMP greatly
but it would seem sensible to have them chosen optimally. How much work
would it take to change the compile process to do a "make tune" at
compile time, if this is feasible?
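As a rough illustration of why a shifted threshold probably matters only a little, here is a simplified sketch of a size-based crossover. It assumes the threshold is compared against the operand size in limbs. The names choose_mul, settings_disagree, and the enum constants are invented for this example and are not GMP's internal code; the two values are the MUL_TOOM3_THRESHOLD default and the gcc 3.3.3 result from this message.

```c
#include <stddef.h>

/* Hypothetical constants: MUL_TOOM3_THRESHOLD as shipped (202) and as
   measured with gcc 3.3.3 (174), both taken from this message. */
enum { TOOM3_DEFAULT = 202, TOOM3_TUNED = 174 };

/* Two algorithm classes are enough for the illustration. */
typedef enum { MUL_BELOW_TOOM3, MUL_TOOM3 } mul_algo;

/* Assumed dispatch rule: operands of fewer than toom3_threshold limbs use
   the algorithm below Toom-3; larger operands use Toom-3. */
static mul_algo choose_mul(size_t n, size_t toom3_threshold)
{
    return n < toom3_threshold ? MUL_BELOW_TOOM3 : MUL_TOOM3;
}

/* Nonzero when two threshold settings pick different algorithms for n limbs. */
static int settings_disagree(size_t n, size_t threshold_a, size_t threshold_b)
{
    return choose_mul(n, threshold_a) != choose_mul(n, threshold_b);
}
```

Under this assumption, settings_disagree(n, TOOM3_DEFAULT, TOOM3_TUNED) is nonzero only for n from 174 to 201 limbs, and near a crossover the two algorithms should cost about the same anyway, which would fit the expectation that the overall speed difference is small.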
David Newman
MUL_TOOM3_THRESHOLD     202 / 174 / 177 / 173 / 177
SQR_TOOM3_THRESHOLD     226 / 185 / 186 / 182 / 183
DIV_DC_THRESHOLD        92 / 84 / 88 / 84 / 85
POWM_THRESHOLD          142 / 128 / 142 / 134 / 128
GCDEXT_THRESHOLD        46 / 26 / 30 / 28 / 14
MUL_FFT_TABLE           { 816, 1696, 3456, 7680, 22528, 0 }
                        { 784, 1696, 3456, 7680, 22528, 57344, 0 }
                        { 752, 1696, 3200, 7680, 18432, 57344, 0 }
                        { 752, 1696, 3200, 7680, 18432, 57344, 0 }
                        { 784, 1696, 3200, 8704, 18432, 57344, 0 }
MUL_FFT_MODF_THRESHOLD  832 / 800 / 768 / 768 / 776
MUL_FFT_THRESHOLD       8448 / 9472 / 8448 / 7936 / 8960
SQR_FFT_TABLE           { 784, 1760, 3456, 7680, 18432, 40960, 0 }
                        { 752, 1760, 3200, 7680, 18432, 57344, 0 }
                        { 752, 1632, 3456, 7680, 18432, 57344, 0 }
                        { 752, 1696, 3200, 7680, 18432, 57344, 0 }
                        { 784, 1568, 3456, 8704, 18432, 57344, 0 }
SQR_FFT_MODF_THRESHOLD  800 / 768 / 768 / 768 / 800
SQR_FFT_THRESHOLD       8448 / 8448 / 7936 / 7936 / 7936