GMPbench results


We have run the benchmark on the highest-frequency machine of each type to which we have convenient access. Scaling the results to lower or higher frequencies should work well, since GMP mainly works out of the caches.

The GMPbench suite is available for download here. To run the benchmarks, you also need to compile gexpr.c and put it somewhere in your path.

GMP exercises a particular set of processor capabilities, widening integer multiplication being the most important one. Processors with poor integer multiply support get worse scores on GMPbench than on other benchmarks.
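To make "widening multiplication" concrete, here is a minimal sketch of a 64x64 to 128-bit product assembled from four 32x32 to 64-bit partial products, which is roughly the extra work needed when the hardware does not deliver the full double-width result directly. The function name `widening_mul_64` is hypothetical and this is not GMP's code; it only illustrates the arithmetic.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def widening_mul_64(a, b):
    """Return (high, low) 64-bit halves of the 128-bit product a * b.

    Hypothetical illustration: a and b are treated as unsigned 64-bit words.
    """
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32

    # Four 32x32 -> 64-bit partial products.
    ll = a_lo * b_lo
    lh = a_lo * b_hi
    hl = a_hi * b_lo
    hh = a_hi * b_hi

    # Sum the middle 32-bit column, including the carry out of ll.
    mid = (ll >> 32) + (lh & MASK32) + (hl & MASK32)

    low = ((mid & MASK32) << 32) | (ll & MASK32)
    high = hh + (lh >> 32) + (hl >> 32) + (mid >> 32)
    return high & MASK64, low


if __name__ == "__main__":
    a, b = 0x123456789ABCDEF0, 0x0FEDCBA987654321
    high, low = widening_mul_64(a, b)
    # Python integers are unbounded, so the result can be checked directly.
    print(((high << 64) | low) == a * b)
```

A processor with a native widening multiply produces `high` and `low` from a single multiply operation, whereas the decomposition above needs four multiplies plus carry handling for every pair of words.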

GMPbench 0.2 isn't a perfect benchmark suite for GMP, but it is much better than GMPbench 0.1.

GMPbench 0.1 measures only multiplication with same-size operands, division, and RSA encryption. GMP typically performs many more operations that are not measured at all. Furthermore, the RSA figures carry too much weight in the score computation.

GMPbench 0.2 adds squaring, multiplication of different-size operands, more variants of division, gcd and gcdext, all in the "base" hierarchy, and pi computation in the "app" hierarchy.

The added benchmarks of GMPbench 0.2 aren't totally fair to older GMP versions; we added measurements of things we thought were important, and then used these to drive generic improvements of GMP 4.3. We sort of tuned-for-the-benchmark, but since we chose the benchmarks to make sense, it isn't as silly as such an exercise would usually be.

GMP 4.3.x GMPbench 0.2 results


| CPU | freq (MHz) | | Compiler/Compilation flags | base: multiply | base: divide | base: gcd | base: gcdext | app: rsa | app: pi | GMPbench | Score/GHz |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Opteron/Athlon64 K10 6MB L3 | 3200 | 64 | "gcc 4.3.3" -O2 -m64 -mtune=k8 | 39826 | 24748 | 5953 | 3729 | 4956 | 37 | 2669 | 834 |
| Core 2 E6400 (65nm) | 2133 | 64 | "gcc 4.2.1" -O2 -m64 | 17981 | 10656 | 3312 | 2035 | 2330 | 18 | 1276 | 598 |
| Itanium 2 | 1300 | 64 | "gcc 4.2.1" -O2 -m64 | 15578 | 7983 | 2260 | 1342 | 1294 | 14.6 | 909 | 699 |
| PowerPC 970 (G5) | 2700 | 64 | "gcc 4.0.1-5367" -O3 -m64 -mcpu=970 | 12364 | 9349 | 2412 | 1489 | 1577 | 14 | 946 | 350 |
| Pentium 4 | 3200 | 64 | "gcc 4.2.1" -O2 -m64 | 10975 | 6968 | 1938 | 1315 | 1468 | 11.3 | 799 | 250 |
| Pentium 4 Northwood | 2600 | 32 | "gcc 4.2.1" -O2 -fomit-frame-pointer -march=pentium4 | 5162 | 2874 | 1163 | 707 | 654 | 6.5 | 394 | 152 |
| UltraSPARC 3 | 1593 | 64 | "gcc 3.4.4" -O2 -m64 | 3733 | 2488 | 956 | 533 | 370 | 5.1 | 286 | 180 |
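The Score/GHz column appears to be the GMPbench score divided by the clock frequency in GHz; the rows above are consistent with that reading. The sketch below reproduces the column for two rows and, assuming the linear frequency scaling suggested in the introduction, estimates a score at a different clock. The function names `score_per_ghz` and `scale_score` are hypothetical helpers, not part of GMPbench.

```python
def score_per_ghz(gmpbench_score, freq_mhz):
    """GMPbench score normalized to a 1 GHz clock."""
    return gmpbench_score / (freq_mhz / 1000.0)


def scale_score(gmpbench_score, measured_mhz, target_mhz):
    """Estimate the score at another clock, assuming linear scaling."""
    return gmpbench_score * target_mhz / measured_mhz


if __name__ == "__main__":
    # Opteron/Athlon64 K10 row: GMPbench 2669 at 3200 MHz.
    print(round(score_per_ghz(2669, 3200)))  # 834, matching the table
    # Core 2 E6400 row: GMPbench 1276 at 2133 MHz.
    print(round(score_per_ghz(1276, 2133)))  # 598, matching the table
    # Rough estimate for a K10 clocked at 2300 MHz instead of 3200 MHz.
    print(round(scale_score(2669, 3200, 2300)))
```

The scaled figure is only an estimate: it ignores memory speed and any other difference between machines besides the core clock.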

GMP 4.3.x GMPbench 0.1 results


| CPU | freq (MHz) | | Compiler/Compilation flags | base: multiply | base: divide | app: rsa | GMPbench | Score/GHz | Optimal (see note) |
|---|---|---|---|---|---|---|---|---|---|
| Opteron/Athlon64 K10 | 2300 | 64 | "gcc 3.4.3" -O2 -m64 -mtune=k8 | 81633 | 42278 | 3606 | 14554 | 6328 | 26000 @ 3.2GHz |
| Opteron/Athlon64 K8/K9 | 2200 | 64 | "gcc 3.4.6" -O2 -m64 -mtune=k8 | 69279 | 40081 | 3232 | 13050 | 5932 | 24000 @ 3.2GHz |
| Core 2 E6400 (65nm) | 2133 | 64 | "gcc 4.2.1" -O2 -m64 -mtune=k8 | 51519 | 24316 | 2314 | 9050 | 4249 | 16000 @ 3.33GHz |
| Pentium 4 | 3200 | 64 | "gcc 3.4.4" -O3 -m64 -mtune=k8 | 31259 | 16412 | 1427 | 5685 | 1777 | 7000 @ 3.8GHz |
| PowerPC 970 (G5) | 1600 | 64 | "gcc 4.0.1 build 5367" -mcpu=970 -O3 | 22119 | 12198 | 916 | 3880 | 2425 | |
| Alpha 21264 | | 64 | | | | | | | |
| Athlon XP | | 32 | | | | | | | |
| Pentium 4 Prescott | | 32 | | | | | | | |
| Pentium 4 Northwood | 2600 | 32 | "gcc 3.4.6" -O2 -fomit-frame-pointer -march=pentium4 | 16133 | 6726 | 680 | 2661 | 1023 | |
| Pentium 3 / Pentium M | | 32 | | | | | | | |
| Atom | 1600 | 64 | "gcc 4.2.1" -O3 -m64 -mtune=k8 | 12471 | 6940 | 457 | 2063 | 1289 | |
| UltraSPARC 3 | 1593 | 64 | | 11066 | 5942 | 370 | 1732 | | |
| PowerPC 7447 (G4) | | 32 | | | | | | | |
| Alpha 21164A | | 64 | | | | | | | |

Notes:
  1. These results are preliminary and based on a snapshot of what will become GMP 4.3. Final results should be somewhat better for certain processors.
  2. The clock frequencies for the above measurements are not the same as those used for GMP 4.2, since we didn't have access to the same hardware. However, we have remeasured some of the 4.2 numbers and updated the GMP 4.2.x table below.
  3. The last column, "Optimal", is an estimate of what could be attained by writing optimized assembly code for this processor.
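One way to read the "Optimal" column is to project the measured Score/GHz to the frequency quoted for the optimal estimate and compare the two. The sketch below does that under two assumptions that the page does not state explicitly: that scores scale linearly with clock frequency, and that the "Optimal" figure is a GMPbench score at the quoted frequency. The helper name `fraction_of_optimal` is hypothetical.

```python
def fraction_of_optimal(score_per_ghz, optimal_score, optimal_ghz):
    """Share of the estimated optimal score reached by the measured code.

    Assumes linear frequency scaling and that optimal_score is a
    GMPbench score at optimal_ghz.
    """
    projected = score_per_ghz * optimal_ghz
    return projected / optimal_score


if __name__ == "__main__":
    # (CPU, Score/GHz, Optimal score, Optimal frequency in GHz),
    # taken from the GMP 4.3.x GMPbench 0.1 table.
    rows = [
        ("Opteron/Athlon64 K10", 6328, 26000, 3.2),
        ("Opteron/Athlon64 K8/K9", 5932, 24000, 3.2),
        ("Core 2 E6400 (65nm)", 4249, 16000, 3.33),
        ("Pentium 4", 1777, 7000, 3.8),
    ]
    for cpu, per_ghz, optimal, ghz in rows:
        share = fraction_of_optimal(per_ghz, optimal, ghz)
        print(f"{cpu}: {share:.0%} of estimated optimal")
```

Under these assumptions the K10 row comes out at roughly 78% of its estimated optimum.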

GMP 4.2.x GMPbench 0.1 results


| CPU | freq (MHz) | | Compiler/Compilation flags | base: multiply | base: divide | app: rsa | GMPbench | Score/GHz | Optimal (see note) |
|---|---|---|---|---|---|---|---|---|---|
| Opteron/Athlon64 K10 | 2300 | 64 | "gcc 3.4.3" -O2 -m64 -mtune=k8 | 43473 | 23880 | 2178 | 8377 | 3642 | |
| Opteron/Athlon64 K8/K9 | 2200 | 64 | "gcc 3.4.6" -O2 -m64 -mtune=k8 | 38362 | 21621 | 1979 | 7549 | 3431 | 20000 @ 3.2GHz |
| Core 2 E6400 (65nm) | 2133 | 64 | "gcc 4.2.1" -O2 -m64 -mtune=k8 | 36902 | 20330 | 2092 | 7570 | 2523 | 12000 @ 3.33GHz |
| PowerPC 970 (G5) | 2700 | 64 | "gcc 4.0.1 build 5367" -mcpu=970 -fast | 27740 | 16500 | 1409 | 5490 | 2033 | 7500 @ 2.7GHz |
| Pentium 4 | 3200 | 64 | "gcc 3.4.4" -O2 -m64 -mtune=k8 | 19425 | 10525 | 929 | 3645 | 1139 | 5000 @ 3.8GHz |
| Alpha 21264 | 1000 | 64 | "gcc 4.1.2" -O3 -mcpu=ev67 | 18703 | 11272 | 913 | 3641 | 3641 | 6000 @ 1.25GHz |
| Itanium 2 | 1600 | 64 | "gcc 4.1.1" -O3 -mtune=itanium2 | 19744 | 10340 | 799 | 3379 | 2112 | 13000 @ 1.6GHz |
| Athlon XP | 2083 | 32 | "gcc 4.0.2" -O2 -fomit-frame-pointer | 15682 | 7902 | 624 | 2636 | 1265 | |
| Pentium 4 Prescott | 3000 | 32 | "gcc 4.0.2" -O2 -fomit-frame-pointer -march=pentium4 | 15123 | 6189 | 675 | 2556 | | 4000 @ 3.8GHz |
| Pentium 4 Northwood | 2600 | 32 | "gcc 3.4.6" -O2 -fomit-frame-pointer -march=pentium4 | 14111 | 5468 | 569 | 2236 | | 3500 @ 3.4GHz |
| Pentium 3 / Pentium M | 1862 | 32 | "gcc 3.4.4" -O2 -fomit-frame-pointer | 11381 | 5286 | 429 | 1824 | | |
| UltraSPARC 3 | 1593 | 64 | "gcc 3.4.4" -O2 -mcpu=ultrasparc | 10597 | 5349 | 368 | 1665 | | |
| HPPA 8800 | 800 | 64 | "cc B.11.X.32509-32512.GP" +DD64 +O2 | 9466 | 3631 | 385 | 1503 | | |
| Atom | 1600 | 64 | "gcc 4.2.1" -O2 -m64 -mtune=k8 | 6737 | 4465 | 320 | 1325 | 828 | |
| PowerPC 7447 (G4) | 1420 | 32 | "gcc 4.1.0" -O2 -mpowerpc -mcpu=7450 | 6080 | 3479 | 247 | 1066 | | |
| Alpha 21164A | 600 | 64 | "gcc 4.1.2" -O3 -mcpu=ev56 | 3964 | 2122 | 179 | 721 | | |
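For machines measured at the same clock frequency under both GMP 4.2.x and GMP 4.3.x, the GMPbench 0.1 scores can be compared directly. The following sketch computes that ratio for a few such rows; the helper name `speedup` is hypothetical, and the comparison is only meaningful where the frequency matches in both tables.

```python
def speedup(new_score, old_score):
    """Ratio of the newer GMPbench score to the older one."""
    return new_score / old_score


if __name__ == "__main__":
    # (CPU, freq MHz, GMPbench with GMP 4.3.x, GMPbench with GMP 4.2.x),
    # restricted to rows measured at the same frequency in both tables.
    rows = [
        ("Opteron/Athlon64 K10", 2300, 14554, 8377),
        ("Opteron/Athlon64 K8/K9", 2200, 13050, 7549),
        ("Core 2 E6400 (65nm)", 2133, 9050, 7570),
        ("Pentium 4", 3200, 5685, 3645),
        ("Atom", 1600, 2063, 1325),
    ]
    for cpu, mhz, new, old in rows:
        print(f"{cpu} @ {mhz} MHz: {speedup(new, old):.2f}x")
```

For the K10 row this gives a ratio of about 1.74. Keep in mind that the compilers and flags differ for some rows, so part of any difference may not come from GMP itself.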


GMP 4.1.x results


| CPU | freq (MHz) | | Compiler/Compilation flags | base: multiply | base: divide | app: rsa | GMPbench | Optimal (see note) |
|---|---|---|---|---|---|---|---|---|
| Opteron/Athlon64 | 2400 | 64 | "gcc 3.4.2" -O2 -mcpu=nocona -funroll-loops (NB! no asm code) | 27321 | 18280 | 1441 | 5675 | |
| PowerPC 970 (G5) | 2500 | 64 | "gcc 3.4" -O3 | 20324 | 12874 | 1110 | 4238 | |
| Opteron/Athlon64 | 2400 | 32 | "gcc 3.3.3" -O2 -fomit-frame-pointer (NB! 32-bit only) | 19127 | 9823 | 802 | 3316 | |
| Alpha 21264 | 1000 | 64 | "gcc 2.9-gnupro-99r1" -O2 | 16813 | 10706 | 782 | 3240 | |
| Pentium 4 | 3200 | 64 | "gcc 4.0.2" -O2 -m64 -mtune=k8 (NB! No asm code) | 15613 | 9186 | 814 | 3122 | |
| Itanium 2 | 1600 | 64 | "gcc 3.4.3" -O2 (NB! Low-quality asm code) | 17046 | 9027 | 749 | 3047 | |
| Athlon XP | 2083 | 32 | "gcc 3.3.2" -O2 -fomit-frame-pointer | 14076 | 7731 | 616 | 2535 | |
| Pentium 4 Northwood | 2800 | 32 | "gcc 3.3.2" -O2 -fomit-frame-pointer -march=pentium4 | 13013 | 5770 | 586 | 2253 | |
| Pentium 4 Prescott | 3000 | 32 | "gcc 3.3.2" -O2 -fomit-frame-pointer -march=pentium4 | 13348 | 5393 | 574 | 2206 | |
| POWER 4 | 1100 | 64 | "gcc 3.2.1" -O2 -maix64 -mpowerpc64 -mtune=power3 | 8951 | 5920 | 478 | 1863 | |
| Pentium 3 / Pentium M | 1862 | 32 | "gcc 3.4.4" -O2 -fomit-frame-pointer | 8125 | 4712 | 393 | 1560 | |
| HPPA 8800 | 800 | 64 | "cc B.11.11.30766" +DD64 +O2 | 9040 | 3724 | 362 | 1450 | |
| UltraSPARC 3 | 1336 | 64 | "gcc 3.4.4" -O2 -m64 -mptr64 -mcpu=v9 | 6111 | 3645 | 265 | 1119 | |
| MIPS R14000 | 500 | 64 | cc 7.3.0 | 5284 | 2819 | 241 | 964 | |
| PowerPC 74x7 (G4) | 1000 | 32 | "gcc 3.3.3" -O2 -mpowerpc | 3453 | 2203 | 165 | 676 | |
| POWER 3 | 475 | 64 | "gcc 2.9-aix51-020209" -maix64 -mpowerpc64 -O2 | 3647 | 2259 | 157 | 671 | |
| Alpha 21164A | 600 | 64 | "gcc 3.2.1" -O2 | 3514 | 2185 | 158 | 663 | |
| VIA C3 Nehemia | 1000 | 32 | "gcc 3.4.2" -O2 -fomit-frame-pointer -march=c3-2 | 2378 | 1314 | 111 | 442 | |
| UltraSPARC 2i | 400 | 64 | "gcc 3.2.2" -O2 -mcpu=ultrasparc | 1971 | 900 | 89 | 343 | |

Please send comments about this page to gmp-discuss@gmplib.org
Copyright 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Free Software Foundation
Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved.