GMP «Arithmetic without limitations» GMPbench results


We have run the benchmark on the highest-frequency CPU of each type to which we have convenient access. Scaling the results to lower or higher frequencies should work well, since GMP mainly runs out of the caches.
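As a rough illustration of that frequency scaling, the sketch below normalises a score to a per-GHz figure and rescales it linearly to a different clock. The function names are ours, and the linear model is an assumption that only holds while the workload stays in cache. The sample numbers are the IBM z10 row from the GMPbench 0.2 table further down (score 620 at 4400 MHz).

```python
def score_per_ghz(score: float, freq_mhz: float) -> float:
    """Normalise a GMPbench score to a per-GHz figure."""
    return score / (freq_mhz / 1000.0)


def scale_score(score: float, measured_mhz: float, target_mhz: float) -> float:
    """Estimate the score at another clock, assuming purely linear scaling."""
    return score * target_mhz / measured_mhz


# IBM z10: score 620 measured at 4400 MHz.
print(round(score_per_ghz(620, 4400)))   # matches the table's Score/GHz of 141
print(scale_score(620, 4400, 2200))      # hypothetical half-clock part
```

The estimate says nothing about workloads that spill out of the caches, where memory latency rather than core frequency would dominate.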

The GMPbench suite is available for download. To run the benchmarks, you also need to compile gexpr.c and put it somewhere in your path.

GMP exercises a particular set of processor capabilities, widening integer multiplication being the most important one. Processors with poor integer multiply support get worse scores on GMPbench than on other benchmarks.
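To make "widening multiplication" concrete: it is a multiply whose result is twice as wide as its operands, for example 64 × 64 → 128 bits. The sketch below is purely illustrative and is not GMP code. It shows the extra work needed when only 32 × 32 → 64 products are available, which is roughly what a processor without a good widening multiply has to do. The function name and the 32-bit split are our assumptions.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def mul_64x64_to_128(a: int, b: int) -> tuple[int, int]:
    """Return (high, low) 64-bit halves of a*b using only 32x32->64 products."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32

    # Four partial products, each fitting in 64 bits.
    ll = a_lo * b_lo
    lh = a_lo * b_hi
    hl = a_hi * b_lo
    hh = a_hi * b_hi

    # Sum the middle column together with the carry out of the low product.
    mid = (ll >> 32) + (lh & MASK32) + (hl & MASK32)

    low = (ll & MASK32) | ((mid & MASK32) << 32)
    high = (hh + (lh >> 32) + (hl >> 32) + (mid >> 32)) & MASK64
    return high, low


hi, lo = mul_64x64_to_128(MASK64, MASK64)
print(hex(hi), hex(lo))
```

One hardware widening multiply replaces the four partial products and the carry handling above, which is why its speed matters so much for multi-precision arithmetic.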

GMPbench measures multiplication of same size and different size operands, squaring, division, gcd, and extended gcd, and then RSA encryption and π computation. That's it. GMP typically performs a whole lot more operations which are not measured at all.
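For readers unfamiliar with these categories, the sketch below performs one instance of each using Python's built-in arbitrary-precision integers. It is not GMPbench and does not call GMP. The operand values, the helper `ext_gcd`, and the toy RSA key (n = 3233, e = 17, d = 2753) are chosen only for illustration, and the π computation is left out.

```python
import math


def ext_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0


# Two operands of different sizes (values are arbitrary).
a = 3**200 + 1
b = 7**150 + 3

product = a * b                            # multiplication, different sizes
square = a * a                             # squaring
quotient, remainder = divmod(product, b)   # division
g = math.gcd(a, b)                         # gcd
g2, x, y = ext_gcd(a, b)                   # extended gcd with cofactors

# Toy RSA round trip: modular exponentiation with a tiny textbook key.
n, e, d = 3233, 17, 2753
message = 65
cipher = pow(message, e, n)
recovered = pow(cipher, d, n)
print(recovered == message)
```

Real RSA uses moduli of thousands of bits, which is where the speed of the underlying multiplication and division routines becomes visible.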

Let us make it absolutely clear that GMP exercises just integer instructions, and that GMPbench thus is an integer benchmark. We often see claims to the contrary, i.e. that it is a floating-point benchmark.

GMPbench is a single-threaded benchmark. Most CPUs below can execute multiple threads in parallel using some separate and some shared hardware resources. Most operations in GMP up to quite large operand sizes would scale very well over multi-core systems, but poorly with SMT. A hypothetical multi-threaded GMPbench would therefore give about 8 times the current Ryzen 2700X score but only about 4 times the current Xeon E3-1270v5 score.
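The arithmetic behind that estimate can be sketched as follows. The core counts (8 for the Ryzen 2700X, 4 for the Xeon E3-1270v5) are our reading of the "8 times" and "4 times" figures. The model, with perfect scaling across physical cores and no gain at all from SMT, is a deliberate simplification of "very well" and "poorly". The single-thread scores are taken from the GMPbench column of the first results table below.

```python
def multithreaded_estimate(single_thread_score: float, physical_cores: int) -> float:
    """Estimate a hypothetical multi-threaded score.

    Assumes perfect scaling over physical cores and zero benefit from SMT.
    """
    return single_thread_score * physical_cores


# (single-thread GMPbench score, assumed physical core count)
systems = {
    "AMD Ryzen 2700X": (5843, 8),
    "Intel Xeon E3-1270v5": (5524, 4),
}

for name, (score, cores) in systems.items():
    print(name, multithreaded_estimate(score, cores))
```

Under these assumptions the two parts, which are close in single-thread score, end up roughly a factor of two apart.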

The results below have been measured with software workarounds for the various hardware bugs (Meltdown, Spectre, L1TF, MDS, etc.) at various stages of completeness. Each BIOS was at the newest available version, but the latest vendor-provided microcode was applied only when the BIOS supplied it. We don't have an estimate of the slowdown GMP will suffer once the workarounds are final.

Preliminary measurements for Intel Skylake (and thus Kaby Lake, Coffee Lake, Whiskey Lake, Amber Lake, Fool Ake, etc., which are all just Skylake reiterations) indicate significant slowdown. We see no slowdown for AMD CPUs. There might be newer Intel CPUs in which the bugs have been fixed, but the last time we checked (in mid-2020) the bugs were being discovered faster than they were being fixed. Performance results which are inflated by corners being cut, resulting in a security disaster, are not of great interest. We have therefore crossed out the Intel results.

GMP repo [measured at different times, therefore unfair] GMPbench 0.2 results


Table Footnotes
T "Turbo" enabled
1 Affected by Spectre vulnerability
2 Affected by L1TF vulnerability
3 Affected by Meltdown vulnerability
4 Affected by MDS vulnerability


| CPU | Freq (MHz) | ABI | Compiler / compilation flags | base multiply | base divide | base gcd | base gcdext | app rsa | app pi | GMPbench | Score/GHz | Date measured | Footnotes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Intel i5 12600K (Alder Lake) | 3700-4900 | 64 | "gcc 11.2.0" -O2 -march=alderlake | 142728 | 108294 | 18868 | 12630 | 17328 | 112 | 9294 | | 2022-03-17 | T |
| AMD Ryzen 5800X (Vermeer 32MB L3) | 3800-4700 | 64 | "gcc 9.3.0" -O2 -march=znver2 | 109914 | 94254 | 18942 | 11704 | 14137 | 102 | 8018 | | 2021-01-20 | T |
| Intel Core i5 11600K (Rocket Lake) | 3900-4900 | 64 | "gcc 10.2.0" -O2 -march=skylake -mtune=skylake | 109443 | 92170 | 16545 | 10920 | 15155 | 97.9 | 7910 | | 2021-05-02 | T |
| AMD Ryzen 3700X (Matisse 32MB L3) | 3600-4400 | 64 | "gcc 8.3.0" -O2 -march=znver1 | 92032 | 78266 | 15296 | 9574 | 12671 | 82.3 | 6728 | | 2020-01-31 | T |
| Apple M1 | 3200 | 64 | cc -O2 | 90549 | 71427 | 15463 | 10243 | 9760 | 92.8 | 6422 | | 2021-01-21 | |
| AMD Ryzen 2700X (Pinnacle Ridge 16MB L3) | 3700-4300 | 64 | "gcc 8.3.0" -O2 -march=znver1 | 78989 | 72411 | 13725 | 9021 | 10557 | 69.3 | 5843 | | 2020-01-31 | T,1 |
| Intel Xeon E3-1270v5 (Skylake 8MB L3) | 3600-4000 | 64 | "gcc 8.3.0" -O2 -march=broadwell -mtune=skylake | 73637 | 68354 | 12418 | 8417 | 9824 | 68.5 | 5524 | | 2019-06-12 | T,1,2,3,4 |
| AMD Ryzen 1500X (Summit Ridge 16MB L3) | 3500-3900 | 64 | "gcc 8.3.0" -O2 -march=znver1 | 70432 | 64495 | 12319 | 8135 | 9482 | 62.0 | 5232 | | 2021-01-21 | T,1 |
| Intel Xeon E3-1285Lv4 (Broadwell 6MB L3) | 3400-3800 | 64 | "gcc 6.4.0" -O2 -march=broadwell | 68302 | 64734 | 11944 | 7944 | 9162 | 65 | 5201 | | 2018-04-29 | T,1,2,3,4 |
| Intel Xeon E3-1271v3 (Haswell 8MB L3) | 3600-4000 | 64 | "gcc 6.4.0" -O2 -march=haswell | 67606 | 60910 | 11860 | 7879 | 8351 | 62 | 4967 | | 2018-04-29 | T,1,2,3,4 |
| IBM POWER9 | 3800 | 64 | "gcc 4.8.5" -O2 -mtune=power8 | 64132 | 43449 | 7686 | 5772 | 6899 | 54.9 | 4037 | | 2018-11-29 | 1,3 |
| Intel Xeon E5-1650v2 (Ivy Bridge 12MB L3) | 3500 | 64 | "gcc 4.5.3" -O2 -march=core2 | 52036 | 49885 | 9108 | 5946 | 6756 | 50.7 | 3956 | | 2015-04-30 | 1,2,3,4 |
| Intel Xeon E3-1270 (Sandy Bridge 8MB L3) | 3400-3800 | 64 | "gcc 6.4.0" -O2 -march=sandybridge | 50871 | 49322 | 9295 | 6286 | 6678 | 50.1 | 3935 | | 2018-04-29 | T,1,2,3,4 |
| AMD Phenom 1090T (K10 6MB L3) | 3200-3600 | 64 | "gcc 8.3.0" -O2 -march=amdfam10 | 47268 | 46261 | 7770 | 5254 | 6633 | 47.8 | 3683 | | 2019-06-10 | T,1 |
| AMD FX 8350 (Piledriver 8MB L3) | 4000-4200 | 64 | "gcc 8.3.0" -O2 -march=bdver2 | 39088 | 39187 | 7509 | 4927 | 5123 | 41.2 | 3108 | | 2019-06-10 | T,1 |
| AMD A12-9800 (Excavator 0MB L3) | 3800-4200 | 64 | "gcc 6.4.0" -O2 -march=bdver4 | 38516 | 38604 | 7631 | 5062 | 4845 | 39.7 | 3034 | | 2018-08-08 | T,1 |
| AMD FX 4100 (Bulldozer 8MB L3) | 3600-3800 | 64 | "gcc 6.4.0" -O2 -march=bdver1 | 33277 | 33507 | 6077 | 3973 | 4390 | 35.4 | 2636 | | 2018-04-29 | T,1 |
| Intel Xeon X3470 (Lynnfield 8 MB L3) | 2933-3600 | 64 | "gcc 6.4.0" -O2 -march=corei7 | 32046 | 30480 | 5862 | 3944 | 4138 | 33.1 | 2491 | | 2018-04-29 | T,1,2,3,4 |
| IBM POWER7 / SMT-4 | 3550 | 64 | "gcc 4.8.3" -O3 -mtune=power7 | 33911 | 30647 | 5096 | 3702 | 3916 | 35.3 | 2479 | 698 | 2016-05-22 | 1,3 |
| IBM POWER8 / SMT-4 | 3425 | 64 | "gcc 4.9.2" -O3 -mtune=power8 | 29517 | 28323 | 4788 | 3701 | 3540 | 34.7 | 2310 | 710 | 2016-05-22 | 1,3 |
| Intel Xeon E3110 (Penryn 6 MB L2) | 3000 | 64 | "gcc 4.9.2" -O2 -march=core2 | 27965 | 25338 | 5147 | 3255 | 3496 | 30.0 | 2149 | | 2018-05-14 | T,1,2,3,4 |
| Intel J4105 (Goldmont Plus) | 1500-2400 | 64 | "gcc 8.2.0" -O2 -march=slm | 26486 | 25450 | 4918 | 3605 | 3660 | 28.4 | 2136 | | 2019-04-03 | T,1,3 |
| Intel Atom C3758 (Goldmont) | 2200 | 64 | "gcc 6.4.0" -O2 -march=slm | 20018 | 20035 | 3539 | 2432 | 2970 | 21.9 | 1642 | | 2018-04-29 | 1 |
| Arm X-Gene 1 | 2400 | 64 | "gcc 4.8.4" -O2 | 17208 | 18879 | 3513 | 2337 | 2297 | 19.3 | 1434 | 598 | 2017-03-14 | 1 |
| AMD 5350 (Jaguar) | 2050 | 64 | "gcc 4.9.2" -O2 -march=btver2 | 15335 | 17356 | 2946 | 2023 | 2100 | 17.2 | 1283 | 626 | 2016-06-06 | 1 |
| Arm Cortex-A57 | 2000 | 64 | "gcc 4.8.5" -O2 -fomit-frame-pointer | 14520 | 16147 | 3044 | 1888 | 1850 | 16.9 | 1209 | | 2017-02-19 | 1 |
| Arm Cortex-A72 | 1800 | 64 | "gcc 4.7.3" -O2 -mtune=cortex-a72 | 12831 | 14701 | 3245 | 1944 | 1685 | 15.0 | 1113 | | 2018-12-09 | 1 |
| Arm Cortex-A73 | 1800 | 64 | "gcc 7.4.0" -O2 -mtune=cortex-a72 | 12008 | 13990 | 3129 | 1898 | 1549 | 14.6 | 1057 | | 2019-07-15 | 1 |
| Intel Atom C2758 (Silvermont) | 2400 | 64 | "gcc 6.4.0" -O2 -march=slm | 12097 | 14624 | 2397 | 1549 | 1659 | 14.2 | 1035 | | 2018-05-14 | 1,3,4 |
| AMD E-350 (Bobcat) | 1600 | 64 | "gcc 4.8.4" -O2 -march=btver1 | 10877 | 13004 | 2139 | 1503 | 1603 | 12.7 | 951 | 594 | 2016-06-06 | 1 |
| Intel Pentium 4 (Nocona) | 3400 | 64 | "gcc" -O2 -mtune=nocona | 11684 | 12292 | 2251 | 1508 | 1382 | 12.8 | 924 | | | 1,2,3,4 |
| IBM POWER6 | 3500 | 64 | "xlc" -O2 -qarch=pwr6 | 10561 | 11401 | 2110 | 1292 | 1133 | 13.0 | 841 | 240 | | 1 |
| Arm Cortex-A15 | 2000 | 32 | "gcc 5.2.1" -O2 -fomit-frame-pointer | 10121 | 10078 | 1983 | 1396 | 1162 | 12.2 | 812 | 406 | 2016-03-21 | 1 |
| IBM z10 | 4400 | 64 | "gcc 4.8.4" -O2 -march=z10 | 7268 | 9476 | 1701 | 980 | 818 | 9.09 | 620 | 141 | 2015-04-30 | |
| Arm Cortex-A53 | 1500 | 64 | "gcc 5.3" -O2 -fomit-frame-pointer | 5969 | 8511 | 1658 | 980 | 772 | 7.83 | 558 | 373 | 2016-03-28 | 1 |
| Athlon32 | 2083 | 32 | "gcc 4.5.3" -O2 -fomit-frame-pointer | 6088 | 6934 | 1446 | 950 | 644 | 8.37 | 519 | 249 | 2015-05-18 | |
| Arm Cortex-A9 | 1400 | 32 | "gcc 5.4.9" -O2 -fomit-frame-pointer | 5143 | 5254 | 1091 | 738 | 659 | 6.35 | 433 | | 2019-07-15 | 1 |
| Intel Atom 330 | 1600 | 64 | "gcc 4.4.1" -O2 | 4592 | 5223 | 986 | 589 | 500 | 5.66 | 374 | 234 | | |


GMP 5.0.1 GMPbench 0.2 results


| CPU | Freq (MHz) | ABI | Compiler / compilation flags | base multiply | base divide | base gcd | base gcdext | app rsa | app pi | GMPbench | Score/GHz |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Opteron/Athlon64 K10 6MB L3 | 3200 | 64 | "gcc 4.3.3" -O2 -m64 -mtune=k8 | 40879 | 37617 | 6189 | 3918 | 5226 | 39.4 | 2985 | 933 |
| Core i5 2500 (Sandy Bridge) | 3300 | 64 | "gcc 4.4.5" -O2 -m64 | 38560 | 37308 | 6357 | 4173 | 5173 | 38.1 | 2943 | 891 |
| Core i7 920 (Nehalem) | 2667 | 64 | "gcc 4.2.1" -O2 -m64 | 25737 | 23708 | 4416 | 2904 | 3284 | 26.8 | 1962 | 735 |
| Core 2 E6400 (Conroe) | 2133 | 64 | "gcc 4.2.1" -O2 -m64 | 19209 | 17351 | 3385 | 2097 | 2422 | 20.7 | 1466 | 687 |
| PowerPC 970 ("G5") | 2700 | 64 | "gcc 4.0.1-5370" -O3 -m64 -mtune=970 | 12601 | 14658 | 2579 | 1464 | 1391 | 15 | 1013 | 375 |
| Itanium 2 | 1300 | 64 | "gcc 4.2.4" -O2 -m64 | 16290 | 11175 | 2336 | 1384 | 1207 | 16.1 | 980 | 735 |
| VIA Nano | 1600 | 64 | "gcc 4.3.4" -O2 -m64 | 12202 | 10588 | 2379 | 1420 | 1703 | 12.4 | 950 | 594 |
| Pentium 4 (Nocona) | 3400 | 64 | "gcc 4.2.1" -O2 -m64 | 11457 | 11914 | 2148 | 1357 | 1447 | 12.8 | 915 | 269 |
| AMD Bobcat | 1600 | 64 | "gcc 4.2.1" -O2 | 11115 | 11533 | 2143 | 1331 | 1526 | 11.7 | 895 | 559 |
| Pentium 4 (Northwood) | 2600 | 32 | "gcc 4.2.1" -O2 -fomit-frame-pointer -march=pentium4 | 5538 | 5280 | 1225 | 740 | 666 | 7.5 | 462 | 178 |
| Intel Atom 330 | 1600 | 64 | "gcc 4.4.1" -O2 -m64 | 4422 | 5174 | 896 | 502 | 498 | 5.6 | 362 | 226 |
| UltraSPARC 3 | 1593 | 64 | "gcc 3.4.4" -O2 -m64 | 4125 | 4168 | 926 | 547 | 392 | 5.7 | 330 | 207 |
| Alpha 21264 | 500 | 64 | "gcc 3.3.3" -O2 -mcpu=ev6 | 3492 | 3592 | 666 | 445 | 449 | 4.2 | 286 | 572 |


GMP 4.3.x GMPbench 0.2 results


| CPU | Freq (MHz) | ABI | Compiler / compilation flags | base multiply | base divide | base gcd | base gcdext | app rsa | app pi | GMPbench | Score/GHz |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Opteron/Athlon64 K10 6MB L3 | 3200 | 64 | "gcc 4.3.3" -O2 -m64 -mtune=k8 | 39826 | 24748 | 5953 | 3729 | 4956 | 37 | 2669 | 834 |
| Core i7 920 (Nehalem) | 2667 | 64 | "gcc 4.2.1" -O2 -m64 | 23842 | 14217 | 4375 | 2782 | 3083 | 23 | 1683 | 631 |
| Core 2 E6400 (Conroe) | 2133 | 64 | "gcc 4.2.1" -O2 -m64 | 17981 | 10656 | 3312 | 2035 | 2330 | 18 | 1276 | 598 |
| PowerPC 970 ("G5") | 2700 | 64 | "gcc 4.0.1-5367" -O3 -m64 -mcpu=970 | 12364 | 9349 | 2412 | 1489 | 1577 | 14 | 946 | 350 |
| Itanium 2 | 1300 | 64 | "gcc 4.2.1" -O2 -m64 | 15578 | 7983 | 2260 | 1342 | 1294 | 14.6 | 909 | 699 |
| Pentium 4 (Nocona) | 3200 | 64 | "gcc 4.2.1" -O2 -m64 | 10975 | 6968 | 1938 | 1315 | 1468 | 11.3 | 799 | 250 |
| Pentium 4 (Northwood) | 2600 | 32 | "gcc 4.2.1" -O2 -fomit-frame-pointer -march=pentium4 | 5162 | 2874 | 1163 | 707 | 654 | 6.5 | 394 | 152 |
| UltraSPARC 3 | 1593 | 64 | "gcc 3.4.4" -O2 -m64 | 3733 | 2488 | 956 | 533 | 370 | 5.1 | 286 | 180 |
| Alpha 21264 | 500 | 64 | "gcc 3.3.3" -O2 -mcpu=ev6 | 3208 | 2277 | 654 | 446 | 447 | 3.8 | 256 | 512 |
| Pentium MMX | 233 | 32 | "gcc 3.4.4" -O2 -fomit-frame-pointer | 264 | 175 | 71 | 42 | 27 | 0.38 | 21 | 90 |


GMP 4.3.x GMPbench 0.1 results


| CPU | Freq (MHz) | ABI | Compiler / compilation flags | base multiply | base divide | app rsa | GMPbench | Score/GHz |
|---|---|---|---|---|---|---|---|---|
| Opteron/Athlon64 K10 | 2300 | 64 | "gcc 3.4.3" -O2 -m64 -mtune=k8 | 81633 | 42278 | 3606 | 14554 | 6328 |
| Opteron/Athlon64 K8/K9 | 2200 | 64 | "gcc 3.4.6" -O2 -m64 -mtune=k8 | 69279 | 40081 | 3232 | 13050 | 5932 |
| Core 2 E6400 (65nm) | 2133 | 64 | "gcc 4.2.1" -O2 -m64 -mtune=k8 | 51519 | 24316 | 2314 | 9050 | 4249 |
| Pentium 4 (Nocona) | 3200 | 64 | "gcc 3.4.4" -O3 -m64 -mtune=k8 | 31259 | 16412 | 1427 | 5685 | 1777 |
| PowerPC 970 ("G5") | 1600 | 64 | "gcc 4.0.1 build 5367" -mcpu=970 -O3 | 22119 | 12198 | 916 | 3880 | 2425 |
| Alpha 21264 | | 64 | | | | | | |
| Athlon XP | | 32 | | | | | | |
| Pentium 4 Prescott | | 32 | | | | | | |
| Pentium 4 (Northwood) | 2600 | 32 | "gcc 3.4.6" -O2 -fomit-frame-pointer -march=pentium4 | 16133 | 6726 | 680 | 2661 | 1023 |
| Pentium 3 / Pentium M | | 32 | | | | | | |
| Atom | 1600 | 64 | "gcc 4.2.1" -O3 -m64 -mtune=k8 | 12471 | 6940 | 457 | 2063 | 1289 |
| UltraSPARC 3 | 1593 | 64 | | 11066 | 5942 | 370 | 1732 | |
| PowerPC 7447 ("G4") | | 32 | | | | | | |
| Alpha 21164A | | 64 | | | | | | |

Notes:
  1. The clock frequencies for the measurements above are not the same as for GMP 4.2, since we didn't have access to the same hardware. However, we have remeasured some of the 4.2 numbers and updated the table below.
  2. The last column, "Optimal", is an estimate of what could be attained by writing optimised assembly code for this processor.

GMP 4.2.x GMPbench 0.1 results


| CPU | Freq (MHz) | ABI | Compiler / compilation flags | base multiply | base divide | app rsa | GMPbench | Score/GHz |
|---|---|---|---|---|---|---|---|---|
| Opteron/Athlon64 K10 | 2300 | 64 | "gcc 3.4.3" -O2 -m64 -mtune=k8 | 43473 | 23880 | 2178 | 8377 | 3642 |
| Opteron/Athlon64 K8/K9 | 2200 | 64 | "gcc 3.4.6" -O2 -m64 -mtune=k8 | 38362 | 21621 | 1979 | 7549 | 3431 |
| Core 2 E6400 (65nm) | 2133 | 64 | "gcc 4.2.1" -O2 -m64 -mtune=k8 | 36902 | 20330 | 2092 | 7570 | 2523 |
| PowerPC 970 ("G5") | 2700 | 64 | "gcc 4.0.1 build 5367" -mcpu=970 -fast | 27740 | 16500 | 1409 | 5490 | 2033 |
| Pentium 4 | 3200 | 64 | "gcc 3.4.4" -O2 -m64 -mtune=k8 | 19425 | 10525 | 929 | 3645 | 1139 |
| Alpha 21264 | 1000 | 64 | "gcc 4.1.2" -O3 -mcpu=ev67 | 18703 | 11272 | 913 | 3641 | 3641 |
| Itanium 2 | 1600 | 64 | "gcc 4.1.1" -O3 -mtune=itanium2 | 19744 | 10340 | 799 | 3379 | 2112 |
| Athlon XP | 2083 | 32 | "gcc 4.0.2" -O2 -fomit-frame-pointer | 15682 | 7902 | 624 | 2636 | 1265 |
| Pentium 4 Prescott | 3000 | 32 | "gcc 4.0.2" -O2 -fomit-frame-pointer -march=pentium4 | 15123 | 6189 | 675 | 2556 | |
| Pentium 4 (Northwood) | 2600 | 32 | "gcc 3.4.6" -O2 -fomit-frame-pointer -march=pentium4 | 14111 | 5468 | 569 | 2236 | |
| Pentium 3 / Pentium M | 1862 | 32 | "gcc 3.4.4" -O2 -fomit-frame-pointer | 11381 | 5286 | 429 | 1824 | |
| UltraSPARC 3 | 1593 | 64 | "gcc 3.4.4" -O2 -mcpu=ultrasparc | 10597 | 5349 | 368 | 1665 | |
| HPPA 8800 | 800 | 64 | "cc B.11.X.32509-32512.GP" +DD64 +O2 | 9466 | 3631 | 385 | 1503 | |
| Atom | 1600 | 64 | "gcc 4.2.1" -O2 -m64 -mtune=k8 | 6737 | 4465 | 320 | 1325 | 828 |
| PowerPC 7447 ("G4") | 1420 | 32 | "gcc 4.1.0" -O2 -mpowerpc -mcpu=7450 | 6080 | 3479 | 247 | 1066 | |
| Alpha 21164A | 600 | 64 | "gcc 4.1.2" -O3 -mcpu=ev56 | 3964 | 2122 | 179 | 721 | |

GMP 4.1.x results


| CPU | Freq (MHz) | ABI | Compiler / compilation flags | base multiply | base divide | app rsa | GMPbench | Score/GHz |
|---|---|---|---|---|---|---|---|---|
| Opteron/Athlon64 | 2400 | 64 | "gcc 3.4.2" -O2 -mcpu=nocona -funroll-loops (NB! no asm code) | 27321 | 18280 | 1441 | 5675 | |
| PowerPC 970 ("G5") | 2500 | 64 | "gcc 3.4" -O3 | 20324 | 12874 | 1110 | 4238 | |
| Opteron/Athlon64 | 2400 | 32 | "gcc 3.3.3" -O2 -fomit-frame-pointer (NB! 32-bit only) | 19127 | 9823 | 802 | 3316 | |
| Alpha 21264 | 1000 | 64 | "gcc 2.9-gnupro-99r1" -O2 | 16813 | 10706 | 782 | 3240 | |
| Pentium 4 | 3200 | 64 | "gcc 4.0.2" -O2 -m64 -mtune=k8 (NB! No asm code) | 15613 | 9186 | 814 | 3122 | |
| Itanium 2 | 1600 | 64 | "gcc 3.4.3" -O2 (NB! Low-quality asm code) | 17046 | 9027 | 749 | 3047 | |
| Athlon XP | 2083 | 32 | "gcc 3.3.2" -O2 -fomit-frame-pointer | 14076 | 7731 | 616 | 2535 | |
| Pentium 4 (Northwood) | 2800 | 32 | "gcc 3.3.2" -O2 -fomit-frame-pointer -march=pentium4 | 13013 | 5770 | 586 | 2253 | |
| Pentium 4 Prescott | 3000 | 32 | "gcc 3.3.2" -O2 -fomit-frame-pointer -march=pentium4 | 13348 | 5393 | 574 | 2206 | |
| POWER 4 | 1100 | 64 | "gcc 3.2.1" -O2 -maix64 -mpowerpc64 -mtune=power3 | 8951 | 5920 | 478 | 1863 | |
| Pentium 3 / Pentium M | 1862 | 32 | "gcc 3.4.4" -O2 -fomit-frame-pointer | 8125 | 4712 | 393 | 1560 | |
| HPPA 8800 | 800 | 64 | "cc B.11.11.30766" +DD64 +O2 | 9040 | 3724 | 362 | 1450 | |
| UltraSPARC 3 | 1336 | 64 | "gcc 3.4.4" -O2 -m64 -mptr64 -mcpu=v9 | 6111 | 3645 | 265 | 1119 | |
| MIPS R14000 | 500 | 64 | cc 7.3.0 | 5284 | 2819 | 241 | 964 | |
| PowerPC 74x7 ("G4") | 1000 | 32 | "gcc 3.3.3" -O2 -mpowerpc | 3453 | 2203 | 165 | 676 | |
| POWER 3 | 475 | 64 | "gcc 2.9-aix51-020209" -maix64 -mpowerpc64 -O2 | 3647 | 2259 | 157 | 671 | |
| Alpha 21164A | 600 | 64 | "gcc 3.2.1" -O2 | 3514 | 2185 | 158 | 663 | |
| VIA C3 Nehemiah | 1000 | 32 | "gcc 3.4.2" -O2 -fomit-frame-pointer -march=c3-2 | 2378 | 1314 | 111 | 442 | |
| UltraSPARC 2i | 400 | 64 | "gcc 3.2.2" -O2 -mcpu=ultrasparc | 1971 | 900 | 89 | 343 | |

Notes: