GMP assembly chart
This chart lists performance figures, in cycles/limb, for many of GMP's mpn (i.e., low-level) functions. A plain number without annotations means that the mpn function in that row is implemented for the CPU in that column, either in the official repository or in a maintainer's local repository. For annotated numbers, please see the table above.
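For readers who want to process the chart programmatically, a cell can be split into its plain number and its annotated number. The following is a minimal sketch and not part of GMP: the name `parse_cell` and the returned field names are made up for illustration. It deliberately does not interpret what each bracket style means, since that is defined in the annotation table this page refers to. It handles only numbers, ranges such as `5-6.6`, and one trailing annotation in `{}`, `[]` or `()`; cells such as `Y`, `?` or `5.31/b` yield `None`.

```python
import re

# A number or a low-high range, e.g. "1.5" or "5-6.6".
NUM = r"\d+(?:\.\d+)?(?:-\d+(?:\.\d+)?)?"

# Optional plain number, then an optional annotated number in {}, [] or ().
CELL = re.compile(
    rf"^\s*(?P<plain>{NUM})?\s*(?:(?P<mark>[\[{{(])(?P<annotated>{NUM})[\]}})])?\s*$"
)


def parse_cell(cell):
    """Split a chart cell into its parts; return None if it is not numeric."""
    m = CELL.match(cell)
    if m is None or (m["plain"] is None and m["annotated"] is None):
        return None
    # "mark" is the opening bracket character; its meaning is left to the legend.
    return {"plain": m["plain"], "mark": m["mark"], "annotated": m["annotated"]}


print(parse_cell("1.64{1.5}"))
print(parse_cell("5-6.6 [4.67]"))
print(parse_cell("{2.5-3.25}"))
print(parse_cell("Y"))
```

Values are kept as strings so that ranges survive unchanged; converting them to floats is left to the caller.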
For a fair comparison, compare 32-bit machines only with other 32-bit
machines, and 64-bit machines only with other 64-bit machines. Per limb, a
64-bit machine does twice the work of a 32-bit machine for many functions,
and four times the work for multiply primitives.
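As a rough illustration of these work ratios, the sketch below rescales a cycles/limb figure to the work of one 32-bit limb: a 64-bit limb counts as two 32-bit limbs for most functions and as four for multiply primitives. The function names are hypothetical, and the sample figures are the plain AMD K7 and AMD K8 entries for add_n and mul_1 from the chart below. The rescaled numbers only show why raw cross-width comparisons mislead; they do not replace comparing like with like.

```python
def work_factor(limb_bits, multiply=False):
    """Work done per limb, relative to one 32-bit limb."""
    ratio = limb_bits / 32
    # For multiply primitives the factor is squared: (64/32)^2 = 4, as stated above.
    return ratio * ratio if multiply else ratio


def cycles_per_32bit_unit(cycles_per_limb, limb_bits, multiply=False):
    """Rescale cycles/limb to cycles per 32-bit limb's worth of work."""
    return cycles_per_limb / work_factor(limb_bits, multiply)


# (function, CPU, limb bits, cycles/limb, is a multiply primitive)
samples = [
    ("add_n", "AMD K7", 32, 1.64, False),
    ("add_n", "AMD K8", 64, 1.5, False),
    ("mul_1", "AMD K7", 32, 3.25, True),
    ("mul_1", "AMD K8", 64, 2.5, True),
]
for name, cpu, bits, cl, is_mul in samples:
    print(f"{name:6} {cpu:7} {cycles_per_32bit_unit(cl, bits, is_mul):.3f}")
```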
mpn function | AMD K7 32 | Intel North 32 | Intel Presc 32 | Intel Copp 32 | Intel Doth 32 | Intel Atom 32 | AMD K8 64 | AMD K10 64 | AMD BD1 64 | AMD BD2 64 | AMD BD4 64 | AMD ZN1 64 | AMD ZN2 64 | AMD ZN3 64 | AMD ZN4 64 | AMD BT1 64 | AMD BT2 64 | Intel Nocona 64 | Intel PNR 64 | Intel NHM 64 | Intel SBR 64 | Intel IBR 64 | Intel HWL 64 | Intel BWL 64 | Intel SKL 64 | Intel RKL 64 | Intel ALD 64 | Intel Atom 64 | Intel SLM 64 | Intel GLM 64 | Intel GLM+ 64 | mpn function |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
add_n | 1.64{1.5} | 4 | 4.25 | 3.19[2.75] | 2.14 | 3 | 1.5 | 1.5 | 1.5 | 1.5 | 1.6 | 1.5 | 1.11 | 1 | 1 | 2.28 | 2 | 4 | 2 | 2 | 1.55 | 1.5 | 1.21 | 1.04 | 1.21 | 1.18 | 1 | 3 | 3 | 2.17 | 2.1 | add_n |
sub_n | 1.64{1.5} | 4 | 4.25 | 3.19[2.75] | 2.14 | 3 | 1.5 | 1.5 | 1.5 | 1.5 | 1.6 | 1.5 | 1.11 | 1 | 1 | 2.28 | 2 | 4 | 2 | 2 | 1.55 | 1.5 | 1.21 | 1.04 | 1.21 | 1.18 | 1 | 3 | 3 | 2.17 | 2.1 | sub_n |
addlsh1_n |
2.5 | 4.25 | 5 | 6 | 2 | 2{1.69} |
2.3 | 2.2 | 2 | 1.54 | 1.54 | 1.35 | 1.35 | 2.875 | 2.88[2.67] |
5.8 | 3.1 | 2.75 | 1.95 | 1.89{1.67} |
1.8{1.64} |
1.53 | 1.51 | 1.75 | 1.62 | 4.875 | 4 | 3.19 | 2.9 | addlsh1_n |
||
sublsh1_n |
2.87 | 6.667 | 2.18 | 2.18{2} |
2.3 | 2.2 | 2 | 1.58 | 1.58 | 1.35 | 1.44 | 3.25 | 3.25 | 5.8 | 3 | 3.1{2.5} |
2.47{2.17} |
2.36{2} |
2.11 | 1.67 | 1.65 | 1.68 | 1.57 | 5 | 4.8 | 3.32 | 2.7 | sublsh1_n |
||||
rsblsh1_n |
6 | 2 | 2{1.69} |
2.3 | 2.2 | 2 | 1.54 | 1.54 | 1.35 | 1.35 | 2.875 | 2.88[2.67] |
3.1 | 2.75 | 1.95 | 1.89{1.67} |
1.8{1.64} |
1.53 | 1.51 | 1.75 | 1.62 | 4.875 | 4 | 3.19 | 2.9 | rsblsh1_n |
||||||
addlsh2_n |
6 | 2.1 | 2 | 2.7 | 2.6 | 2.25 | a2 | a2 | a1.6 | 1.55 | 3.3 | 3 | 5.8 | 3.1 | 2.75 | 2 | 1.89 | 1.8 | 1.52 | 1.51 | 1.75 | 1.57 | 5.75 | 4 | 3.19 | 2.9 | addlsh2_n |
|||||
sublsh2_n |
7 | 5.8 | 3 | 3.1 | 2.47 | 2.36 | 2.11 | 1.67 | 1.65 | 1.5 | sublsh2_n |
|||||||||||||||||||||
rsblsh2_n |
6 | 2.1 | 2 | 2.7 | 2.6 | 2.25 | a2 | a2 | a1.6 | 1.55 | 3.3 | 3 | 3.1 | 2.75 | 2 | 1.89 | 1.8 | 1.52 | 1.51 | 1.75 | 1.57 | 5.75 | 4 | 3.19 | 2.9 | rsblsh2_n |
||||||
addlsh_n |
a2.87 | a2.75 | 4.2 | 3.7 | 2.33 | 1.69 | 1.58 | 1.35 | 1.35 | 5.46{4.3} |
5-6.6 [4.67] |
3 | 2.8 | 2.75 | 2.71 | 2.07 | a1.78 | a1.78 | 1.68 | 1.59 | 7.75{6} |
7.17 | 4 | 3.5 | addlsh_n |
|||||||
sublsh_n |
{2.5-3.25} |
{2.5-3.25} |
{2.75} |
{2.75} |
{3} |
sublsh_n |
||||||||||||||||||||||||||
rsblsh_n |
a2.87 | a2.75 | 4.2 | 3.7 | 2.33 | 1.69 | 1.58 | 1.35 | 1.35 | 5.46{4.3} |
5-6.6 [4.67] |
3 | 2.8 | 2.75 | 2.78 | 2.07 | a1.78 | a1.78 | 1.68 | 1.59 | 7.75{6} |
7.17 | 4 | 3.5 | rsblsh_n |
|||||||
lshsub_n |
lshsub_n |
|||||||||||||||||||||||||||||||
add_n_sub_n |
[2.5] |
[2.5] |
add_n_sub_n |
|||||||||||||||||||||||||||||
rsh1add_n |
4.5 | 5.25 | 2 | 2{1.67} |
2.75 | 2.6 | 2.21 | 1.91 | 1.95 | 1.71 | 1.6 | 3.25{2.7} |
3 | 5.63 | 3.1{2.7} |
3.3{2.77} |
2.05 | 2.03 | 2.04 | 1.6 | 1.53 | 1.52 | 1.5 | 5.25 | 5.3 | 4.3 | 3 | rsh1add_n |
||||
rsh1sub_n |
2 | 2{1.67} |
2.75 | 2.6 | 2.21 | 1.91 | 1.95 | 1.71 | 1.6 | 3.25{2.7} |
3 | 5.63 | 3.1{2.7} |
3.3{2.77} |
2.05 | 2.03 | 2.04 | 1.6 | 1.53 | 1.52 | 1.5 | 5.25 | 5.3 | 4.3 | 3 | rsh1sub_n |
||||||
cnd_add_n | 3.4 | 5 | 5.25 | 5.77{4.0} | 4.67 | 4.58 | 2[1.9] | 2[1.9] | 2.32 | 2.2 | 1.95 | 1.67 | 1.67 | 1.45 | 1.45 | 3 | 3 | 13 | 2.9 | 2.8 | 1.93 | 1.89 | 1.78 | 1.5 | 1.5 | 1.5 | 1.5 | 4.58 | 4.1{3.25} | 3 | 2.7 | cnd_add_n |
cnd_sub_n | 3.4 | 5 | 5.25 | 5.77{4.0} | 4.67 | 5.59 | 2 | 2 | 2.32 | 2.2 | 1.95 | 1.67 | 1.67 | 1.45 | 1.45 | 3 | 3 | 13 | 2.9 | 2.8 | 2.15 | 1.96 | 2 | 1.65 | 1.65 | 1.53 | 1.5 | 5 | 4.6[4.35] | 3 | 2.7 | cnd_sub_n |
mul_1 | 3.25 | 4 | 4.5 | 5.6{4.38} | 4.16{3.75} | 7.5 | 2.5 | 2.5 | 4.5 | 4.3 | 4.4 | 2 | 1.68[1.57] | 1.5 | 1.44 | 5 | 5.2-5.8 | 13.1 | 4 | 3.75 | 2.5 | 2.3{2} | 1.57 | 1.51 | 1.52 | 1.12 | 1 | 17.3 | 7.9 | 3.82 | 3.5 | mul_1 |
mul_1c |
Y | Y | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | N | N | N | Y | Y | Y | Y | mul_1c |
||
addmul_1 | 3.75 | 5{4} | 5 | 6.43{5.63} | 5.21{4.75} | 8 | 2.5 | 2.5 | 4.6-4.75 | 4.6 | {4.95} | 2 | 2.1[1.79] | 1.5 | 1.5 | 5 | 5.2 | 16 | 4.4-4.9 | 4.55 | 3.24 | 2.96{2.7} | 2.31 | 1.65 | 1.62 | 1.62 | 1.29 | 19.37 | 7.7 | 4 | 3.9 | addmul_1 |
addmul_1c |
N | N | N | N | N | N | N | N | N | N | Y | N | N | N | N | N | N | N | N | Y | N | N | addmul_1c |
|||||||||
submul_1 | 3.75 | 6 | 6.5 | 6.43{5.63} | 5.5 | 8 | 2.5 | 2.5 | 4.6-4.75 | 4.6 | {4.95} | 2 | 2.1 | 2.2[1.97] | 2.2 | 5 | 5.2 | 16 | 4.4-4.9 | 4.55 | 3.24 | 2.96{2.7} | 2.31 | 2.06 | 1.95 | 2 | 1.52 | 19.37 | 7.7 | 4 | 3.9 | submul_1 |
mul_2 |
(4) | (4) | 2.25 | 2.25 | 4.36 | 4.15 | {4.65} |
13.3 | 4 | 3.83 | 2.57 | 2.29{2.0} |
1.86 | 17.75 | 3.65 | 3.3 | mul_2 |
|||||||||||||||
mul_3 |
mul_3 |
|||||||||||||||||||||||||||||||
mul_4 |
mul_4 |
|||||||||||||||||||||||||||||||
addmul_2 |
(4) | (4) | 2.375 | 2.375 | 4.2 | 4.3 | 4.4 | 14.3 | {4} |
{4.06} |
2.93 | 2.6{2.35} |
2.15 | 18.8 | 3.85 | 3.6 | addmul_2 |
|||||||||||||||
addmul_3 |
addmul_3 |
|||||||||||||||||||||||||||||||
addmul_4 |
addmul_4 |
|||||||||||||||||||||||||||||||
mul_basecase |
3.9[3.75] |
4.6¹ | 5¹ | 6.5 | 5.3¹ | 8.9¹ | 2.5¹ | 2.5¹ | 4.79¹ | ? | 2.13¹ | 1.75¹ | 5.25¹ | 5.36¹ | 14.7¹ | 4.28¹ | 4.24¹ | 3.1¹ | 2.8¹ | 2.31¹ | 1.94¹ | 1.79¹ | Y | Y | N | 10.1¹ | 3.86¹ | 3.61¹ | mul_basecase |
|||
mullo_basecase |
Y | Y | Y | ? | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | Y | Y | mullo_basecase |
|||||||||
mulmid_basecase |
Y | Y | Y | Y | Y | mulmid_basecase |
||||||||||||||||||||||||||
mulhi_basecase |
mulhi_basecase |
|||||||||||||||||||||||||||||||
sqr_basecase |
3.9[3.75] |
5.3² | 5.6² | 8.56 | 6.0² | 9.7² | 3.0² |
3.0² |
5.24² |
? | 2.17² | 1.92² | 5.65² | 5.76² | 15.5² | 4.81² | 4.54² | 3.32² | 3.05² | 2.42² | 2.06² | 1.86² | Y | Y | N | 11.5² | 4.31² | 4.31² | sqr_basecase |
|||
sqrlo_basecase |
sqrlo_basecase |
|||||||||||||||||||||||||||||||
sqr_diag_addlsh1 |
2.5 | 2.5 | 3.6 | ? | 2.43 | 2.25 | 4 | 3.8 | 4 | 3.6 | 3.13 | 3.1 | 2.5 | 2.27 | 2.1 | 14 | 6 | 3.7 | sqr_diag_addlsh1 |
|||||||||||||
sbpi1_bdiv_r |
2 | 2.1 | ? | sbpi1_bdiv_r |
||||||||||||||||||||||||||||
redc_1 |
2.5 | 2.5 | 4.87 | ? | 5 | ? | 4.25 | 4.5 | 3.24 | 3.04 | 2.31 | 19.37 | ? | Y | Y | redc_1 |
||||||||||||||||
redc_2 |
{2.375} |
{2.375} |
redc_2 |
|||||||||||||||||||||||||||||
lshift | 1.2 | 1.75 | 2 | 1.8 | 1.75{1.46} | 5 | 2.35 | 1.8{1.3} | 1.3 | 1.2 | 1.09 | 1.5{1.41} | 1.5{0.75} | 0.6{0.375} | 0.62 | 3.16 | 1.65 | 3 | 1.32 | 1.3 | 1.3 | 1.3 | 1.17{0.62} | 1.15[0.78] | 1.15[0.46] | 0.91 | 0.86 | 4.5 | 3 | 2 | 1.5 | lshift |
rshift | 1.2 | 1.75 | 2 | 1.8 | 1.75{1.46} | 5 | 2.35 | 1.8{1.3} | 1.3 | 1.2 | 1.04 | 1.5{1.41} | 1.5{0.75} | 0.6{0.375} | 0.62 | 3.16 | 1.65 | 3.33 | 1.32 | 1.3\2.15 | 1.3 | 1.3 | 1.17{0.62} | 1.15[0.78] | 1.15[0.46] | 0.91 | 0.85 | 4.5 | 3 | 2 | 1.5 | rshift |
lshiftc |
6 | 5.5 | 2.75 | 2{1.5} |
1.4 | 1.3 | 1.09 | 1.5 | 1.5 | 0.65 | 0.7 | 3.7 | 1.9 | 3.5 | 1.52 | 1.78 | 1.45 | 1.42 | 1.3 | 1.3 | 1.27 | 1.05 | 1.02 | 5 | 3.5 | 2.3 | 1.75 | lshiftc |
||||
copyd | 0.75-1 | 2 | 2 | 0.85 | 0.73{0.5} | 1.75{0.5} | 1 | 1[0.85] | 0.7 | 0.7 | 0.53 | 0.5 | 0.5[0.25] | 0.5[0.25] | 0.48 | 1.48 | 0.64 | 2.8 | 0.52-0.8 | 0.48 | 0.52 | 0.51 | 0.5[0.25] | 0.51[0.26] | 0.5[0.25] | 0.47 | 0.28 | 1.16-1.66 | 1 | 0.8[0.75] | 0.52 | copyd |
copyi | 0.75-1 | 1 | 1.5 | 0.85 | 0.73{0.5} | 1.75{0.5} | 1 | 1[0.85] | 0.7 | 0.7 | 0.53 | 0.5 | 0.5[0.25] | 0.5[0.25] | 0.48 | 1.48 | 0.64 | 2.8 | 0.52-0.64 | 0.48 | 0.51-0.54 | 0.51 | 0.5[0.25] | 0.51[0.26] | 0.5[0.25] | 0.47 | 0.25 | 1.16-1.61 | 1 | 0.8[0.75] | 0.52 | copyi |
sec_tabselect |
1.33 | 2.1-2.63 | 1.7-2.57 | 1.75-2.25 | 1.33-1.87 | 1.85-2.7 | 1.5 | 0.78-0.85 | 0.8-1.25 | ? | 1.1 | 1 | 0.8 | 2.15 | 1.25 | 2.5-2.95 | 1.17-1.25 | 0.87-0.9 | 0.63-0.8 | 0.63-0.8 | 0.63-0.8 | 0.64 | 0.65 | 2.5 | 2.5 | 1.5 | sec_tabselect |
|||||
com |
1 | 1.25 | 1.18[0.85] |
0.9 | 0.8 | 0.63 | 0.5 | 0.5[0.25] |
0.5[0.25] |
0.48 | 1.75 | 0.91 | 2.8 | 0.64-0.87 | 0.51-0.62 | 0.51-0.65 | 0.50-0.64 | 0.51-0.58[0.3] |
0.52-0.64[0.3] |
0.51-0.63[0.3] |
0.47 | 0.35 | 2.75 | 1 | 1.06[1] |
0.75 | com |
|||||
and_n |
{1.5} |
3 | 1.5 | 1.5\2 | 1.65 | 1.5 | 1.55 | 1.5 | 1.5[1.2] |
1 | 1.08 | 2.67 | 2 | 2.75 | 2 | 2 | 1.5 | 1.5 | 1.11 | 1.09 | 1.21 | 1.18 | 0.77 | 3.75 | 3 | 2.3 | 2 | and_n |
||||
ior_n |
{1.5} |
3 | 1.5 | 1.5\2 | 1.65 | 1.5 | 1.55 | 1.5 | 1.5[1.2] |
1 | 1.08 | 2.67 | 2 | 2.75 | 2 | 2 | 1.5 | 1.5 | 1.11 | 1.09 | 1.21 | 1.18 | 0.77 | 3.75 | 3 | 2.3 | 2 | ior_n |
||||
xor_n |
{1.5} |
3 | 1.5 | 1.5\2 | 1.65 | 1.5 | 1.55 | 1.5 | 1.5[1.2] |
1 | 1.08 | 2.67 | 2 | 2.75 | 2 | 2 | 1.5 | 1.5 | 1.11 | 1.09 | 1.21 | 1.18 | 0.77 | 3.75 | 3 | 2.3 | 2 | xor_n |
||||
andn_n |
{1.75} |
3.5 | 1.5\2.5 | 1.5\2 | 1.9 | 1.9 | ? | 1.5 | 1.5[1.25] |
1.35[1.1] |
1.35 | 2.5-2.75 | 2.25 | 3.35 | 2 | 2 | 1.5 | 1.5 | 1.35 | 1.3 | 1.27 | 1.18 | 0.86 | 3.75 | 3 | 2.3 | 2 | andn_n |
||||
iorn_n |
{1.75} |
3.5 | 1.5\2.5 | 1.5\2 | 1.9 | 1.9 | 1.71 | 1.5 | 1.5[1.25] |
1.35[1.1] |
1.35 | 2.5-2.75 | 2.25 | 3.35 | 2 | 2 | 1.5 | 1.5 | 1.35 | 1.3 | 1.27 | 1.18 | 0.86 | 3.75 | 3 | 2.3 | 2 | iorn_n |
||||
xnor_n |
{1.75} |
3.5 | 1.5\2.5 | 1.5\2 | 1.9 | 1.9 | 1.71 | 1.5 | 1.5[1.25] |
1.35[1.1] |
1.35 | 2.5 | 2.25 | 3.35 | 2 | 2 | 1.5 | 1.5 | 1.35 | 1.3 | 1.27 | 1.18 | 0.86 | 3.75 | 3 | 2.3 | 2 | xnor_n |
||||
nand_n |
{1.75} |
3.5 | 1.5\1.75 | 1.5\2 | 2 | 1.9 | 1.71 | 1.5 | 1.5[1.25] |
1.2[1.1] |
1.2 | 2.5 | 2.25 | 3.6 | 2 | 2 | 1.5 | 1.5 | 1.35 | 1.3 | 1.27 | 1.18 | 0.86 | 3.75 | 3 | 2.35 | 2 | nand_n |
||||
nior_n |
{1.75} |
3.5 | 1.5\1.75 | 1.5\2 | 2 | 1.9 | 1.71 | 1.5 | 1.5[1.25] |
1.2[1.1] |
1.2 | 2.5 | 2.25 | 3.6 | 2 | 2 | 1.5 | 1.5 | 1.35 | 1.3 | 1.27 | 1.18 | 0.86 | 3.75 | 3 | 2.35 | 2 | nior_n |
||||
divrem_1int |
17[14] |
32 | 34 | 25 | 24[19] |
38[25-28] |
13 | 13 | 20-20.7 | 18-19 | ? | 12[11.5] |
12[11.5] |
10.5-11.5 | 10.4-11.1 | 17-18 | 17-18 | 44 | 24 | 19 | 14.2-14.4 | 13.7-14 | 13.7 | 11.9-12.25 | 12-12.25 | 12-12.25 | 11.4 | 46 | 22-25 | 19.5-20 | divrem_1int |
|
divrem_1frc |
15[13] |
30 | 32 | 17.3 | 17[15] |
23[22] |
12 | 12 | 18 | ? | 10.4 | ? | ? | 16 | 16 | 42 | 19 | 18 | 12.4 | 11.9 | 11.8 | 10.5 | 10.7 | 36 | 16.6 | 18-19 | divrem_1frc |
|||||
pre_divrem_1 |
Y | Y | Y | Y | Y | Y | Y | ? | ? | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | pre_divrem_1 |
||||||||||
div_qr_1u_pi1 |
div_qr_1u_pi1 |
|||||||||||||||||||||||||||||||
div_qr_1n_pi1 |
11 | 11 | 16 | ? | 16 | 15 | ? | 19.3 | 19 | 15 |
13.2 | 12 | ? | ? | 52 |
26 | div_qr_1n_pi1 |
|||||||||||||||
div_qr_1u_pi2 |
{9} |
{9} |
{13} |
? | {14} |
{?} |
{34} |
{13.5} |
{11.5} |
{9.5} |
div_qr_1u_pi2 |
|||||||||||||||||||||
div_qr_1n_pi2 |
{7.5} |
{7.5} |
{11} |
? | {13} |
{?} |
{31} |
{12.5} |
{10.5} |
{7.5} |
div_qr_1n_pi2 |
|||||||||||||||||||||
divrem_2 | 22 | 63 | 70 | 30 | 29 | 44 | 18 | 18 | 26.8 | 25 | 25 | 15.4 | 16 | 14.5 | 14.3 | 27 | 27 | 68 | 34 | 30.25 | 21.3 | 21.4 | 20.6 | 17.4 | 17.2 | 16.4 | 16.3 | 73 | 33 | 25.5 | 24 | divrem_2 |
div_qr_2n_pi2 |
{13.5} |
{13.5} |
{20} |
? | {22} |
{?} |
{47} |
{23} |
{18} |
{13.5} |
? | ? | div_qr_2n_pi2 |
|||||||||||||||||||
divexact_1 | 11 | 19 | 21 | 10.2-12 | 11 | 16-20 | 10 | 10 | 14 | 13 | 14-15 | 8.07 | 8 | 8 | 7.6 | 15 | 16 | 33 | 13.25 | 14 | 8.5 | 8.54 | 8 | 8.28 | 8.15 | 8 | 8 | 36 | 14 | 12.6 | 12 | divexact_1 |
bdiv_q_1_pi1 |
11 | 19 | 21 | 10.6-12 | 11 | 16-20 | 10 | 10 | 14 | 13 | 14-15 | 8 | 8 | 7.6 | 15[14] |
16 [14.5-15] |
33 | 13.25 | 14 | 8.5 | 8.54 | 8 | 8.28 | 8.15 | 8 | 8 | 36 | 14 |
12.6 | 12 | bdiv_q_1_pi1 |
|
bdiv_qr_1_pi2 |
[8] |
[8] |
{12} |
? | {12.4} |
{?} |
[24.7] |
[13.4] |
[12.7] |
[7] |
[7] |
[6.5] |
[6] |
[6] |
bdiv_qr_1_pi2 |
|||||||||||||||||
mode1o | 11 | 19 | 21 | 10.2 | 11 | 15 | 10 | 10 | 14 | 13 | 13.3 | 8 | 8 | 8 | 7.4 | 15 | 14 | 33 | 13 | 14.25 | 8.2 | 8.2 | 8 | 8 | 8 | 8 | 8 | 35 | 11 | 12 | 11.6 | mode1o |
diveby3 |
diveby3 |
|||||||||||||||||||||||||||||||
bdiv_dbm1c | 3.5 | 8.25 | 11.65 | 5.33 | 5 | 8 | 2.25 | 2.25 | 4.6 | 4.5 | 4.25 | 2.56 | 2.7 | 2.5 | 2.5 | 6.22 | 6.22 | 12.5 | 4 | 3.75 | 3.6 | 3.6 | 3.57 | 2.47 | 2.5 | 2.5 | 2.3 | 20 | 8 | 3.76 | 3.3 | bdiv_dbm1c |
mod_1_1p | 7 | 16 | 18 | 14.2 | 10 | 17 | 6 | 6 | 10{8.25} | 8 | 8 | 5.9 | 5.9 | 5.5 | 5.4 | 9 | 8 | 26 | 12.5{10.5} | 11{10.5} | 8.4[8] | 7.4{7} | 7.25 | 5.65 | 5.5 | 6 | 5.5 | 26 | 11.6 | 8.8 | 8.2 | mod_1_1p |
mod_1s_2p |
4 | 4 | 7{6.3} |
6.5 | 6.3 | 3 | 3 | 3.2 | 3 | 8.61 | 9.3 | 19 | 8 | 6.5{6} |
4.5{4} |
3.9{3.27} |
3.55{3.33} |
3.06 | 3{2.5} |
3.3 | 2.64 | 11.6 | 5 | 4.8 | mod_1s_2p |
|||||||
mod_1s_3p |
{3} |
{3} |
{5.5} |
? | 2.7 | 2 | ? | ? | {8} |
{?} |
{16} |
{5.41} |
{4.5} |
{3} |
{2.16} |
mod_1s_3p |
||||||||||||||||
mod_1s_4p | 4.75{4.25} | 4 | 4.5 | 7.1 | 3.4 | 8.75 | 3{2.75} | 3{2.75} | 5.7{5} | 5.3 | 5.5 | 2.55 | 2.55 | 2.4 | 2.4 | 7.67 | 7 | 15.75 | 5 | 4[3.75] | 3.25{2.5} | 3.05{2.37} | 2.6{2.125} | 2.25{1.93} | 2.15{1.87} | 2.2 | 1.5 | 23 | 10 | 4.2 | 3.5 | mod_1s_4p |
mod_34lsub1 | 1 | 1.25 | 1.25 | 2.04 | 1.9 | 2.33 | 0.67 | 0.67 | 1 | 0.88 | 0.75 | 0.64 | 0.59 | 0.5 | 0.46 | 1.125 | 1 | 3.2 | 1.25 | 1.15 | 0.93 | 0.92 | 0.82[0.29] | 0.64[0.29] | 0.6[0.28] | 0.67 | 0.69 | 2.45 | 1.59 | 1.25 | 1.25 | mod_34lsub1 |
gcd_11 | 5.31/b | [10/b] | [10/b] | 4.57 | 5.09/b | [8.9/b] | 5.2/b | 4.3/b | 5.4/b | 3.5/b | 3.97/b | 3.37/b | 3.28/b | 3.0/b | 1.8/b | 5.4/b | 3.93/b | 13.5/b | 4.63/b | 5.54/b | 4.76/b | 4.32/b | 3.99/b | 3.84/b | 3.86/b | 3.9/b | 3.7/b | 8.77/b | 6.9/b | 5.9/b | 5.8/b | gcd_11 |
gcd_22 |
8.9/b | 7.4/b | 9.7/b | 6.7/b | 6.7/b | 5.4/b | 5.5/b | 4.6/b | 4.25/b | 9.2/b | 8.9/b | 21.8/b | 8.7/b | 9.1/b | 9.1/b | 7.9/b | 7.1/b | 5.5/b | 5.6/b | 6.1/b | 6.0/b | 18.9/b | 14/b | 9.8/b | 8.8/b | gcd_22 |
||||||
invert_limb |
41 | 48 | 48 | 63 | ? | 41 | 41 | ? | 64 | 64 | 135 | 69 | 55 | 44 | 42 | 42 | 41 | 41 | 130 | 58 | 50 | invert_limb |
||||||||||
popcount | 5(4) | 3.9 | 4.25 | 6.45 | 4.6 | 5.5 | 6 | 1.125 | 1.24 | 1.2 | 1.22 | 0.72 | 0.72 | 0.62 | 0.62 | 6.1 | 1.15 | 8 | 2.61 | 1.04 | 1.02 | 1 | 1.0[0.68] | 1.0[0.68] | 1.0[0.63] | 1 | 1 | 10.75 | 1.34 | 1.3 | 1 | popcount |
hamdist | 6(5) | {5.4} | {5.4} | 6.67 | 6.08 | 8 | 7 | 2{1.5} | 1.51 | 1.5 | 1.5 | 1.15 | 1.15 | 1 | 1 | 7.5 | 2.5 | 14.3{10} | 3.28 | 2.03 | 1.66 | 1.62 | 1.5 | 1.5 | 1.5 | 1.3 | 1 | 17.5(12) | 2.55 | 2.37 | 1.9 | hamdist |

mpn function | PPC 74x7 32 | PPC 970 64 | IBM PWR5 64 | IBM PWR6 64 | IBM PWR7 64 | IBM PWR8 64 | IBM PWR9 64 | IBM PWR10 64 | Sun US3 64 | Sun T1 64 | Sun T4 64 | Alpha 21264 64 | Itanium 2 64 | ARM a5 neon 32 | ARM a7 neon 32 | ARM a8 neon 32 | ARM a9 neon 32 | ARM a15 32 | ARM a15 neon 32 | ARM a17 neon 32 | ARM a53 64 | ARM a55 64 | ARM a57 64 | ARM a72 64 | ARM a73 64 | ARM a76 64 | ARM X‑Gene 64 | ARM Apple M1 64 | Zarch IBM z15 64 | mpn function |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
add_n | 4 | 2 | 2.25 | 2.63 | 2.18 | 2.14 | 2.13 | 1.25 | 4.5 | 17 | 3 | 2.125 | 1.25 | 4.33 | 2.75-3.5 | 3.25 | 2.5 | 1.27 | 1.27 | 3.27 | 2.75-3.25 | 2.75-3.5 | 1.5 | 1.54 | 1.75 | 1.17 | 2 | 1 | 1.1-1.7 | add_n |
sub_n | 4 | 2 | 2.25 | 2.63 | 2.18 | 2.14 | 2.13 | 1.25 | 4.5 | 17 | 3 | 2.125 | 1.25 | 4.33 | 2.75-3.5 | 3.25 | 2.5 | 1.27 | 1.27 | 3.27 | 2.75-3.25 | 2.75-3.5 | 1.5 | 1.55 | 1.75 | 1.17 | 2 | 1 | 1.1-1.7 | sub_n |
addlsh1_n |
5 | 3 | 2.9 | 3.5 | 2.45 | 3.77 |
2.25 | 1.71 | 21 | (3.25) | 4 | 1.5 | 5.75 | 5 | 4.35 | 3.11 | a3.36 | a2.25 | 3.27 | 3.25-3.75 | 3.25-4 | 2.18[1.9] |
2.54[1.9] |
2.2 | 1.77 | 2.5 | 1 | 1.45-1.75 | addlsh1_n |
|
sublsh1_n |
5 | 3 | 2.9 | 3.5 | 2.45 | 3.77 |
2.25 | 1.71 | 21 | (3.75) | 4 | 1.5 | 6.5 | 5.7 | 5 | 3.7 | a3.69 | 2.25 |
3.27 | 3.25-3.75 | 3.25-4 | 2.18[1.9] |
2.54[1.9] |
2.2 | 1.77 | 2.5 | 1 | 1.45-1.75 | sublsh1_n |
|
rsblsh1_n |
[5] |
3 | 2.9 | 3.5 | 2.45 | 3.77 |
2.25 | 1.71 | 21 | (3.75) | 1.5 | 2.25 |
3.27 | 3.25-3.75 | 3.25-4 | 2.18[1.9] |
2.54[1.9] |
2.2 | 1.77 | 2.5 | 1 | 1.45-1.75 | rsblsh1_n |
|||||||
addlsh2_n |
[5] |
3 | 2.9 | 3.5 | 2.45 | 3.77 |
2.25 | 1.71 | 21 | 3.75 | 1.5 | a2.25 | 3.27 | 3.25-3.75 | 3.25-4 | 2.18[1.9] |
2.54[1.9] |
2.2 | 1.77 | 2.5 | 1 | 1.45-1.75 | addlsh2_n |
|||||||
sublsh2_n |
[5] |
3 | 2.9 | 3.5 | 2.45 | 3.77 |
2.25 | 1.71 | 21 | 3.75 | 1.5 | 2.25 |
3.27 | 3.25-3.75 | 3.25-4 | 2.18[1.9] |
2.54[1.9] |
2.2 | 1.77 | 2.5 | 1 | 1.45-1.75 | sublsh2_n |
|||||||
rsblsh2_n |
[5] |
3 | 2.9 | 3.5 | 2.45 | 3.77 |
2.25 | 1.71 | 21 | 1.5 | 2.25 |
3.27 | 3.25-3.75 | 3.25-4 | 2.18[1.9] |
2.54[1.9] |
2.2 | 1.77 | 2.5 | 1 | 1.45-1.75 | rsblsh2_n |
||||||||
addlsh_n |
4 | (1.75) | addlsh_n |
|||||||||||||||||||||||||||
sublsh_n |
4 | (1.75) | sublsh_n |
|||||||||||||||||||||||||||
rsblsh_n |
(4.5) | (1.75) | rsblsh_n |
|||||||||||||||||||||||||||
lshsub_n |
lshsub_n |
|||||||||||||||||||||||||||||
add_n_sub_n |
(3) | 2.25 | 1.85 | (3) | 2.25 | [1.3] |
1.75-2 | add_n_sub_n |
||||||||||||||||||||||
rsh1add_n |
(5) | 2.9 | ? | 3.5 | 2.25 | 4 |
2.09 | 1.74 | (4) | (3.5) | 1.5 | 6.75 | 5.35 | 6.4[4.75] |
3.64-3.7 | 3.72 | 2.5[2] |
4.27 | 3.25-3.75 | 3.25-4 | 2.15[1.9] |
2.3[1.9] |
2.2 | 1.70 | 2.75 | 1 | 1.4-1.7 | rsh1add_n |
||
rsh1sub_n |
(5) | 2.9 | ? | 3.5 | 2.25 | 4 |
2.09 | 1.74 | (4.5) | (3.5) | 1.5 | 6.75 | 5.35 | 6.4[4.75] |
3.64-3.7 | 3.72 | 2.5[2] |
4.27 | 3.25-3.75 | 3.25-4 | 2.15[1.9] |
2.3[1.9] |
2.2 | 1.70 | 2.75 | 1 | 1.4-1.7 | rsh1sub_n |
||
cnd_add_n |
2.25 | ? | 3 | 2 | 2.84[2.25] |
2.07 | 1.58 | 3 | 1.5 | 5.5 | 3.86-4.5 | 3.75 | 3 | 1.78 | 1.78 | 3.27 | 3.5-4 | 3.5-4 | 1.75 | 1.82 | 2.1 | 1.29 | 2 | 1 | 1.45-1.75 | cnd_add_n |
||||
cnd_sub_n |
2.25 | ? | 3 | 2 | 2.84[2.25] |
2.07 | 1.58 | 3 | 1.5 | 5.5 | 3.86-4.5 | 3.75 | 3 | 1.78 | 1.78 | 3.27 | 3.5-4 | 3.5-4 | 1.75 | 1.82 | 2.1 | 1.29 | 2 | 1 | 1.45-1.75 | cnd_sub_n |
||||
mul_1 | 6 | 7.25 | 7.25 | 13.5 | 2.9 | 3.53 | 2.47 | 1.5 | 18.25 | 68 | 3 | 2.25 | 2{1.5} | 5 | 5.25[3.75] | 7[5] | 3.25 | 2.25[2] | 2.25{1.35} | 3.27 | 7.5-8 | 7.5-8 | 7 | 7 | 6 | 7 | 4 | 1 | 2.25 | mul_1 |
mul_1c |
Y | Y | Y | Y | Y | Y | Y | N | N | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | mul_1c |
|||||||||
addmul_1 | 9.5 | 8 | 8 | 12.25 | 3.77 | 4.15 | 2.5 | 1.5 | 17.3 | 74 | 4.5(4.25) | 3.5 | 2(1.75) | 5 | 5.25 | 7 | 3.25 | 2 | 2{1.65} | 3.27 | 9.3-9.8 | 9.25-9.75 | 7 | 7 | 6 | 7 | 5 | 1.25 | 2.35 | addmul_1 |
addmul_1c |
Y | Y | Y | addmul_1c |
||||||||||||||||||||||||||
submul_1 | 10.5 | 8.3 | 8.25 | 12.8 | 4.9{4.3} | 4.8 | 2.63 | 1.9 | 22.75 | 74 | 4.5 | 3.5 | 2.25(2) | 6.25 | 6.5 | 7 | 3.75 | 2.32 | 2.32(1.8) | 3.5 | 9.3-9.8 | 9.25-9.75 | 7 | 7 | 6 | 7 | 5 | 1.25 | 2.35 | submul_1 |
mul_2 |
(4.75) | (4.75) | 3 | 3.85 | 1.58 | 1 | 3.25(3) | (2.5) | 1.5 | 3.63 | 3.15 | 5 |
2.25 | 2.5{2} |
2.5{1} |
2.13 | 2.1 | mul_2 |
||||||||||||
mul_3 |
[1.333] |
mul_3 |
||||||||||||||||||||||||||||
mul_4 |
2.625(2.5) | mul_4 |
||||||||||||||||||||||||||||
addmul_2 |
(4.75) | (4.75) | 3 | a4.27 | 1.63 | 1 | 10.25 | 3.75(3.5) | (3) | 1.625 | 3.63 | 3.65 | 4 | 2.25 | 2.5{2} |
2.5{1.3} |
2.13 | 2.2 | addmul_2 |
|||||||||||
addmul_3 |
(4) | (4) | (3) | {1.42} |
3.28 | 3.25 | 3.18 | 2.1 | 2 | a2 | 2.11 | addmul_3 |
||||||||||||||||||
addmul_4 |
2.75 | addmul_4 |
||||||||||||||||||||||||||||
mul_basecase |
(2) | 8.38¹ | 8.3¹ | 13.4¹ | 4.02¹ |
4.35 | 1.75¹ | Y | (8) | (2.31) | 1.95(1+ε) | Y |
mul_basecase |
|||||||||||||||||
mullo_basecase |
mullo_basecase |
|||||||||||||||||||||||||||||
mulmid_basecase |
mulmid_basecase |
|||||||||||||||||||||||||||||
mulhi_basecase |
mulhi_basecase |
|||||||||||||||||||||||||||||
sqr_basecase |
8.96² | 8.67² | 18.5² | 4.35² |
5 | 1.86² | Y | (8) | 2.12(1+ε) | 4.37² | 4.96² | 2.38 | 2.5 |
2.5 |
1.41² | Y |
sqr_basecase |
|||||||||||||
sqrlo_basecase |
sqrlo_basecase |
|||||||||||||||||||||||||||||
sqr_diag_addlsh1 |
6 | 4.5? | 4.5 | 2 | 5.65 | 3.5 | 3.53 | 3.54 | 3.38 | 1 | sqr_diag_addlsh1 |
|||||||||||||||||||
sbpi1_bdiv_r |
sbpi1_bdiv_r |
|||||||||||||||||||||||||||||
redc_1 |
redc_1 |
|||||||||||||||||||||||||||||
redc_2 |
redc_2 |
|||||||||||||||||||||||||||||
lshift | 2.25(1) | 2.33 | 2.25 | 4 | 2.15 | 1.67 | 1.53 | 1.16 | 2.5 | 17.5 | 3 | 1.75 | 1 | 3.5 | 2.75 | 2.5 | 3 | 2.92{1.9} | 1.5{1.15} | 1.53 | 3.5-4[2.46] | 4 | 2[1.5] | 2.04 | 2.25 | 1.45 | 2.67[1.5] | 0.75{0.5} | 1.25 | lshift |
rshift | 2.25(1) | 2.33 | 2.25 | 3.5 | 2.15 | 1.67 | 1.48 | 1.16 | 2.5 | 17.5 | 3 | 1.75 | 1 | 3.5 | 2.75 | 2.5 | 3 | 2.92{1.9} | 1.5{1.15} | 1.53 | 3.5-4[2.46] | 4 | 2[1.5] | 2.04 | 2.25 | 1.45 | 2.67[1.5] | 0.75{0.5} | 1.25 | rshift |
lshiftc |
2.25 | 2.33 | 2.25 | 4 | 2.15 | 1.67 | 1.49 | 1.16 | 2.67 | 17 | 3.5 | 1.25 | 4 | 3.75 | 2.75 | 3.5 | 3.53(2.5) | 1.75(1.4) | 1.78 | 3.5-4 | 4 | 2 | 2.04 | 2.25 | 1.45 | 2.67 | 0.75{0.5} |
1.25 | lshiftc |
|
copyd | 0.75 | 1 | 1.13 | 1.9 | 1.09 | 1 | 1 | 0.42 | 2.5 | 17 | 2 | 1 | 0.5 | 2.5 | 1.5-2 | 1.75[1] | 1.25-1.5 | 1.25 | 0.52 | 0.90 | 1.8 | 1.33 | 1 | 1 | 1.1-1.35 | 0.53 | 1 | 0.31 | 0.62 | copyd |
copyi | 0.75 | 1 | 1 | 2 | 1.09 | 1 | 1 | 0.42 | 2.5 | 17 | 2 | 1 | 0.5 | 2.5 | 1.5-2 | 1.75[1] | 1.25-1.5 | 1.25 | 0.52 | 0.90 | 1.8 | 1.33 | 1 | 1 | 1.1-1.35 | 0.53 | 1 | 0.31 | 0.55 | copyi |
sec_tabselect |
2 | 2 | ? | 5 | 1.37 | 1.37 | ? | 3 | 17 | 2.25? | 1.64 | 2.5 |
1.15 | 2.2 | 0.65 | 2.25 | 1.33 | 1.28 | 1.35 | 2 | 0.32 | 0.55 | sec_tabselect |
|||||||
com |
(0.75) | 1.25 | ? | 1.32 | 1.13 | 1.04 | 1.01 | 0.61 | 1.5 | (0.5) | 3.25 | 2.5-3 | 2.25[1.25] |
1.75 | 1 | 0.65 | 0.77 | 2.25 | 2.27 | 1.25 | 1.2 | 1.25 | 0.84 | 1.75 | 0.5{0.31} |
0.55 | com |
|||
and_n |
1.14 | 2 | 2 | 2.5 | 1.75 | 1.75 | 1.47 | 1 | (1.75) | 1 | 4.25 | 3-3.75 | 3 | 2.1 | 1.27 | 1.27 | 1.84 | 2.75-3.25 | 2.75-3.5 | 1.5 | 1.54 | 1.9 | 1.1 | 2 | 0.57 | 0.75-1.35 | and_n |
|||
ior_n |
1.14 | 2 | 2 | 2.5 | 1.75 | 1.75 | 1.47 | 1 | (1.75) | 1 | 4.25 | 3-3.75 | 3 | 2.1 | 1.27 | 1.27 | 1.84 | 2.75-3.25 | 2.75-3.5 | 1.5 | 1.54 | 1.9 | 1.1 | 2 | 0.57 | 0.75-1.35 | ior_n |
|||
xor_n |
1.14 | 2 | 2 | 2.5 | 1.75 | 1.75 | 1.47 | 1 | (1.75) | 1 | 4.25 | 3-3.75 | 3 | 2.1 | 1.27 | 1.27 | 1.84 | 2.75-3.25 | 2.75-3.5 | 1.5 | 1.54 | 1.9 | 1.1 | 2 | 0.57 | 0.75-1.35 | xor_n |
|||
andn_n |
1.14 | 2 | 2 | 2.5 | 1.75 | 1.75 | 1.47 | 1 | (1.75) | 1 | 4.25 | 3-3.75 | 3 | 2.1 | 1.27 | 1.27 | 1.84 | 2.75-3.25 | 2.75-3.5 | 1.5 | 1.54 | 1.9 | 1.1 | 2 | 0.57 | 0.75-1.35 | andn_n |
|||
iorn_n |
1.39 | 2 | 2 | 2.5 | 1.75 | 1.75 | 1.47 | 1 | (1.75) | 1 | 5.25 | 4-4.75 | 3.5 | 2.1 | 1.64 | 1.64 | 2.34 | 2.75-3.25 | 2.75-3.5 | 1.5 | 1.54 | 2 | 1.1 | 2 | 0.57 | 0.75-1.35 | iorn_n |
|||
xnor_n |
1.39 | 2 | 2 | 2.5 | 1.75 | 1.75 | 1.47 | 1 | (1.75) | 1 | 5.25 | 4-4.75 | 3.5 | 2.6 | 1.64 | 1.64 | 2.34 | 2.75-3.25 | 2.75-3.5 | 1.5 | 1.54 | 2 | 1.1 | 2 | 0.57 | 0.75-1.35 | xnor_n |
|||
nand_n |
1.39 | 2 | 2 | 2.5 | 1.75 | 1.75 | 1.47 | 1 | (2) | 1 | 5.25 | 4-4.75 | 3.5 | 2.6 | 1.64 | 1.64 | 2.34 | 3.25-3.75 | 3.25-4 | 2 | 2.04 | 2.2 | 1.33 | 2.09 | 0.69 | 0.75-1.35 | nand_n |
|||
nior_n |
1.14 | 2 | 2 | 2.5 | 1.75 | 1.75 | 1.47 | 1 | (2) | 1 | 5.25 | 4-4.75 | 3.5 | 2.6 | 1.64 | 1.64 | 2.34 | 3.25-3.75 | 3.25-4 | 2 | 2.04 | 2.2 | 1.33 | 2.09 | 0.69 | 0.75-1.35 | nior_n |
|||
divrem_1int |
[21] |
29 | 29 | 58 | 25 | 25-30 | 24-25 | [22] |
30[22] |
13.5-14.5 | 12 | 18-19 | 13-14 | 11.4-11.8 | 11.4-11.8 | 11-14 | 19 | 10 | divrem_1int |
|||||||||||
divrem_1frc |
[7] |
19 | 19 | 41 | 14 | 14.8 | 14.8 | [18] |
30[22] |
12.75 | 11 | 18 | 13 | 11 | 11 | ? | divrem_1frc |
|||||||||||||
pre_divrem_1 |
Y | Y | Y | Y | Y | Y | pre_divrem_1 |
|||||||||||||||||||||||
div_qr_1u_pi1 |
div_qr_1u_pi1 |
|||||||||||||||||||||||||||||
div_qr_1n_pi1 |
div_qr_1n_pi1 |
|||||||||||||||||||||||||||||
div_qr_1u_pi2 |
div_qr_1u_pi2 |
|||||||||||||||||||||||||||||
div_qr_1n_pi2 |
{22} |
{23.5} |
[16] |
div_qr_1n_pi2 |
||||||||||||||||||||||||||
divrem_2 |
29 | 40 | 37 | 62 | 30.5 | 31.3 | 30.8 | 29 | 29 | divrem_2 |
||||||||||||||||||||
div_qr_2n_pi2 |
div_qr_2n_pi2 |
|||||||||||||||||||||||||||||
divexact_1 |
[6-8] |
16 | 16 | 46 | 12 | 12 | 12 | 26 | 15 | 8 | 8.5-12.5 | 8-12 | 13-14 | 9-10 | 7 | 7 | 8-9 | divexact_1 |
||||||||||||
bdiv_q_1_pi1 |
[6-8] |
16 | 16 | 46 | 12 | 12 | ? | 26 | 15 | 8 | 8.5-12.5 | 8-12 | 13-14 | 9-10 | 7 | 7 | 8-9 | 12-15 | 12-15 | 12 | 12 | 12 | 10 | 11 | 7 | bdiv_q_1_pi1 |
||||
bdiv_qr_1_pi2 |
bdiv_qr_1_pi2 |
|||||||||||||||||||||||||||||
mode1o |
8-10 |
16 | 16 | 35 | 12 | 12 | 12.3 | 26 | 15 | 8 | 7.75 | 7 | 13 | 9 | 7 | 7 | 7 | mode1o |
||||||||||||
diveby3 |
6 | diveby3 |
||||||||||||||||||||||||||||
bdiv_dbm1c |
6.25 | 8.25 | 8.63 | 15 | 4.7 | 4.7 | 4.49 | 2.9 | 4 | 3 | 2 | 5 | 5.25 | 5.25 | 4.25 | 2.5 | 2.5 | 5 | 8 | 8.25 | 7 | 7 | 6 | 7 | 4.25 | 2 | 5 | bdiv_dbm1c |
||
mod_1_1p |
17 | 16 | 30 | 10.2 | 10.1 | 10.6 | 7.7 | (9) | 10 | 8 | 9 | 7 | 6 | 6 | 10 | mod_1_1p |
||||||||||||||
mod_1s_2p |
(4.5) | 4.75 | 4 | 9 | 4.25 | 3 | 3 | 4.86 | mod_1s_2p |
|||||||||||||||||||||
mod_1s_3p |
mod_1s_3p |
|||||||||||||||||||||||||||||
mod_1s_4p |
[6.5] |
9 | 9 | 13 | 3.5 | 4 | 2.84 | 2 | 4 | 3 | (2.25) | mod_1s_4p |
||||||||||||||||||
mod_34lsub1 |
0.87 | 1.5 | 1.32 | 2.35 | 1 | 1.67 | 0.86 | 0.78 | 1.67? | 1.67 |
1 | 2.5 | 2.33 | 2 | 1.33{1} |
1.33{0.92} |
1.33{0.59} |
1.6 | 2[1.67] |
2[1.67] |
1 | 1.03[0.79] |
1.35[1.0] |
1 | 1.45[1.17] |
1[0.43] |
1.3{0.55} |
mod_34lsub1 |
||
gcd_11 |
8.5/b | ? | 10.1/b | 7.6/b | 11.3/b | 5.75/b | 4.3/b | 5/b | 11.4/b | 6/b | 3.4/b | 4.5/b | 5.2/b | 5.0/b | 3.59/b | ?/b | 3.05/b | 3.05/b | 5.25/b | 3.62/b | 3.7/b | 2.91/b | 2.85/b | 3.59/b | 2.7/b | 5.1/b | 3.0/b | 10.5 |
gcd_11 |
|
gcd_22 |
12.3/b | 13.4/b | 9.6/b | 8.1/b | 10.1/b | 9.1/b | 6.3/b | ?/b | 5.7/b | 5.7/b | 7.7/b | 7.3/b | 7.2/b | 5.7/b | 5.7/b | 6.4/b | 5.2/b | 9.2/b | 4.5/b | 12 | gcd_22 |
|||||||||
invert_limb |
32 | 86 | 86 | 170 | 66 | ? | ? | ? | 71 | 56 | 41 | 42 | 56 | 43 | 41 | 41 | 43 | 56 | 57 | 57 | 57 | 63 | 40 | invert_limb |
||||||
popcount |
1.125 | 2.25 | {2.16} |
2 | 2.9 | 1.57 | 0.86 | 2.5 | 1.5 |
1 | 2.5 | 2.15-2.64 | 1 | 1.13 | 5.67 | 0.56 | 0.75 | 2.5 | 2.5 | 1.14 | 1.17 | 1.25 | 0.5 | 3 | 0.5 | 0.66 | popcount |
|||
hamdist |
(1.5) | (3) | 2.87 | 3.1 | 1.91 | 1.34 | 3.5 | 2.4 |
1 | 4.25 | 3.15-4.15 | 1.5 | 1.89 | 6.44 | 0.95 | 1.3 | 4.5 | 4 | 1.9 | 1.94 | 1.85 | 0.94 | 4.36 | 0.6 | 1 | hamdist |
||||
¹ This value is for sizes around MUL_TOOM22_THRESHOLD, since mpn_mul_basecase is in most cases not used above that.
² This value is for sizes around SQR_TOOM2_THRESHOLD, since mpn_sqr_basecase is never used above that.
† Obsolete function that will be replaced in a future GMP release.