64-bit vs 32-bit GMP code

Torbjörn Granlund tg at gmplib.org
Fri Feb 16 16:48:38 UTC 2018


GMP's 64-bit inner loops for 64-bit x86 processors are well-tuned in most
cases, but what's the state of the code for the 32-bit ABI?  Let's check
the key function mpn_mul_basecase.

The tables below show timings for some smallish sizes (1-24 limbs), with
the 32-bit timings to the left and the 64-bit timings to the right.  It
should be reasonably straightforward to make the 32-bit timings <= the
64-bit timings.  (Note that the 64-bit code for size k works with twice
as many bits, and basecase multiplication is quadratic; for performance
parity the 32-bit timings should be 1/4 of the 64-bit timings.)
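
To make the parity remark concrete, here is a small sketch of the
arithmetic.  The helper name parity_ratio is made up for this
illustration and is not part of GMP or its tune programs.  It scales a
32-bit timing by 4 (twice the bits, quadratic algorithm) and divides by
the 64-bit timing at the same limb count, so a result of 1.0 means
per-bit parity and larger values mean the 32-bit code is behind.

```c
/* Hypothetical helper, not part of GMP: compare a 32-bit timing with
   the 64-bit timing measured at the same limb count k.

   At size k the 64-bit code handles twice as many bits, and basecase
   multiplication is quadratic, so the 64-bit routine does about 4 times
   the work.  Parity therefore means t32 == t64 / 4, i.e. a ratio of 1.0.  */
double
parity_ratio (double t32, double t64)
{
  return 4.0 * t32 / t64;
}
```

Fed with the size 24 row of the sky table below (1245.76 and 996.47),
this gives a ratio of about 5, i.e. the 32-bit code is roughly a factor
of five away from per-bit parity on that machine.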

The lack of registers might make it hard in some cases when writing
32-bit code, but then we also have 32x32->64 SIMD instructions which
could allow 32-bit code to perform quite well.
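
For reference, the operation being timed is plain schoolbook
multiplication.  Below is a minimal portable C sketch with 32-bit limbs;
it is not GMP's code (the real routines are hand-written assembly), and
the name mul_basecase32 is invented for this example.  It only shows
where the 32x32->64 multiply sits in the inner loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine, not GMP code.
   Computes rp[0..un+vn-1] = up[0..un-1] * vp[0..vn-1], least
   significant limb first.  Assumes un >= 1, vn >= 1, and that rp does
   not overlap either input.  */
void
mul_basecase32 (uint32_t *rp, const uint32_t *up, size_t un,
                const uint32_t *vp, size_t vn)
{
  size_t i, j;

  for (i = 0; i < un + vn; i++)
    rp[i] = 0;

  for (j = 0; j < vn; j++)
    {
      uint64_t carry = 0;

      for (i = 0; i < un; i++)
        {
          /* 32x32->64 product plus two 32-bit addends: the maximum is
             (2^32-1)^2 + 2*(2^32-1) = 2^64-1, so this cannot overflow.  */
          uint64_t t = (uint64_t) up[i] * vp[j] + rp[i + j] + carry;
          rp[i + j] = (uint32_t) t;
          carry = t >> 32;
        }

      /* rp[un + j] has not been written by earlier rows, so the final
         carry of this row can simply be stored.  */
      rp[un + j] = (uint32_t) carry;
    }
}
```

A SIMD version would replace the scalar multiply in the inner loop with
a packed 32x32->64 multiply, but the carry handling then needs a
different organisation than this sketch shows.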

sky
overhead 3.83 cycles, precision 100 overhead 3.87 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		17.73		    1		     5.76
2		21.24		    2		     8.64
3		30.03		    3		    31.77
4		52.08		    4		    42.13
5		71.18		    5		    58.29
6		91.29		    6		    74.64
7	       118.18		    7		    99.14
8	       149.42		    8		   122.40
9	       185.41		    9		   154.45
10	       228.36		    10		   193.09
11	       269.21		    11		   228.16
12	       317.49		    12		   270.28
13	       372.15		    13		   310.56
14	       430.09		    14		   350.19
15	       486.97		    15		   401.69
16	       552.88		    16		   443.84
17	       623.82		    17		   497.54
18	       692.53		    18		   551.11
19	       774.26		    19		   650.64
20	       871.09		    20		   705.41
21	       944.87		    21		   768.52
22	      1048.24		    22		   858.98
23	      1140.10		    23		   944.86
24	      1245.76		    24		   996.47

bwl
overhead 3.61 cycles, precision 100 overhead 4.50 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		23.04		    1		     5.41
2		25.45		    2		     8.14
3		37.11		    3		    32.26
4		59.40		    4		    42.35
5		76.94		    5		    58.28
6		98.92		    6		    75.38
7	       124.27		    7		   101.34
8	       156.98		    8		   128.51
9	       193.90		    9		   156.83
10	       231.27		    10		   186.24
11	       273.86		    11		   226.21
12	       321.92		    12		   263.49
13	       373.56		    13		   304.67
14	       428.50		    14		   345.56
15	       485.08		    15		   391.01
16	       563.34		    16		   458.92
17	       617.93		    17		   506.08
18	       683.42		    18		   550.50
19	       759.70		    19		   645.58
20	       852.90		    20		   704.35
21	       949.14		    21		   774.62
22	      1017.56		    22		   843.40
23	      1102.98		    23		   911.01
24	      1179.95		    24		  1005.32

hwl
overhead 3.64 cycles, precision 100 overhead 4.79 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		24.07		    1		     9.08
2		26.46		    2		    15.61
3		38.34		    3		    38.99
4		59.90		    4		    49.90
5		78.41		    5		    68.69
6	       100.12		    6		    89.16
7	       125.18		    7		   117.71
8	       158.85		    8		   145.89
9	       194.14		    9		   186.32
10	       230.63		    10		   224.55
11	       273.01		    11		   265.96
12	       323.90		    12		   310.32
13	       374.37		    13		   370.06
14	       426.48		    14		   420.78
15	       483.05		    15		   478.04
16	       547.90		    16		   538.98
17	       614.29		    17		   615.87
18	       677.35		    18		   680.80
19	       752.81		    19		   752.47
20	       843.95		    20		   828.32
21	       932.43		    21		   947.38
22	      1014.50		    22		  1004.41
23	      1094.76		    23		  1091.09
24	      1178.47		    24		  1201.34

sbr
overhead 5.46 cycles, precision 100 overhead 5.45 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		22.83		    1		    10.05
2		28.41		    2		    18.29
3		41.36		    3		    43.78
4		66.51		    4		    58.41
5		86.38		    5		    83.44
6	       112.78		    6		   113.80
7	       143.03		    7		   154.55
8	       177.99		    8		   190.80
9	       217.68		    9		   249.61
10	       256.29		    10		   298.84
11	       306.04		    11		   356.44
12	       357.01		    12		   411.06
13	       408.58		    13		   500.64
14	       469.01		    14		   576.88
15	       531.21		    15		   649.64
16	       596.16		    16		   718.06
17	       674.41		    17		   832.25
18	       749.42		    18		   953.98
19	       827.11		    19		  1025.89
20	       925.26		    20		  1114.01
21	      1014.86		    21		  1256.97
22	      1102.14		    22		  1376.01
23	      1204.53		    23		  1488.91
24	      1299.31		    24		  1592.39

nhm
overhead 5.53 cycles, precision 100 overhead 5.53 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		19.35		    1		    15.39
2		26.92		    2		    24.97
3		41.33		    3		    49.90
4		74.74		    4		    72.99
5	       107.22		    5		   111.46
6	       141.13		    6		   152.27
7	       178.93		    7		   203.33
8	       249.93		    8		   260.52
9	       291.99		    9		   325.47
10	       347.75		    10		   399.40
11	       407.40		    11		   486.28
12	       471.57		    12		   568.98
13	       590.96		    13		   673.04
14	       617.92		    14		   770.43
15	       745.61		    15		   883.54
16	       813.53		    16		  1010.44
17	       867.56		    17		  1132.38
18	       974.72		    18		  1268.15
19	      1179.54		    19		  1407.04
20	      1311.75		    20		  1559.93
21	      1427.09		    21		  1714.40
22	      1508.94		    22		  1887.66
23	      1529.68		    23		  2045.38
24	      1803.39		    24		  2237.08

pnr
overhead 6.06 cycles, precision 100 overhead 6.06 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		21.23		    1		    14.16
2		36.16		    2		    25.29
3		53.57		    3		    53.60
4		92.92		    4		    83.72
5	       133.46		    5		   125.40
6	       162.15		    6		   172.76
7	       223.63		    7		   234.54
8	       266.18		    8		   295.01
9	       351.44		    9		   374.13
10	       397.70		    10		   450.58
11	       480.81		    11		   544.37
12	       573.12		    12		   639.29
13	       657.97		    13		   752.05
14	       762.29		    14		   869.50
15	       861.01		    15		   992.66
16	       968.78		    16		  1124.44
17	      1095.23		    17		  1269.71
18	      1221.56		    18		  1415.05
19	      1351.80		    19		  1581.17
20	      1527.94		    20		  1716.28
21	      1647.99		    21		  1921.19
22	      1808.22		    22		  2095.04
23	      2002.19		    23		  2298.09
24	      2144.02		    24		  2484.15

cnr
overhead 6.09 cycles, precision 100 overhead 6.09 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		21.34		    1		    14.22
2		36.25		    2		    25.63
3		53.88		    3		    53.58
4		88.84		    4		    84.40
5	       131.04		    5		   127.57
6	       164.41		    6		   172.38
7	       211.69		    7		   233.73
8	       272.97		    8		   296.48
9	       352.80		    9		   374.42
10	       402.66		    10		   457.72
11	       483.17		    11		   548.49
12	       585.57		    12		   643.79
13	       661.58		    13		   760.53
14	       775.12		    14		   861.97
15	       876.44		    15		   997.16
16	       985.18		    16		  1132.71
17	      1108.62		    17		  1275.92
18	      1236.09		    18		  1419.61
19	      1387.57		    19		  1591.41
20	      1547.17		    20		  1732.12
21	      1656.25		    21		  1930.03
22	      1825.50		    22		  2107.50
23	      2013.27		    23		  2310.75
24	      2162.06		    24		  2494.36

bay
overhead 3.03 cycles, precision 100 overhead 3.03 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		40.30		    1		    29.28
2		72.72		    2		    54.53
3	       119.20		    3		   103.99
4	       169.68		    4		   155.51
5	       248.45		    5		   277.48
6	       315.08		    6		   373.64
7	       464.57		    7		   488.52
8	       564.60		    8		   605.87
9	       678.73		    9		   756.33
10	       810.46		    10		   899.51
11	       969.54		    11		  1085.65
12	      1097.85		    12		  1254.57
13	      1254.41		    13		  1488.69
14	      1446.08		    14		  1684.82
15	      1642.15		    15		  1950.22
16	      1823.07		    16		  2177.77
17	      2023.97		    17		  2484.44
18	      2267.26		    18		  2737.17
19	      2510.57		    19		  3072.79
20	      2734.67		    20		  3359.35
21	      2979.46		    21		  3738.52
22	      3274.07		    22		  4048.76
23	      3564.82		    23		  4454.06
24	      3832.79		    24		  4798.70

zen
overhead 4.67 cycles, precision 100 overhead 4.65 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		10.65		    1		     4.65
2		20.28		    2		    12.17
3		48.77		    3		    29.74
4		68.65		    4		    40.62
5		95.49		    5		    57.68
6	       123.14		    6		    77.52
7	       161.53		    7		    99.88
8	       204.41		    8		   126.19
9	       246.08		    9		   165.96
10	       293.37		    10		   200.30
11	       421.92		    11		   239.01
12	       469.59		    12		   311.81
13	       539.07		    13		   339.11
14	       598.63		    14		   390.01
15	       593.66		    15		   464.13
16	       665.57		    16		   552.27
17	       752.32		    17		   584.10
18	       838.21		    18		   645.89
19	       929.49		    19		   686.08
20	      1039.34		    20		   760.87
21	      1115.45		    21		   852.36
22	      1224.76		    22		   921.91
23	      1353.20		    23		  1014.00
24	      1468.26		    24		  1085.29

exca
overhead 5.87 cycles, precision 100 overhead 4.01 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		13.19		    1		    12.01
2		23.01		    2		    30.82
3		56.16		    3		    71.70
4		82.38		    4		    91.54
5	       136.22		    5		   138.29
6	       186.74		    6		   193.43
7	       239.87		    7		   256.03
8	       302.59		    8		   337.80
9	       357.96		    9		   402.94
10	       408.52		    10		   493.09
11	       496.63		    11		   604.43
12	       574.63		    12		   677.93
13	       662.49		    13		   778.87
14	       754.49		    14		   891.70
15	       868.56		    15		  1013.03
16	       904.22		    16		  1231.66
17	      1020.40		    17		  1397.35
18	      1179.57		    18		  1482.60
19	      1341.73		    19		  1669.62
20	      1347.79		    20		  1833.88
21	      1501.84		    21		  2155.88
22	      1613.86		    22		  2361.74
23	      1770.83		    23		  2420.84
24	      1873.31		    24		  2800.91

pile
overhead 5.79 cycles, precision 100 overhead 5.79 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		14.03		    1		    13.12
2		29.62		    2		    27.99
3		69.50		    3		    63.21
4	       100.87		    4		    97.62
5	       149.58		    5		   136.08
6	       195.80		    6		   185.62
7	       248.57		    7		   247.54
8	       306.08		    8		   307.29
9	       370.54		    9		   401.54
10	       446.01		    10		   485.52
11	       514.41		    11		   564.89
12	       600.85		    12		   662.96
13	       690.32		    13		   797.47
14	       799.69		    14		   908.52
15	       909.04		    15		  1034.96
16	      1043.78		    16		  1174.45
17	      1144.01		    17		  1343.91
18	      1275.35		    18		  1512.24
19	      1423.85		    19		  1656.83
20	      1536.26		    20		  1819.50
21	      1691.20		    21		  2078.82
22	      1860.72		    22		  2201.40
23	      1993.33		    23		  2415.29
24	      2163.74		    24		  2566.52

tutu
overhead 5.79 cycles, precision 100 overhead 5.81 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		16.48		    1		    13.61
2		33.34		    2		    31.89
3		77.74		    3		    73.65
4	       119.47		    4		   111.27
5	       166.26		    5		   152.44
6	       214.44		    6		   191.01
7	       262.03		    7		   261.30
8	       312.70		    8		   311.57
9	       414.53		    9		   413.66
10	       465.66		    10		   492.64
11	       547.32		    11		   585.00
12	       632.07		    12		   699.12
13	       784.90		    13		   834.63
14	       850.57		    14		   960.98
15	       976.03		    15		  1062.95
16	      1074.15		    16		  1252.71
17	      1241.83		    17		  1398.59
18	      1304.93		    18		  1522.47
19	      1465.31		    19		  1677.83
20	      1581.26		    20		  1878.54
21	      1776.59		    21		  2100.60
22	      1893.79		    22		  2268.86
23	      2079.75		    23		  2416.21
24	      2281.34		    24		  2665.11

king
overhead 5.45 cycles, precision 100 overhead 5.44 cycles, precision 100
	mpn_mul_basecase	    	    mpn_mul_basecase
1		13.61		    1		    15.38
2		26.33		    2		    23.50
3		60.92		    3		    41.71
4		91.85		    4		    56.04
5	       124.10		    5		    80.80
6	       171.83		    6		   108.21
7	       223.15		    7		   140.16
8	       277.71		    8		   194.98
9	       332.43		    9		   240.07
10	       403.34		    10		   274.03
11	       480.73		    11		   323.44
12	       549.97		    12		   368.85
13	       657.86		    13		   446.64
14	       752.77		    14		   493.81
15	       845.39		    15		   560.80
16	       955.62		    16		   623.37
17	      1075.40		    17		   727.88
18	      1185.40		    18		   779.03
19	      1313.77		    19		   859.10
20	      1451.28		    20		   941.07
21	      1576.94		    21		  1077.84
22	      1727.44		    22		  1134.27
23	      1889.27		    23		  1232.65
24	      2036.82		    24		  1332.18


-- 
Torbjörn
Please encrypt, key id 0xC8601622

