[Gmp-commit] /var/hg/www: 8 new changesets
mercurial at gmplib.org
Thu Sep 26 14:17:33 CEST 2013
details: /var/hg/www/rev/eb83bb47972e
changeset: 94:eb83bb47972e
user: Torbjorn Granlund <tege at gmplib.org>
date: Mon Sep 02 12:37:00 2013 +0200
description:
Sparc updates.

details: /var/hg/www/rev/5cb272aaee9b
changeset: 95:5cb272aaee9b
user: Torbjorn Granlund <tege at gmplib.org>
date: Thu Sep 26 13:43:52 2013 +0200
description:
Warn against autoreconf.

details: /var/hg/www/rev/10782fa4611f
changeset: 96:10782fa4611f
user: Torbjorn Granlund <tege at gmplib.org>
date: Thu Sep 26 14:14:22 2013 +0200
description:
Update to reflect current systems status.

details: /var/hg/www/rev/2d92136540a8
changeset: 97:2d92136540a8
user: Torbjorn Granlund <tege at gmplib.org>
date: Thu Sep 26 14:15:28 2013 +0200
description:
Add section on basecase performance.

details: /var/hg/www/rev/93a5230483b2
changeset: 98:93a5230483b2
user: Torbjorn Granlund <tege at gmplib.org>
date: Thu Sep 26 14:15:43 2013 +0200
description:
Updates to reflect current asm status.

details: /var/hg/www/rev/e4eafe886209
changeset: 99:e4eafe886209
user: Torbjorn Granlund <tege at gmplib.org>
date: Thu Sep 26 14:16:23 2013 +0200
description:
Release plan updates.

details: /var/hg/www/rev/4958db441059
changeset: 100:4958db441059
user: Torbjorn Granlund <tege at gmplib.org>
date: Thu Sep 26 14:17:09 2013 +0200
description:
Re-measure for most systems.

details: /var/hg/www/rev/70be7067dacc
changeset: 101:70be7067dacc
user: Torbjorn Granlund <tege at gmplib.org>
date: Thu Sep 26 14:17:29 2013 +0200
description:
New project file.

diffstat:
devel/asm.html | 33 ++++----
devel/index.html | 53 ++++++++++++++-
devel/repo-usage.html | 6 +-
devel/sparc.html | 6 +-
devel/testsystems.html | 10 +-
devel/x64-64.html | 181 +++++++++++++++++++++++++++++++++++++++++++++++++
gmpbench.html | 58 ++++++++-------
index.html | 9 +-
8 files changed, 301 insertions(+), 55 deletions(-)
diffs (truncated from 570 to 300 lines):
diff -r cbb337e68558 -r 70be7067dacc devel/asm.html
--- a/devel/asm.html Fri Aug 30 14:59:14 2013 +0200
+++ b/devel/asm.html Thu Sep 26 14:17:29 2013 +0200
@@ -109,6 +109,7 @@
<tr> <td> addlsh_n <td> <td> <td> <td> <td> <td> a2.87 <td> a2.75 <td>4.2{3.5}<td>5.46{4.3}<td><strike>15</strike><td>3<td>2.8 <td> 2.75 <td> 2.78 <td> a2.67 <td> 7.75{6}<td> 4.7{4}<td> <td> <td> <td> <td> <td> <td> <td> 4 <td> <td> (1.75) <td> <td> <td>
<tr> <td> sublsh_n <td> <td> <td> <td> <td> <td>{2.5-3.25}<td>{2.5-3.25}<td> <td> <td> <td> {2.75} <td> {2.75} <td> {3} <td> <td> <td> <td> {4.125}<td> <td> <td> <td> <td> <td> <td> <td> 4 <td> <td> (1.75) <td> <td> <td>
<tr> <td> rsblsh_n <td> <td> <td> <td> <td> <td> a2.87 <td> a2.75 <td>4.2{3.5}<td>5.46{4.3}<td><strike>15</strike><td>3<td>2.8 <td> 2.75 <td> 2.78 <td> a2.67 <td> 7.75{6}<td> 4.7{4}<td> <td> <td> <td> <td> <td> <td> <td> (4.5) <td> <td> (1.75) <td> <td> <td>
+<tr> <td> lshsub_n <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td>
<tr> <td> add_n_sub_n <td> <td> <td> <td> <td> <td> [2.5] <td> [2.5] <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> (3) <td> <td> <td> <td> <td> <td> <td> (3) <td> 2.25 <td> <td> <td>
<tr> <td> rsh1add_n <td> <td> 4.5 <td> 5.25 <td> <td> <td> 2 <td> 2{1.67}<td>2.75{2.5}<td>3.25{2.7}<td>5.63 <td>3.1{2.67}<td>3.3{2.5}<td> 2.05 <td> 2.08 <td> 2.04 <td> 5.25 <td> 3 <td> (5) <td> 2.9 <td> ? <td> 2.5 <td> 2.25 <td> <td> <td> (4) <td> (3.5) <td> 1.5 <td>3.64-3.7<td> 3.72 <td> 2.5[2]
<tr> <td> rsh1sub_n <td> <td> <td> <td> <td> <td> 2 <td> 2{1.67}<td>2.75{2.5}<td>3.25{2.7}<td>5.63 <td>3.1{2.67}<td>3.3{2.5}<td> 2.05 <td> 2.08 <td> 2.04 <td> 5.25 <td> 3 <td> (5) <td> 2.9 <td> ? <td> 3.5 <td> 2.25 <td> <td> <td> (4.5) <td> (3.5) <td> 1.5 <td>3.64-3.7<td> 3.72 <td> 2.5[2]
@@ -116,9 +117,9 @@
<tr> <td> cnd_sub_n <td> 3.4 <td> 5 <td> 5.25 <td> 4.67 <td> 5.67 <td> 2 <td> 2 <td> 2.32 <td> 3 <td> 13 <td> 2.9 <td> 2.8 <td> 2.4 <td> 2.4 <td> 2.23 <td> 5.33 <td> 3 <td> <td> 2.25 <td> ? <td> 3 <td> 2 <td> <td> <td> 3 <td> <td> 1.5 <td> 3 <td> 1.78 <td> 1.78
<tr bgcolor="#e8e8e8"><td> mul_1 <td> 3.25 <td> 4 <td> 4.5 <td>4.16{3.75}<td>7.5 <td> 2.5 <td> 2.5 <td> 4.5 <td> 5 <td> 12.6 <td> 4 <td> 3.75 <td> 2.5 <td> 2.4 <td> 1.57 <td> 17.3 <td> 4.25 <td> 6 <td> 7.25 <td> 7.25 <td> 13.5(8)<td> 2.9 <td> 18.25 <td> 68 <td> 3 <td> 2.25 <td> 2{1.5}<td> 3.25 <td> 2.25[2]<td> 2.25{1.35}
<tr bgcolor="#e8e8e8"><td> mul_1c <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> N <td> N <td> N <td> Y <td> Y <td> <td> Y <td> Y <td> Y <td> Y <td> N <td> <td> <td> N <td> [Y] <td> <td> <td>
-<tr bgcolor="#e8e8e8"><td> addmul_1 <td> 3.75 <td> 5{4} <td> 5 <td>5.21{4.75}<td>8 <td> 2.5 <td> 2.5 <td>4.6-4.75<td> 5 <td> 14.9 <td> 4.25 <td> 4.5 <td> 3.25 <td> 3.07 <td> 2.31 <td> 19.37 <td> 5 <td> 9.5 <td> 8 <td> 8 <td> 12.25 <td> 3.77 <td> 17.3 <td> 74 <td>4.5(4.25)<td> 3.5 <td> 2(1.75)<td> 3.25 <td> 2 <td> 2{1.65}
-<tr bgcolor="#e8e8e8"><td> submul_1 <td> 3.75 <td> 6 <td> 6.5 <td> #5.5 <td> 8 <td> 2.5 <td> 2.5 <td>4.6-4.75<td> 5 <td> 14.9 <td> 4.25 <td> 4.5 <td> 3.25 <td> 3.07 <td> 2.31 <td> 19.37 <td> 5 <td> 10.5 <td> 8.3 <td> 8.25 <td> 12.8 <td>4.9{4.3}<td> 22.75 <td> 74 <td> 4.5 <td> 3.5 <td> 2.25(2)<td> 3.75 <td> 2.32 <td> 2.32(1.8)
-<tr> <td> mul_2 <td> <td> (4) <td> (4) <td> <td> <td> 2.25 <td> 2.25 <td> 4.36 <td> #5.62 <td> 13.5 <td> 4 <td> 3.83 <td> 2.57 <td> 2.35 <td> 1.86 <td> 17.75 <td> 4.12 <td> <td> (4.75) <td> (4.75) <td> (5.5) <td> 3 <td> <td> <td> 3.25(3)<td> (3) <td> 1.5 <td> 2.25 <td> #2.5{2}<td> #2.5{1}
+<tr bgcolor="#e8e8e8"><td> addmul_1 <td> 3.75 <td> 5{4} <td> 5 <td>5.21{4.75}<td>8 <td> 2.5 <td> 2.5 <td>4.6-4.75<td> 5 <td> 14.9 <td> 4.25 <td> 4.5 <td> 3.24 <td> 3.04 <td> 2.31 <td> 19.37 <td> 5 <td> 9.5 <td> 8 <td> 8 <td> 12.25 <td> 3.77 <td> 17.3 <td> 74 <td>4.5(4.25)<td> 3.5 <td> 2(1.75)<td> 3.25 <td> 2 <td> 2{1.65}
+<tr bgcolor="#e8e8e8"><td> submul_1 <td> 3.75 <td> 6 <td> 6.5 <td> #5.5 <td> 8 <td> 2.5 <td> 2.5 <td>4.6-4.75<td> 5 <td> 14.9 <td> 4.25 <td> 4.5 <td> 3.24 <td> 3.04 <td> 2.31 <td> 19.37 <td> 5 <td> 10.5 <td> 8.3 <td> 8.25 <td> 12.8 <td>4.9{4.3}<td> 22.75 <td> 74 <td> 4.5 <td> 3.5 <td> 2.25(2)<td> 3.75 <td> 2.32 <td> 2.32(1.8)
+<tr> <td> mul_2 <td> <td> (4) <td> (4) <td> <td> <td> 2.25 <td> 2.25 <td> 4.36 <td> #5.62 <td> 13.5 <td> 4 <td> 3.83 <td> 2.57 <td> 2.35 <td> 1.86 <td> 17.75 <td> 4.12 <td> <td> (4.75) <td> (4.75) <td> (5.5) <td> 3 <td> <td> <td> 3.25(3)<td> (2.5) <td> 1.5 <td> 2.25 <td> #2.5{2}<td> #2.5{1}
<tr> <td> mul_3 <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> [1.333]<td> <td> <td>
<tr> <td> mul_4 <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td>2.625(2.5)<td> <td> [1.25] <td> <td> <td>
<tr> <td> mul_5 <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> [1.2] <td> <td> <td>
@@ -128,19 +129,19 @@
<tr> <td> addmul_4 <td> <td> (3) <td> (3) <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> (2) <td> <td> <td> <td> <td> <td> <td> 2.75 <td> (2.31) <td>{1.3125}<td> <td> <td>
<tr> <td> addmul_6 <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> (1.167)<td> <td> <td>
<tr> <td> addmul_8 <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> (2.25) <td> <td> (1) <td> <td> <td>
-<tr bgcolor="#e8e8e8"><td> mul_basecase <td>3.9[3.75]<td> 4.6¹ <td> 5¹ <td> 5.3¹ <td> 8.9¹ <td> 2.5¹ <td> 2.5¹ <td> 4.84¹ <td> 5.25¹ <td> 15¹ <td> #4.5¹ <td> 4.41¹ <td> 3.1¹ <td> 2.8¹ <td> 2.31¹ <td>#20.5¹ <td> 4.5¹ <td> (2) <td> 8.38¹ <td> 8.3¹ <td> 13.4¹ <td> #4.02¹ <td>(8) <td> <td> <td>(2.31)<td>(1+ε)<td>* <td> * <td> *
-<tr bgcolor="#e8e8e8"><td> mullo_basecase <td> <td> <td> <td> <td> <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> <td> <td> Y <td> Y <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> * <td> * <td> *
-<tr bgcolor="#e8e8e8"><td> mulmid_basecase <td> <td> <td> <td> <td> <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> Y <td> <td> <td> Y <td> Y <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td>
+<tr bgcolor="#e8e8e8"><td> mul_basecase <td>3.9[3.75]<td> 4.6¹ <td> 5¹ <td> 5.3¹ <td> 8.9¹ <td> 2.5¹ <td> 2.5¹ <td> 4.79¹ <td> 5.25¹ <td> <td> 4.28¹ <td> 4.24¹ <td> 3.1¹ <td> 2.8¹ <td> 2.31¹ <td> <td> <td> (2) <td> 8.38¹ <td> 8.3¹ <td> 13.4¹ <td> #4.02¹ <td>(8) <td> <td> <td>(2.31)<td>(1+ε)<td>* <td> * <td> *
+<tr bgcolor="#e8e8e8"><td> mullo_basecase <td> <td> <td> <td> <td> <td> Y <td> Y <td> Y <td> Y <td> <td> Y <td> Y <td> Y <td> Y <td> Y <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> * <td> * <td> *
+<tr bgcolor="#e8e8e8"><td> mulmid_basecase <td> <td> <td> <td> <td> <td> Y <td> Y <td> Y <td> Y <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td>
<tr bgcolor="#e8e8e8"><td> mulhi_basecase <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td>
-<tr bgcolor="#e8e8e8"><td> sqr_basecase <td>3.9[3.75]<td> 5.3² <td> 5.6² <td> 6.0² <td> 9.7² <td> #3.0² <td> #3.0² <td> #5.3² <td> 5.65² <td> 15.8² <td> #5.1² <td> #4.83² <td> 3.32² <td> 3.05² <td> 2.42² <td> #21.8² <td> #4.75² <td> <td> 8.96² <td> 8.67² <td>#18.5² <td> #4.35² <td>(8) <td> <td> <td> <td>(1+ε)<td> 2.38 <td> #2.5 <td> #2.5
-<tr bgcolor="#e8e8e8"><td> sqr_diag_addlsh1<td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> 6 <td> <td> <td> <td> <td> <td> <td> ? <td> ? <td> 2 <td> <td> <td>
-<tr bgcolor="#e8e8e8"><td> redc_1 <td> <td> <td> <td> <td> <td> 2.5 <td> 2.5 <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> * <td> * <td> * <td> * <td> <td> <td> <td> <td> * <td> * <td> * <td> *
+<tr bgcolor="#e8e8e8"><td> sqr_basecase <td>3.9[3.75]<td> 5.3² <td> 5.6² <td> 6.0² <td> 9.7² <td> #3.0² <td> #3.0² <td> #5.24² <td> 5.65² <td> <td> 4.81² <td> 4.54² <td> 3.32² <td> 3.05² <td> 2.42² <td> <td> <td> <td> 8.96² <td> 8.67² <td>#18.5² <td> #4.35² <td>(8) <td> <td> <td> <td>(1+ε)<td> 2.38 <td> #2.5 <td> #2.5
+<tr bgcolor="#e8e8e8"><td> sqr_diag_addlsh1<td> <td> <td> <td> <td> <td> 2.5 <td> 2.5 <td> 3.6 <td> 4 <td> <td> 4 <td> 3.6 <td> 3.13 <td> 3.1 <td> 2.5 <td> 14 <td> 3.5 <td> 6 <td> <td> <td> <td> <td> <td> <td> 4.5? <td> 4.5 <td> 2 <td> <td> <td>
+<tr bgcolor="#e8e8e8"><td> redc_1 <td> <td> <td> <td> <td> <td> 2.5 <td> 2.5 <td> 4.87 <td> 5 <td> <td> 4.25 <td> 4.5 <td> 3.24 <td> 3.04 <td> 2.31 <td> 19.37 <td> <td> <td> * <td> * <td> * <td> * <td> <td> <td> <td> <td> * <td> * <td> * <td> *
<tr bgcolor="#e8e8e8"><td> redc_2 <td> <td> <td> <td> <td> <td> {2.375}<td> {2.375}<td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> * <td> * <td> * <td> *
-<tr> <td> lshift <td> 1.2 <td> 1.75 <td> 2 <td>1.75{1.46}<td>5 <td> 2.35 <td>1.8{1.3}<td> 1.3 <td> 3.16 <td>3.33{2.7}<td> 1.27 <td>1.375[1.25]<td> 1.3 <td> 1.3 <td> 1.17 <td> 4.5 <td> 3.25[2]<td> 2.25(1)<td> 2.33 <td> 2.25 <td> 4 <td> 2.15 <td> 2.5 <td> 17.5 <td> 3 <td> 1.75 <td> 1 <td> 3 <td>2.92(1.9}<td> 1.5{1.15}
-<tr> <td> rshift <td> 1.2 <td> 1.75 <td> 2 <td>1.75{1.46}<td>5 <td> 2.35 <td>1.8{1.3}<td> 1.3 <td> 3.16 <td>3.33{2.7}<td> 1.27 <td>1.375[1.25]<td> 1.3 <td> 1.3 <td> 1.17 <td> 4.5 <td> 3.25{2}<td> 2.25(1)<td> 2.33 <td> 2.25 <td> 3.5 <td> 2.15 <td> 2.5 <td> 17.5 <td> 3 <td> 1.75 <td> 1 <td> 3 <td>2.92{1.9}<td> 1.5{1.15}
+<tr> <td> lshift <td> 1.2 <td> 1.75 <td> 2 <td>1.75{1.46}<td>5 <td> 2.35 <td>1.8{1.3}<td> 1.3 <td> 3.16 <td>3.33{2.7}<td> 1.27 <td>1.375[1.25]<td> 1.3 <td> 1.3 <td>1.17{0.6}<td> 4.5 <td> 3.25[2]<td> 2.25(1)<td> 2.33 <td> 2.25 <td> 4 <td> 2.15 <td> 2.5 <td> 17.5 <td> 3 <td> 1.75 <td> 1 <td> 3 <td>2.92(1.9}<td> 1.5{1.15}
+<tr> <td> rshift <td> 1.2 <td> 1.75 <td> 2 <td>1.75{1.46}<td>5 <td> 2.35 <td>1.8{1.3}<td> 1.3 <td> 3.16 <td>3.33{2.7}<td> 1.27 <td>1.375[1.25]<td> 1.3 <td> 1.3 <td>1.17{0.6}<td> 4.5 <td> 3.25{2}<td> 2.25(1)<td> 2.33 <td> 2.25 <td> 3.5 <td> 2.15 <td> 2.5 <td> 17.5 <td> 3 <td> 1.75 <td> 1 <td> 3 <td>2.92{1.9}<td> 1.5{1.15}
<tr> <td> lshiftc <td> * <td> * <td> * <td> * <td> 5.5 <td> 2.75 <td> 2{1.5}<td> 1.4 <td> 3.7 <td>4.15{3.5}<td> 1.5 <td> 1.75 <td> 1.45 <td> 1.42 <td> 1.3 <td> 5 <td>3.5{2.5}<td> 2.25 <td> 2.33 <td> 2.25 <td> 4 <td> 2.15 <td> 2.67 <td> 17 <td> 3.5 <td> * <td> 1.25 <td> 3.5 <td>3.53(2.5)<td> 1.75(1.4)
-<tr> <td> copyd <td> 0.75-1 <td> #2 <td> #2 <td>0.73{0.5}<td>1.75{0.5}<td>1 <td> 1[0.85]<td> 0.7 <td> 1.48 <td>2.8[2.3]<td>0.52-0.8<td>0.52-0.64<td> 0.52 <td> 0.51 <td> 0.5 <td>1.16-1.66<td> 1.1 <td> 0.75 <td> #1 <td> 1.13 <td> 1.9{1}<td> 1.09 <td> 2.5 <td> 17 <td> 2 <td> 1 <td> 0.5 <td>1.25-1.5<td> 1.25 <td> 0.52
-<tr> <td> copyi <td> 0.75-1 <td> #1 <td> #1.5 <td>0.73{0.5}<td>1.75{0.5}<td>1 <td> 1[0.85]<td> 0.7 <td> 1.48 <td>2.8[2.3]<td>0.52-0.8<td>0.52-0.64<td> 0.54 <td> 0.51 <td> 0.5 <td>1.16-1.66<td> 1.1 <td> 0.75 <td> #1 <td> 1 <td> 2{1} <td> 1.09 <td> 2.5 <td> 17 <td> 2 <td> 1 <td> 0.5 <td>1.25-1.5<td> 1.25 <td> 0.52
+<tr> <td> copyd <td> 0.75-1 <td> #2 <td> #2 <td>0.73{0.5}<td>1.75{0.5}<td>1 <td> 1[0.85]<td> 0.7 <td> 1.48 <td>2.8[2.3]<td>0.52-0.8<td>0.52-0.64<td> 0.52 <td> 0.51 <td>0.5[0.25]<td>1.16-1.66<td> 1.1 <td> 0.75 <td> #1 <td> 1.13 <td> 1.9{1}<td> 1.09 <td> 2.5 <td> 17 <td> 2 <td> 1 <td> 0.5 <td>1.25-1.5<td> 1.25 <td> 0.52
+<tr> <td> copyi <td> 0.75-1 <td> #1 <td> #1.5 <td>0.73{0.5}<td>1.75{0.5}<td>1 <td> 1[0.85]<td> 0.7 <td> 1.48 <td>2.8[2.3]<td>0.52-0.64<td>0.52-0.71<td>0.51-0.54<td>0.51<td>0.5[0.25]<td>1.16-1.61<td> 1.1 <td> 0.75 <td> #1 <td> 1 <td> 2{1} <td> 1.09 <td> 2.5 <td> 17 <td> 2 <td> 1 <td> 0.5 <td>1.25-1.5<td> 1.25 <td> 0.52
<tr> <td> tabselect <td> 1.33 <td>2.1-2.63<td>1.7-2.57<td>1.33-1.87<td>1.85-2.7<td> 1.5 <td>0.78-.85<td>0.8-1.25<td> 2.15<td>2.5-2.95<td>1.17-1.25<td>0.87-0.9<td>0.63-0.79<td> <td> <td> 2.5 <td> 1.75 <td> 2 <td> 2 <td> ? <td> 5 <td> 1.75 <td> 3 <td> 17 <td> 2.25? <td> 1.64 <td> #2.5 <td> 1.15 <td> 2.2 <td> 0.65
<tr bgcolor="#e8e8e8"><td> com <td> 1 <td> <td> <td> <td> <td> 1.25 <td>1.18[0.85]<td> 0.9 <td> 1.75 <td>2.8[2.3]<td> 1.05 <td>1.5[0.5]<td>1.25[0.5]<td> 1.25 <td> 1 <td> 2.75 <td> 2[1.1]<td> (0.75) <td> 1.25 <td> ? <td> 1.32 <td> 1.13 <td> <td> <td> <td> 1.5 <td> (0.5) <td> 1.75 <td> 1 <td> 0.65
<tr bgcolor="#e8e8e8"><td> and_n <td> {1.5} <td> <td> <td> <td> 3 <td> 1.5 <td> 1.5\2 <td> 1.65 <td> 2.67 <td> 2.75 <td> 2 <td> 2 <td> 1.5 <td> 1.5 <td> 1.5 <td> 3.75 <td> 3 <td> 1.14 <td> 2 <td> 2 <td> 2.5 <td> 1.75 <td> <td> <td> <td> (1.75) <td> 1 <td> 2.1 <td> 1.27 <td> 1.27
@@ -168,10 +169,10 @@
<tr bgcolor="#e8e8e8"><td> mod_1s_3p <td> <td> <td> <td> <td> <td> {3} <td> {3} <td> {5.5} <td> {8} <td>{16} <td> {5.41} <td> {4.5} <td> {3} <td> <td> <td> <td> {5} <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td> <td>
<tr bgcolor="#e8e8e8"><td> mod_1s_4p <td>4.75{4.25}<td> 4 <td> 4.5 <td> 3.4 <td> 8.75 <td>3{2.75} <td>3{2.75} <td> 5.7{5}<td> 7.67 <td> 15.75 <td> 5 <td>4[3.75]<td>3.25{2.5}<td> 3.05 <td> 2.6 <td> 23 <td>4.75{4.17}<td>[6.5] <td> 9 <td> 9 <td> 13 <td> 3.5 <td> <td> <td> 4 <td> 3 <td> (2.25) <td> <td> <td>
<tr bgcolor="#e8e8e8"><td> mod_34lsub1 <td> #1 <td> 1.25 <td> 1.25 <td> #1.9 <td> 2.33 <td> 0.67 <td> 0.67 <td> 1 <td> 1.125 <td> 3.2 <td> 1.25 <td> 1.15 <td> 0.93 <td> 0.93 <td> 0.82 <td> 2.45 <td> 1.25 <td> 0.87 <td> 1.5 <td> 1.32 <td> 2.35 <td> 1 <td> <td> <td> 1.67? <td> #1.67 <td> 1 <td> 1.33{1}<td>1.33{0.92}<td> 1.33{0.59}
-<tr> <td> gcd_1 <td> 5.31/b<td> [10/b] <td> [10/b] <td> 5.09/b<td> [8.9/b]<td> 5.21/b<td> 4.30/b<td> 5.00/b<td> 6.71/b<td> 13.5/b<td> 3.83/b<td> 5.17/b<td> 4.69/b<td> 4.54/b<td> 4/b <td> 8.77/b<td> 5.44/b<td> <td> 8.5 <td> ? <td> 10.1 <td> 7.6 <td> 5.00/b<td> 11.4/b <td> 6.0/b <td> 3.4/b <td> 6.35/b<td> 5.3/b <td> 3.5/b <td> 3.5/b
+<tr> <td> gcd_1 <td> 5.31/b<td> [10/b] <td> [10/b] <td> 5.09/b<td> [8.9/b]<td> 5.21/b<td> 4.30/b<td> 5.00/b<td> 6.71/b<td> 13.5/b<td> 3.83/b<td> 5.17/b<td> 4.69/b<td> 4.54/b<td> 4/b <td> 8.77/b<td> 5.44/b<td> <td> 8.5 <td> ? <td> 10.1 <td> 7.6 <td> 5.00/b<td> 11.4/b <td> 6.0/b <td> 3.4/b <td> 5.1/b <td> 5.3/b <td> 3.5/b <td> 3.5/b
<tr> <td> invert_limb <td> 41 <td> <td> <td> <td> <td> 48 <td> 48 <td> 63 <td> 64 <td>135 <td> 69 <td> 55 <td> 44 <td> 42 <td> 42 <td>130 <td> 78 <td> 32 <td> 86 <td> 86 <td>170 <td> 66 <td> <td> <td> ? <td> 71 <td> 56 <td> 43 <td> 41 <td> 41
<tr> <td> popcount <td> 5(4) <td> 3.9 <td> 4.25 <td> #4.6 <td> 5.5 <td> 6 <td> 1.125 <td> 4.4 <td> 6.1 <td> 8 <td> 3.67{3}<td> 1.25 <td> 1.05 <td> 1.05 <td> 1 <td> 10.75 <td> 6.5{5}<td> 1.125 <td> 2.25 <td> {2.16} <td> <td> 2 <td> <td> <td> 2.5 <td> #1.5 <td> 1 <td> 1.13 <td> 5.67 <td> 0.56
-<tr> <td> hamdist <td> 6(5) <td> {5.4} <td> {5.4} <td> 6.08 <td> 8 <td> 7 <td> 2{1.5}<td> 4.5 <td> 7.5 <td>14.3{10}<td> 8(4) <td> 2{1.5}<td> 2{1.5}<td> 2 <td> 1.64 <td> 17.5(12)<td> 10.4(6)<td>(1.5) <td> (3) <td> <td> <td> 2.87 <td> <td> <td> 3.5 <td> #2.4 <td> 1 <td> 1.89 <td> 6.44 <td> 0.95
+<tr> <td> hamdist <td> 6(5) <td> {5.4} <td> {5.4} <td> 6.08 <td> 8 <td> 7 <td> 2{1.5}<td> 4.5 <td> 7.5 <td>14.3{10}<td> 8(4) <td> 2{1.5}<td> 2{1.5}<td> 2 <td> 1.64 <td>17.5(12)<td> 10.4(6)<td> (1.5) <td> (3) <td> <td> <td> 2.87 <td> <td> <td> 3.5 <td> #2.4 <td> 1 <td> 1.89 <td> 6.44 <td> 0.95
<tbody> <!-- function k7 p4-2/32 p4-3/32 dothan atom k8 k10 bulldozer bobcat p4/64 core2 nehalem sandybridge ivybridge haswell atom nano ppc/32 ppc 970 pwr 5 pwr 6 pwr 7 us3 us-t1 us-t4 alpha itanic cor-a9 cor-a15 cor-a15 -->
<tr> <th> <th> AMD<br>K7<br>32 <th> Intel<br>Nor<br>32 <th> Intel<br>Pres<br>32 <th> Intel<br>Doth<br>32 <th> Intel<br>Atom<br>32 <th> AMD<br>K8<br>64 <th> AMD<br>K10<br>64 <th> AMD<br>Bulld<br>64 <th> AMD<br>Bobc<br>64 <th> Intel<br>Noc<br>64 <th> Intel<br>Core2<br>64 <th>Intel<br>NHM<br>64 <th>Intel<br>SBR<br>64 <th>Intel<br>IBR<br>64 <th>Intel<br>HWL<br>64 <th>Intel<br>Atom<br>64 <th>VIA<br>Nano<br>64 <th> PPC<br>74x7<br>32 <th> PPC<br>970<br>64 <th> IBM<br>PWR5<br>64 <th> IBM<br>PWR6<br>64 <th> IBM<br>PWR7<br>64 <th> Sun<br>US3<br>64 <th> Sun<br>T1<br>64 <th> Sun<br>T4<br>64 <th> Alpha<br>21264<br>64 <th> Itanium<br>2<br>64 <th> ARM<br>a9 neon<br>32 <th> ARM<br>a15<br>32 <th> ARM<br>a15 neon<br>32
</table>
@@ -278,7 +279,7 @@
<br><br>
-<font size="-4">Last modified: 2013-08-28 </font>
+<font size="-4">Last modified: 2013-09-24 </font>
<div id="footer-spacer"></div>
diff -r cbb337e68558 -r 70be7067dacc devel/index.html
--- a/devel/index.html Fri Aug 30 14:59:14 2013 +0200
+++ b/devel/index.html Thu Sep 26 14:17:29 2013 +0200
@@ -77,6 +77,9 @@
<p> <a href="arm.html">List of desirable ARM improvements</a>
</p>
+<p> <a href="x64-64.html">List of desirable X86-64 improvements</a>
+</p>
+
<p> <a href="sparc.html">List of desirable SPARC (T4-T5) improvements</a>
</p>
@@ -86,6 +89,54 @@
<hr>
+<h3> Basecase performance </h3>
+
+<p> We are working to make critical basecase functions perform near-optimally
+on interesting CPUs. The current status can be seen in the diagrams below.
+The diagrams are ordered pairwise per CPU, with linear scale to the left and
+log/log scale to the right. Measured values are in cycles.
+</p>
+
+<p> Well-formed functions should perform smoothly, where the order from slowest
+to fastest should be redc_1, mul_basecase, sqr_basecase, and mullo_basecase.
+In most cases where the order is not well-formed, assembly support is missing
+or inadequate.
+</p>
+
+<p>
+The diagrams are sorted fastest-CPU-first as measured by mul_basecase, except
+that Itanium and POWER7 are at the end. The diagrams' ranges are the same for
+all CPU types, to allow some vertical comparisons.
+</p>
+
+<table rules="groups">
+<tr><th>Intel Haswell lin/lin</th> <th>Intel Haswell log/log</th>
+<tr><td> <img src="hannah.png" border="0"> </td><td> <img src="hannah-loglog.png" border="0"> </td>
+<tr><th>AMD K10/Thuban lin/lin</th> <th>AMD K10/Thuban log/log</th>
+<tr><td> <img src="shell.png" border="0"> </td><td> <img src="shell-loglog.png" border="0"> </td>
+<tr><th>Intel Ivy bridge lin/lin</th> <th>Intel Ivy bridge log/log</th>
+<tr><td> <img src="joerg.png" border="0"> </td><td> <img src="joerg-loglog.png" border="0"> </td>
+<tr><th>Intel Sandy bridge lin/lin</th> <th>Intel Sandy bridge log/log</th>
+<tr><td> <img src="tom.png" border="0"> </td><td> <img src="tom-loglog.png" border="0"> </td>
+<tr><th>Intel Nehalem lin/lin</th> <th>Intel Nehalem log/log</th>
+<tr><td> <img src="bikodeb64.png" border="0"> </td><td> <img src="bikodeb64-loglog.png" border="0"> </td>
+<tr><th>Intel Conroe lin/lin</th> <th>Intel Conroe log/log</th>
+<tr><td> <img src="repentium.png" border="0"> </td><td> <img src="repentium-loglog.png" border="0"> </td>
+<tr><th>AMD Piledriver lin/lin</th> <th>AMD Piledriver log/log</th>
+<tr><td> <img src="pile.png" border="0"> </td><td> <img src="pile-loglog.png" border="0"> </td>
+<tr><th>AMD Bulldozer lin/lin</th> <th>AMD Bulldozer log/log</th>
+<tr><td> <img src="tutu.png" border="0"> </td><td> <img src="tutu-loglog.png" border="0"> </td>
+<tr><th>AMD Bobcat lin/lin</th> <th>AMD Bobcat log/log</th>
+<tr><td> <img src="bobcat.png" border="0"> </td><td> <img src="bobcat-loglog.png" border="0"> </td>
+<tr><th>Intel Atom lin/lin</th> <th>Intel Atom log/log</th>
+<tr><td> <img src="hehe.png" border="0"> </td><td> <img src="hehe-loglog.png" border="0"> </td>
+<tr><th>Intel Itanium2 lin/lin</th> <th>Intel Itanium2 log/log</th>
+<tr><td> <img src="olympic.png" border="0"> </td><td> <img src="olympic-loglog.png" border="0"> </td>
+<tr><th>IBM POWER7-smt4 lin/lin</th> <th>IBM POWER7-smt4 log/log</th>
+<tr><td> <img src="gcc110.png" border="0"> </td><td> <img src="gcc110-loglog.png" border="0"> </td>
+</table>
+
+
<h3> Division performance anomalies (partially fixed) </h3>
<p> These diagrams show performance for fixed quotient sizes, meaning that the
@@ -439,7 +490,7 @@
</div>
<div id="footer">
-<font size="-4">Last modified: 2013-06-11 </font>
+<font size="-4">Last modified: 2013-09-25 </font>
<table cellpadding=0 width="100%" bgcolor="#e8e8e8">
<tr>
<td align="center">
diff -r cbb337e68558 -r 70be7067dacc devel/repo-usage.html
--- a/devel/repo-usage.html Fri Aug 30 14:59:14 2013 +0200
+++ b/devel/repo-usage.html Thu Sep 26 14:17:29 2013 +0200
@@ -75,6 +75,10 @@
ignore the 4 lines of warnings from <code>libtoolize</code>.
</p>
+<p>Do <b>not</b> use <code>autoreconf</code>; it will overwrite
+<code>config.guess</code> which in turn will cause any builds to be awful.
+</p>
+
<p> Now you should be able to build GMP as usually, i.e., with
<blockquote>
<code>configure OPTIONS</code> <br>
@@ -85,7 +89,7 @@
<br><br>
-<font size="-4">Last modified: 2013-02-12 </font>
+<font size="-4">Last modified: 2013-09-26 </font>
<div id="footer-spacer"></div>
diff -r cbb337e68558 -r 70be7067dacc devel/sparc.html
--- a/devel/sparc.html Fri Aug 30 14:59:14 2013 +0200
+++ b/devel/sparc.html Thu Sep 26 14:17:29 2013 +0200
@@ -66,8 +66,8 @@
</p>
<p>
The T4/T5 are completely different, and are not at all bad GMP performers; they
-are merely 2-3 times slower than a concurrent PC (using GMP repo code for
-SPARC). They are just 2-issue and can perform just one 64-bit ld/st per cycle,
+are now not much slower than a concurrent PC (using GMP repo code for SPARC).
+These CPUs are just 2-issue and can perform just one 64-bit ld/st per cycle,
but they are out-of-order and have a fully pipelined integer multiply unit,
albeit with an extreme latency of 12 cycles. Unlike older SPARCs, they (and
T3) have an instruction umulxhi for producing the upper half of a 64 × 64
@@ -139,7 +139,7 @@
</div>
<div id="footer">
-<font size="-4">Last modified: 2013-05-29 </font>
+<font size="-4">Last modified: 2013-09-02 </font>
<table cellpadding=0 width="100%" bgcolor="#e8e8e8">
<tr>
<td align="center">
diff -r cbb337e68558 -r 70be7067dacc devel/testsystems.html
--- a/devel/testsystems.html Fri Aug 30 14:59:14 2013 +0200
+++ b/devel/testsystems.html Thu Sep 26 14:17:29 2013 +0200
@@ -66,7 +66,7 @@
<tr> <td> tutu <td> x86-64 <td> FX-4100 <td> Bulldozer Zambezi <td align="center"> 4 <td align="right"> 3600 <td align="right"> 8192 <td align="center"> Y <td> fbsd
<tr> <td> shell <td> x86-64 <td> Phenom II <td> K10 Thuban <td align="center"> 6 <td align="right"> 3200 <td align="right"> 16384 <td align="center"> Y <td> fbsd
<tr> <td> bobcat <td> x86-64 <td> E-350 <td> Zacate <td align="center"> 2 <td align="right"> 1600 <td align="right"> 3072 <td align="center"> N <td> fbsd
-<tr> <td> tiger <td> x86-64 <td> Phenom 9750 <td> K10 Barcelona <td align="center"> 4 <td align="right"> 2400 <td align="right"> 8192 <td align="center"> Y <td> gnu/linux
+<tr> <td> tiger <td> x86-64 <td> Phenom 9750 <td> K10 Barcelona <td align="center"> 4 <td align="right"> 2400 <td align="right"> 8192 <td align="center"> Y <td> gnu/linux <td> offline
<tr> <td> panther <td> x86-64 <td> Athlon 64 X2 4800+ <td> K8 Brisbane <td align="center"> 2 <td align="right"> 2500 <td align="right"> 4096 <td align="center"> Y <td> fbsd
<tr> <td> hehe <td> x86-64 <td> Atom 330 <td> Diamondville <td align="center"> 2 <td align="right"> 1600 <td align="right"> 2048 <td align="center"> N <td> fbsd
<tr> <td> element <td> x86-64 <td> Pentium4-4 (Xeon) <td> Nocona <td align="center"> 2 <td align="right"> 3400 <td align="right"> 8192 <td align="center"> Y <td> fbsd
@@ -83,7 +83,7 @@
<tr> <td> ev56 <td> alpha <td> 21164A <td> EV56 <td align="center"> 1 <td align="right"> 600 <td align="right"> 384 <td align="center"> Y <td> fbsd <td>
<tbody>
<tr> <td> titanic <td> ia-64 <td> Itanium 2 <td> Mckinley <td align="center"> 2 <td align="right"> 900 <td align="right"> 2048 <td align="center"> Y <td> gnu/linux <td> disk crashed (ILO at 10.0.0.220:23)
-<tr> <td> olympic <td> ia-64 <td> Itanium 2 <td> Mckinley <td align="center"> 2 <td align="right"> 900 <td align="right"> 2048 <td align="center"> Y <td> gnu/linux <td> not always powered-on (ilo at 10.0.0.221:23)
+<tr> <td> olympic <td> ia-64 <td> Itanium 2 <td> Mckinley <td align="center"> 2 <td align="right"> 900 <td align="right"> 2048 <td align="center"> Y <td> gnu/linux <td> not always powered-on (ILO at 10.0.0.221:23)
<tbody>
<tr> <td> g5 <td> ppc64 <td> PPC-970 <td> <td align="center"> 2 <td align="right"> 1800 <td align="right"> 2048 <td align="center"> N <td> macos/darwin
<tr> <td> spigg <td> ppc32 <td> PPC-7447 <td> <td align="center"> 1 <td align="right"> 1416 <td align="right"> 512 <td align="center"> N <td> gnu/linux
@@ -111,8 +111,8 @@
<tr> <td> biko{os}32 <td> x86-32 <td> biko <td> xen <td align="center"> 1 <td align="right"> varying <td align="right"> 0.5-2<td> see hostname<sup>1</sup> <td>
<tr> <td> biko{os}64 <td> x86-64 <td> biko <td> xen <td align="center"> 1 <td align="right"> varying <td align="right"> 0.5-2<td> see hostname<sup>1</sup> <td>
<tr> <td> leg <td> arm64 <td> pile <td> foundation_v8 <td align="center"> 1 <td align="right"> 4096 <td align="right"> 85 <td> gnu/linux <td> system clock stalls when system is loaded
-<tr> <td> hwl <td> x86-64 <td> pile <td> qemu <td align="center"> 1 <td align="right"> 512 <td align="right"> 15 <td> fbsd <td> offline; supports HNI (Haswell New Instructions)
-<tr> <td> hwl-deb <td> x86-64 <td> pile <td> qemu <td align="center"> 1 <td align="right"> 512 <td align="right"> 30 <td> gnu/linux <td> offline; supports HNI (Haswell New Instructions)
+<tr> <td> hwl <td> x86-64 <td> pile <td> qemu <td align="center"> 1 <td align="right"> 512 <td align="right"> 15 <td> fbsd <td> offline, use hannah
+<tr> <td> hwl-deb <td> x86-64 <td> pile <td> qemu <td align="center"> 1 <td align="right"> 512 <td align="right"> 30 <td> gnu/linux <td> offline, use hannah
<tr> <td> kurt <td> x86-32 <td> tutu <td> kvm <td align="center"> 1 <td align="right"> 768 <td align="right"> 5 <td> gnu/hurd <td> offline
<tr> <td> mips64eb <td> mips64eb <td> pile <td> qemu <td align="center"> 1 <td align="right"> 256 <td align="right"> 90 <td> gnu/linux <td> offline
<tr> <td> mips64el <td> mips64el <td> pile <td> qemu <td align="center"> 1 <td align="right"> 256 <td align="right"> 80 <td> gnu/linux <td> offline
@@ -156,7 +156,7 @@
</div>
<div id="footer">
-<font size="-4">Last modified: 2013-08-30 </font>
+<font size="-4">Last modified: 2013-09-02 </font>
<table cellpadding=0 width="100%" bgcolor="#e8e8e8">
<tr>
<td align="center">
diff -r cbb337e68558 -r 70be7067dacc devel/x64-64.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/devel/x64-64.html Thu Sep 26 14:17:29 2013 +0200
@@ -0,0 +1,181 @@
+<!DOCTYPE HTML>
+<html>
+<head>
+ <title>GMP developers' X86-64 corner</title>
+ <link rel="shortcut icon" href="favicon.ico">
+ <link rel="stylesheet" href="new.css">
+ <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
+ <style type="text/css"> td {padding-left:4pt; padding-right:4pt;}</style>
+ <style type="text/css"> th {padding-left:4pt; padding-right:2pt;}</style>
+</head>
+
+<body>
+
+<div id="top">
+<table width="100%" bgcolor="#e8e8e8">
+ <tr>
+ <td align="left">
+ <svg width="180px" height="60px" version="1.1"
+ viewBox="0 0 1500 500"
+ xmlns="http://www.w3.org/2000/svg">
+ <rect x="0" y="0" width="1500" height="540" fill="#e8e8e8" />
+ <text x="0" y="440" fill="#e00000" font-size="540" font-family="arial" font-weight="bold">
+ GMP
+ </text>
+ <text x="50" y="500" font-size="70" font-family="Verdana">
+ «Arithmetic without limitations»
+ </text>
+ </svg>
+ </td>
+ <td align="center">
+ <font size="+2">GMP developers' X86-64 corner</font>
+ </td>
+ </tr>
+</table>
+</div>
+
+<div id="container">
+ <div id="top-spacer"></div>
+
+<br><br>
+
+
+<hr>
+
+<h3> Core pipeline overview </h3>
+
+<blockquote>
+<table rules="groups" frame="void" cellpadding=4px>
+ <colgroup><col>
+ <thead>
+ <tr> <td> <th> Conroe<br>Penryn <th> Nehalem<br>Westmere <th> Sandy bridge <th> Ivy bridge <th> Haswell
+ <tbody>
+ <tr> <td> issue width <td> 3 <td> 3 <td> 3 <td> 3 <td> 4 </tr>
+ <tr> <td> SIMD exec width <td> 128 <td> 128 <td> 128 <td> 128 <td> 256 </tr>
+</table>
+</blockquote>