[Gmp-commit] /var/hg/gmp: 6 new changesets
mercurial at gmplib.org
Sun Dec 10 00:26:56 UTC 2017
details: /var/hg/gmp/rev/bbcc3d708081
changeset: 17499:bbcc3d708081
user: Torbjorn Granlund <tg at gmplib.org>
date: Sun Dec 10 01:18:21 2017 +0100
description:
Fix comment typo.
details: /var/hg/gmp/rev/b488a90b9744
changeset: 17500:b488a90b9744
user: Torbjorn Granlund <tg at gmplib.org>
date: Sun Dec 10 01:21:44 2017 +0100
description:
New grabber file.
details: /var/hg/gmp/rev/70de4c284c68
changeset: 17501:70de4c284c68
user: Torbjorn Granlund <tg at gmplib.org>
date: Sun Dec 10 01:22:31 2017 +0100
description:
New grabber file.
details: /var/hg/gmp/rev/d2d0e7d5209e
changeset: 17502:d2d0e7d5209e
user: Torbjorn Granlund <tg at gmplib.org>
date: Sun Dec 10 01:24:27 2017 +0100
description:
Add a c/l number.
details: /var/hg/gmp/rev/a1296dc2838f
changeset: 17503:a1296dc2838f
user: Torbjorn Granlund <tg at gmplib.org>
date: Sun Dec 10 01:25:24 2017 +0100
description:
Update c/l numbers.
details: /var/hg/gmp/rev/ad1b7e6728be
changeset: 17504:ad1b7e6728be
user: Torbjorn Granlund <tg at gmplib.org>
date: Sun Dec 10 01:26:18 2017 +0100
description:
Update c/l numbers.
diffstat:
mpn/x86_64/bd1/aors_n.asm | 37 +++++++++++++++++++++++++++++++++++++
mpn/x86_64/bd4/aorrlsh_n.asm | 38 ++++++++++++++++++++++++++++++++++++++
mpn/x86_64/coreihwl/aors_n.asm | 6 +++---
mpn/x86_64/sqr_diag_addlsh1.asm | 4 ++--
mpn/x86_64/zen/aorrlsh_n.asm | 2 +-
mpn/x86_64/zen/sqr_basecase.asm | 2 +-
6 files changed, 82 insertions(+), 7 deletions(-)
diffs (141 lines):
diff -r 20cf1131dc94 -r ad1b7e6728be mpn/x86_64/bd1/aors_n.asm
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mpn/x86_64/bd1/aors_n.asm Sun Dec 10 01:26:18 2017 +0100
@@ -0,0 +1,37 @@
+dnl X86-64 mpn_add_n, mpn_sub_n, optimised for Intel Silvermont.
+
+dnl Copyright 2017 Free Software Foundation, Inc.
+
+dnl This file is part of the GNU MP Library.
+dnl
+dnl The GNU MP Library is free software; you can redistribute it and/or modify
+dnl it under the terms of either:
+dnl
+dnl * the GNU Lesser General Public License as published by the Free
+dnl Software Foundation; either version 3 of the License, or (at your
+dnl option) any later version.
+dnl
+dnl or
+dnl
+dnl * the GNU General Public License as published by the Free Software
+dnl Foundation; either version 2 of the License, or (at your option) any
+dnl later version.
+dnl
+dnl or both in parallel, as here.
+dnl
+dnl The GNU MP Library is distributed in the hope that it will be useful, but
+dnl WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+dnl or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+dnl for more details.
+dnl
+dnl You should have received copies of the GNU General Public License and the
+dnl GNU Lesser General Public License along with the GNU MP Library. If not,
+dnl see https://www.gnu.org/licenses/.
+
+include(`../config.m4')
+
+ABI_SUPPORT(DOS64)
+ABI_SUPPORT(STD64)
+
+MULFUNC_PROLOGUE(mpn_add_n mpn_add_nc mpn_sub_n mpn_sub_nc)
+include_mpn(`x86_64/coreihwl/aors_n.asm')
diff -r 20cf1131dc94 -r ad1b7e6728be mpn/x86_64/bd4/aorrlsh_n.asm
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mpn/x86_64/bd4/aorrlsh_n.asm Sun Dec 10 01:26:18 2017 +0100
@@ -0,0 +1,38 @@
+dnl X86-64 mpn_addlsh_n and mpn_rsblsh_n.
+
+dnl Copyright 2017 Free Software Foundation, Inc.
+
+dnl This file is part of the GNU MP Library.
+dnl
+dnl The GNU MP Library is free software; you can redistribute it and/or modify
+dnl it under the terms of either:
+dnl
+dnl * the GNU Lesser General Public License as published by the Free
+dnl Software Foundation; either version 3 of the License, or (at your
+dnl option) any later version.
+dnl
+dnl or
+dnl
+dnl * the GNU General Public License as published by the Free Software
+dnl Foundation; either version 2 of the License, or (at your option) any
+dnl later version.
+dnl
+dnl or both in parallel, as here.
+dnl
+dnl The GNU MP Library is distributed in the hope that it will be useful, but
+dnl WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+dnl or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+dnl for more details.
+dnl
+dnl You should have received copies of the GNU General Public License and the
+dnl GNU Lesser General Public License along with the GNU MP Library. If not,
+dnl see https://www.gnu.org/licenses/.
+
+
+include(`../config.m4')
+
+ABI_SUPPORT(DOS64)
+ABI_SUPPORT(STD64)
+
+MULFUNC_PROLOGUE(mpn_addlsh_n mpn_rsblsh_n)
+include_mpn(`x86_64/zen/aorrlsh_n.asm')
diff -r 20cf1131dc94 -r ad1b7e6728be mpn/x86_64/coreihwl/aors_n.asm
--- a/mpn/x86_64/coreihwl/aors_n.asm Thu Aug 31 01:00:02 2017 +0200
+++ b/mpn/x86_64/coreihwl/aors_n.asm Sun Dec 10 01:26:18 2017 +0100
@@ -33,10 +33,10 @@
C cycles/limb
C AMD K8,K9
C AMD K10
-C AMD bd1
-C AMD bd2
+C AMD bd1 1.5 with fluctuations
+C AMD bd2 1.5 with fluctuations
C AMD bd3
-C AMD bd4
+C AMD bd4 1.6
C AMD zen
C AMD bt1
C AMD bt2
diff -r 20cf1131dc94 -r ad1b7e6728be mpn/x86_64/sqr_diag_addlsh1.asm
--- a/mpn/x86_64/sqr_diag_addlsh1.asm Thu Aug 31 01:00:02 2017 +0200
+++ b/mpn/x86_64/sqr_diag_addlsh1.asm Sun Dec 10 01:26:18 2017 +0100
@@ -40,11 +40,11 @@
C AMD steam ?
C AMD bobcat 4
C AMD jaguar ?
-C Intel P4 ?
+C Intel P4 11.5
C Intel core 4
C Intel NHM 3.6
C Intel SBR 3.15
-C Intel IBR 3.2
+C Intel IBR 3.0
C Intel HWL 2.6
C Intel BWL ?
C Intel atom 14
diff -r 20cf1131dc94 -r ad1b7e6728be mpn/x86_64/zen/aorrlsh_n.asm
--- a/mpn/x86_64/zen/aorrlsh_n.asm Thu Aug 31 01:00:02 2017 +0200
+++ b/mpn/x86_64/zen/aorrlsh_n.asm Sun Dec 10 01:26:18 2017 +0100
@@ -36,7 +36,7 @@
C AMD bd1 n/a
C AMD bd2 n/a
C AMD bd3 n/a
-C AMD bd4 ?
+C AMD bd4 2.31
C AMD zen 1.69
C AMD bt1 n/a
C AMD bt2 n/a
diff -r 20cf1131dc94 -r ad1b7e6728be mpn/x86_64/zen/sqr_basecase.asm
--- a/mpn/x86_64/zen/sqr_basecase.asm Thu Aug 31 01:00:02 2017 +0200
+++ b/mpn/x86_64/zen/sqr_basecase.asm Sun Dec 10 01:26:18 2017 +0100
@@ -37,7 +37,7 @@
C * Update un just once in the outer loop.
C
C * Perhaps keep un and n pre-multiplied by 8, thus suppressing ",8" from
-C loads and stores. At least in some cases, the non-scaped form is faster.
+C loads and stores. At least in some cases, the non-scaled form is faster.
C
C * Optimise xit3 code, e.g., using shrx and sarx like in the main loop.
C