[Gmp-commit] /var/hg/gmp: 8 new changesets
mercurial at gmplib.org
Wed Nov 7 08:52:39 UTC 2018
details: /var/hg/gmp/rev/aa607adf0feb
changeset: 17658:aa607adf0feb
user: Torbjorn Granlund <tg at gmplib.org>
date: Wed Nov 07 09:38:44 2018 +0100
description:
Provide some arm a12/a17 optimised files.
details: /var/hg/gmp/rev/8de02962ea15
changeset: 17659:8de02962ea15
user: Torbjorn Granlund <tg at gmplib.org>
date: Wed Nov 07 09:40:46 2018 +0100
description:
(ASM_START): Provide local definition.
* mpn/arm/arm-defs.m4 (ASM_START): Provide local definition.
details: /var/hg/gmp/rev/879c623b6bf0
changeset: 17660:879c623b6bf0
user: Torbjorn Granlund <tg at gmplib.org>
date: Wed Nov 07 09:44:48 2018 +0100
description:
(arm): Support a12 and a17.
* configure.ac (arm): Support a12 and a17.
details: /var/hg/gmp/rev/01807f5e2b94
changeset: 17661:01807f5e2b94
user: Torbjorn Granlund <tg at gmplib.org>
date: Wed Nov 07 09:45:37 2018 +0100
description:
Generalise arm matching.
* config.sub: Generalise arm matching.
details: /var/hg/gmp/rev/40b5cd79ecae
changeset: 17662:40b5cd79ecae
user: Torbjorn Granlund <tg at gmplib.org>
date: Wed Nov 07 09:46:40 2018 +0100
description:
Recognise additional arm CPUs.
* config.guess: Recognise additional arm CPUs.
details: /var/hg/gmp/rev/e915ffa9c03d
changeset: 17663:e915ffa9c03d
user: Torbjorn Granlund <tg at gmplib.org>
date: Wed Nov 07 09:49:28 2018 +0100
description:
Provide more c/l numbers.
details: /var/hg/gmp/rev/e3befc40f055
changeset: 17664:e3befc40f055
user: Torbjorn Granlund <tg at gmplib.org>
date: Wed Nov 07 09:50:39 2018 +0100
description:
Fix a comment typo.
details: /var/hg/gmp/rev/78b9d443b0e9
changeset: 17665:78b9d443b0e9
user: Torbjorn Granlund <tg at gmplib.org>
date: Wed Nov 07 09:52:36 2018 +0100
description:
ChangeLog
diffstat:
 ChangeLog                          |   29 ++++++
 config.guess                       |    8 +-
 config.sub                         |    4 +-
 configure.ac                       |   11 ++
 mpn/arm/arm-defs.m4                |    5 +-
 mpn/arm/mod_34lsub1.asm            |    7 +-
 mpn/arm/v5/gcd_1.asm               |    9 +-
 mpn/arm/v6/addmul_2.asm            |    7 +-
 mpn/arm/v6/addmul_3.asm            |    7 +-
 mpn/arm/v6/mul_2.asm               |    7 +-
 mpn/arm/v6t2/gcd_1.asm             |    9 +-
 mpn/arm/v7a/cora17/addmul_1.asm    |   34 +++++++
 mpn/arm/v7a/cora17/gmp-mparam.h    |  175 +++++++++++++++++++++++++++++++++++++
 mpn/arm/v7a/cora17/mod_34lsub1.asm |  121 +++++++++++++++++++++++++
 mpn/arm/v7a/cora17/mul_1.asm       |   34 +++++++
 mpn/arm/v7a/cora17/submul_1.asm    |   34 +++++++
 mpn/generic/div_qr_1.c             |    2 +-
17 files changed, 484 insertions(+), 19 deletions(-)
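
Editorial illustration, not part of the commit: changeset 17661 replaces the
explicit list of armcortex* names in config.sub with two glob patterns. The
sketch below shows which names those two patterns accept. The function name
match_arm_cortex and its "arm"/"other" output are hypothetical and exist only
for this example; the real config.sub sets test_cpu and also lists further
alternatives (such as arm*neon) that are left out here.

```shell
#!/bin/sh
# Hypothetical helper (assumption, not in config.sub): report whether a CPU
# name matches the two patterns added by the "Generalise arm matching" change.
match_arm_cortex() {
  case "$1" in
    # One class letter (a, r or m) followed by one or two digits.
    armcortex[arm][0-9] | armcortex[arm][0-9][0-9])
      echo "arm" ;;
    *)
      echo "other" ;;
  esac
}

match_arm_cortex armcortexa17      # matched by the two-digit pattern
match_arm_cortex armcortexm3       # matched by the one-digit pattern
match_arm_cortex armcortexa17neon  # not matched by these two patterns alone
```

In the full config.sub a name such as armcortexa17neon is still accepted, but
through the separate arm*neon alternative rather than through the two
patterns sketched here.
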
diffs (truncated from 664 to 300 lines):
diff -r 5ce20b738283 -r 78b9d443b0e9 ChangeLog
--- a/ChangeLog Thu Oct 18 20:01:02 2018 +0200
+++ b/ChangeLog Wed Nov 07 09:52:36 2018 +0100
@@ -1,3 +1,32 @@
+2018-11-07 Torbjörn Granlund <tg at gmplib.org>
+
+ * configure.ac (arm): Support a12 and a17.
+ * config.sub: Generalise arm matching.
+ * config.guess: Recognise additional arm CPUs.
+
+ * mpn/arm/arm-defs.m4 (ASM_START): Provide local definition.
+
+2018-10-30 Torbjörn Granlund <tg at gmplib.org>
+
+ * mpn/arm/v7a/cora17/mod_34lsub1.asm: New file.
+ * mpn/arm/v7a/cora17/gmp-mparam.h: New file.
+ * mpn/arm/v7a/cora17/mul_1.asm: New grabber file.
+ * mpn/arm/v7a/cora17/addmul_1.asm: Likewise.
+ * mpn/arm/v7a/cora17/submul_1.asm: Likewise.
+
+2018-07-03 Torbjörn Granlund <tg at gmplib.org>
+
+ * mpn/x86_64/lshift.asm: Remove cnt = 1 special code.
+
+ * mpn/x86_64/silvermont/popcount.asm: Add missing ABI_SUPPORT decls.
+ * mpn/x86_64/silvermont/hamdist.asm: Likewise.
+ * mpn/x86_64/zen/mul_1.asm: Likewise.
+
+ * mpn/x86_64/fastsse/lshift.asm: Support DOS64.
+ * mpn/x86_64/fastsse/lshiftc.asm: Likewise.
+
+ * mpn/x86_64/pentium4/gmp-mparam.h: Retune.
+
2018-07-01 Torbjörn Granlund <tg at gmplib.org>
* lshift.asm: Replace with grabber file.
diff -r 5ce20b738283 -r 78b9d443b0e9 config.guess
--- a/config.guess Thu Oct 18 20:01:02 2018 +0200
+++ b/config.guess Wed Nov 07 09:52:36 2018 +0100
@@ -204,14 +204,20 @@
0xc08) exact_cpu="armcortexa8";; # v7a
0xc09) exact_cpu="armcortexa9";; # v7a
0xc0f) exact_cpu="armcortexa15";; # v7a
+ 0xc0d) exact_cpu="armcortexa12";; # v7a
+ 0xc0e) exact_cpu="armcortexa17";; # v7a
0xc14) exact_cpu="armcortexr4";; # v7r
0xc15) exact_cpu="armcortexr5";; # v7r
0xc23) exact_cpu="armcortexm3";; # v7m
- 0xd04) exact_cpu="armcortexa35";; # v8-32
+ 0xd04) exact_cpu="armcortexa35";; # v8
0xd03) exact_cpu="armcortexa53";; # v8
+ 0xd05) exact_cpu="armcortexa55";; # v8.2
0xd07) exact_cpu="armcortexa57";; # v8
0xd08) exact_cpu="armcortexa72";; # v8
+ 0xd09) exact_cpu="armcortexa73";; # v8
+ 0xd0a) exact_cpu="armcortexa75";; # v8.2
+ 0xd0b) exact_cpu="armcortexa76";; # v8.3
*) exact_cpu=$guess_cpu;;
esac
fi
diff -r 5ce20b738283 -r 78b9d443b0e9 config.sub
--- a/config.sub Thu Oct 18 20:01:02 2018 +0200
+++ b/config.sub Wed Nov 07 09:52:36 2018 +0100
@@ -129,8 +129,8 @@
armsa1 | armxscale | arm9tdmi | arm9te | \
arm10* | arm11mpcore | armsa1 | arm1136 | arm1156 | arm1176 | \
-armcortexa5 | armcortexa7 | armcortexa8 | armcortexa9 | armcortexa15 | \
-armcortexr4 | armcortexr5 | armcortexm3 | arm*neon | xgene1 | exynosm1 | thunderx)
+armcortex[arm][0-9] | armcortex[arm][0-9][0-9] | \
+arm*neon | xgene1 | exynosm1 | thunderx)
test_cpu="arm";;
*)
diff -r 5ce20b738283 -r 78b9d443b0e9 configure.ac
--- a/configure.ac Thu Oct 18 20:01:02 2018 +0200
+++ b/configure.ac Wed Nov 07 09:52:36 2018 +0100
@@ -691,6 +691,17 @@
gcc_cflags_neon="-mfpu=neon"
gcc_cflags_tune="-mtune=cortex-a15 -mtune=cortex-a9"
;;
+ armcortexa12 | armcortexa17)
+ path="arm/v7a/cora17 arm/v7a/cora15 arm/v6t2 arm/v6 arm/v5 arm"
+ gcc_cflags_arch="-march=armv7-a"
+ gcc_cflags_tune="-mtune=cortex-a15 -mtune=cortex-a9"
+ ;;
+ armcortexa12neon | armcortexa17neon)
+ path="arm/v7a/cora17/neon arm/v7a/cora15/neon arm/neon arm/v7a/cora17 arm/v7a/cora15 arm/v6t2 arm/v6 arm/v5 arm"
+ gcc_cflags_arch="-march=armv7-a"
+ gcc_cflags_neon="-mfpu=neon"
+ gcc_cflags_tune="-mtune=cortex-a15 -mtune=cortex-a9"
+ ;;
armcortexa53 | armcortexa53neon)
abilist="64 32"
path="arm/neon arm/v7a/cora9 arm/v6t2 arm/v6 arm/v5 arm"
diff -r 5ce20b738283 -r 78b9d443b0e9 mpn/arm/arm-defs.m4
--- a/mpn/arm/arm-defs.m4 Thu Oct 18 20:01:02 2018 +0200
+++ b/mpn/arm/arm-defs.m4 Wed Nov 07 09:52:36 2018 +0100
@@ -2,7 +2,7 @@
dnl m4 macros for ARM assembler.
-dnl Copyright 2001, 2012-2016 Free Software Foundation, Inc.
+dnl Copyright 2001, 2012-2016, 2018 Free Software Foundation, Inc.
dnl This file is part of the GNU MP Library.
dnl
@@ -36,6 +36,9 @@
changecom(@&*$)
+define(`ASM_START',`
+ifelse($1,`neon',`.fpu neon',
+m4_assert_numargs(0))')
dnl APCS register names.
diff -r 5ce20b738283 -r 78b9d443b0e9 mpn/arm/mod_34lsub1.asm
--- a/mpn/arm/mod_34lsub1.asm Thu Oct 18 20:01:02 2018 +0200
+++ b/mpn/arm/mod_34lsub1.asm Wed Nov 07 09:52:36 2018 +0100
@@ -33,10 +33,13 @@
C cycles/limb
C StrongARM ?
C XScale ?
-C Cortex-A7 ?
-C Cortex-A8 ?
+C Cortex-A5 2.67
+C Cortex-A7 2.35
+C Cortex-A8 2.0
C Cortex-A9 1.33
C Cortex-A15 1.33
+C Cortex-A17 3.34
+C Cortex-A53 2.0
define(`ap', r0)
define(`n', r1)
diff -r 5ce20b738283 -r 78b9d443b0e9 mpn/arm/v5/gcd_1.asm
--- a/mpn/arm/v5/gcd_1.asm Thu Oct 18 20:01:02 2018 +0200
+++ b/mpn/arm/v5/gcd_1.asm Wed Nov 07 09:52:36 2018 +0100
@@ -36,10 +36,13 @@
C cycles/bit (approx)
C StrongARM -
C XScale ?
-C Cortex-A7 ?
-C Cortex-A8 ?
+C Cortex-A5 6.45
+C Cortex-A7 6.41
+C Cortex-A8 5.0
C Cortex-A9 5.9
-C Cortex-A15 ?
+C Cortex-A15 4.40
+C Cortex-A17 5.68
+C Cortex-A53 4.37
C Numbers measured with: speed -CD -s8-32 -t24 mpn_gcd_1
C TODO
diff -r 5ce20b738283 -r 78b9d443b0e9 mpn/arm/v6/addmul_2.asm
--- a/mpn/arm/v6/addmul_2.asm Thu Oct 18 20:01:02 2018 +0200
+++ b/mpn/arm/v6/addmul_2.asm Wed Nov 07 09:52:36 2018 +0100
@@ -36,10 +36,13 @@
C StrongARM: -
C XScale -
C ARM11 4.68
-C Cortex-A7 3.625
-C Cortex-A8 4
+C Cortex-A5 3.63
+C Cortex-A7 3.65
+C Cortex-A8 4.0
C Cortex-A9 2.25
C Cortex-A15 2.5
+C Cortex-A17 2.13
+C Cortex-A53 3.5
define(`rp',`r0')
define(`up',`r1')
diff -r 5ce20b738283 -r 78b9d443b0e9 mpn/arm/v6/addmul_3.asm
--- a/mpn/arm/v6/addmul_3.asm Thu Oct 18 20:01:02 2018 +0200
+++ b/mpn/arm/v6/addmul_3.asm Wed Nov 07 09:52:36 2018 +0100
@@ -36,10 +36,13 @@
C StrongARM: -
C XScale -
C ARM11 4.33
-C Cortex-A7 3.23
-C Cortex-A8 3.19
+C Cortex-A5 3.28
+C Cortex-A7 3.25
+C Cortex-A8 3.17
C Cortex-A9 2.125
C Cortex-A15 2
+C Cortex-A17 2.11
+C Cortex-A53 4.18
C TODO
C * Use a fast path for n <= KARATSUBA_MUL_THRESHOLD using a jump table,
diff -r 5ce20b738283 -r 78b9d443b0e9 mpn/arm/v6/mul_2.asm
--- a/mpn/arm/v6/mul_2.asm Thu Oct 18 20:01:02 2018 +0200
+++ b/mpn/arm/v6/mul_2.asm Wed Nov 07 09:52:36 2018 +0100
@@ -36,10 +36,13 @@
C StrongARM: -
C XScale -
C ARM11 5.25
-C Cortex-A7 3.13
-C Cortex-A8 5
+C Cortex-A5 3.63
+C Cortex-A7 3.15
+C Cortex-A8 5.0
C Cortex-A9 2.25
C Cortex-A15 2.5
+C Cortex-A17 2.13
+C Cortex-A53 3.5
C TODO
C * This is a trivial edit of the addmul_2 code. Check for simplifications,
diff -r 5ce20b738283 -r 78b9d443b0e9 mpn/arm/v6t2/gcd_1.asm
--- a/mpn/arm/v6t2/gcd_1.asm Thu Oct 18 20:01:02 2018 +0200
+++ b/mpn/arm/v6t2/gcd_1.asm Wed Nov 07 09:52:36 2018 +0100
@@ -36,10 +36,13 @@
C cycles/bit (approx)
C StrongARM -
C XScale -
-C Cortex-A7 ?
-C Cortex-A8 ?
+C Cortex-A5 5.75
+C Cortex-A7 6.38
+C Cortex-A8 5.0
C Cortex-A9 5.3
-C Cortex-A15 3.5
+C Cortex-A15 2.92
+C Cortex-A17 5.63
+C Cortex-A53 4.25
C Numbers measured with: speed -CD -s8-32 -t24 mpn_gcd_1
C TODO
diff -r 5ce20b738283 -r 78b9d443b0e9 mpn/arm/v7a/cora17/addmul_1.asm
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mpn/arm/v7a/cora17/addmul_1.asm Wed Nov 07 09:52:36 2018 +0100
@@ -0,0 +1,34 @@
+dnl ARM mpn_addmul_1
+
+dnl Copyright 2018 Free Software Foundation, Inc.
+
+dnl This file is part of the GNU MP Library.
+dnl
+dnl The GNU MP Library is free software; you can redistribute it and/or modify
+dnl it under the terms of either:
+dnl
+dnl * the GNU Lesser General Public License as published by the Free
+dnl Software Foundation; either version 3 of the License, or (at your
+dnl option) any later version.
+dnl
+dnl or
+dnl
+dnl * the GNU General Public License as published by the Free Software
+dnl Foundation; either version 2 of the License, or (at your option) any
+dnl later version.
+dnl
+dnl or both in parallel, as here.
+dnl
+dnl The GNU MP Library is distributed in the hope that it will be useful, but
+dnl WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+dnl or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+dnl for more details.
+dnl
+dnl You should have received copies of the GNU General Public License and the
+dnl GNU Lesser General Public License along with the GNU MP Library. If not,
+dnl see https://www.gnu.org/licenses/.
+
+include(`../config.m4')
+
+MULFUNC_PROLOGUE(mpn_addmul_1)
+include_mpn(`arm/v6/addmul_1.asm')
diff -r 5ce20b738283 -r 78b9d443b0e9 mpn/arm/v7a/cora17/gmp-mparam.h
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mpn/arm/v7a/cora17/gmp-mparam.h Wed Nov 07 09:52:36 2018 +0100
@@ -0,0 +1,175 @@
+/* gmp-mparam.h -- Compiler/machine parameter header file.
+
+Copyright 2018 Free Software Foundation, Inc.
+
+This file is part of the GNU MP Library.
+
+The GNU MP Library is free software; you can redistribute it and/or modify
+it under the terms of either:
+
+ * the GNU Lesser General Public License as published by the Free
+ Software Foundation; either version 3 of the License, or (at your
+ option) any later version.
+
+or
+
+ * the GNU General Public License as published by the Free Software
+ Foundation; either version 2 of the License, or (at your option) any
+ later version.
+
+or both in parallel, as here.
+
+The GNU MP Library is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License