[Gmp-commit] /var/hg/gmp: 4 new changesets
mercurial at gmplib.org
Mon Oct 3 14:17:13 CEST 2011
details: /var/hg/gmp/rev/25e66c379e14
changeset: 14249:25e66c379e14
user: Niels Möller <nisse at lysator.liu.se>
date: Mon Oct 03 12:32:10 2011 +0200
description:
mulmid C implementation.
details: /var/hg/gmp/rev/d548e6eea8fe
changeset: 14250:d548e6eea8fe
user: Niels Möller <nisse at lysator.liu.se>
date: Mon Oct 03 13:07:18 2011 +0200
description:
mulmid testing.
details: /var/hg/gmp/rev/27d86b7aa4a9
changeset: 14251:27d86b7aa4a9
user: Niels Möller <nisse at lysator.liu.se>
date: Mon Oct 03 13:30:35 2011 +0200
description:
Tuning of mulmid.
details: /var/hg/gmp/rev/799b8e61a84e
changeset: 14252:799b8e61a84e
user: Niels Möller <nisse at lysator.liu.se>
date: Mon Oct 03 13:52:31 2011 +0200
description:
mulmid x86_64 assembly.
diffstat:
ChangeLog | 85 ++++++
configure.in | 8 +
gmp-impl.h | 45 +++
mpn/asm-defs.m4 | 19 +
mpn/generic/add_err1_n.c | 90 ++++++
mpn/generic/add_err2_n.c | 106 +++++++
mpn/generic/add_err3_n.c | 121 ++++++++
mpn/generic/mulmid.c | 244 +++++++++++++++++
mpn/generic/mulmid_basecase.c | 72 +++++
mpn/generic/mulmid_n.c | 51 +++
mpn/generic/sub_err1_n.c | 90 ++++++
mpn/generic/sub_err2_n.c | 106 +++++++
mpn/generic/sub_err3_n.c | 121 ++++++++
mpn/generic/toom42_mulmid.c | 227 ++++++++++++++++
mpn/x86_64/aors_err1_n.asm | 214 +++++++++++++++
mpn/x86_64/aors_err2_n.asm | 161 +++++++++++
mpn/x86_64/aors_err3_n.asm | 145 ++++++++++
mpn/x86_64/core2/aors_err1_n.asm | 214 +++++++++++++++
mpn/x86_64/core2/gmp-mparam.h | 2 +
mpn/x86_64/gmp-mparam.h | 2 +
mpn/x86_64/mulmid_basecase.asm | 544 +++++++++++++++++++++++++++++++++++++++
tests/devel/try.c | 171 ++++++++++-
tests/refmpn.c | 226 ++++++++++++++++
tests/tests.h | 27 +
tune/Makefile.am | 1 +
tune/common.c | 56 ++++
tune/speed.c | 12 +
tune/speed.h | 178 ++++++++++++-
tune/tuneup.c | 16 +
29 files changed, 3326 insertions(+), 28 deletions(-)
diffs (truncated from 3873 to 300 lines):
diff -r ecd1229d18ed -r 799b8e61a84e ChangeLog
--- a/ChangeLog Mon Oct 03 10:10:19 2011 +0200
+++ b/ChangeLog Mon Oct 03 13:52:31 2011 +0200
@@ -1,3 +1,88 @@
+2011-10-03 Niels Möller <nisse at lysator.liu.se>
+
+ mulmid-related assembly for x86_64, from David Harvey:
+ * mpn/asm-defs.m4 (define_mpn): Added [add,sub]_err[1,2,3]_n and
+ mulmid_basecase. Also use m4_not_for_expansion on the
+ corresponding OPERATION_* symbols.
+ * mpn/x86_64/aors_err1_n.asm: New file.
+ * mpn/x86_64/aors_err2_n.asm: Likewise.
+ * mpn/x86_64/aors_err3_n.asm: Likewise.
+ * mpn/x86_64/mulmid_basecase.asm: Likewise.
+ * mpn/x86_64/core2/aors_err1_n.asm: Likewise.
+ * mpn/x86_64/gmp-mparam.h (MULMID_TOOM42_THRESHOLD): New value.
+ * mpn/x86_64/core2/gmp-mparam.h (MULMID_TOOM42_THRESHOLD): Likewise.
+
+ Tuning of mulmid, from David Harvey:
+ * tune/Makefile.am (TUNE_MPN_SRCS_BASIC): Added mulmid.c
+ mulmid_n.c toom42_mulmid.c.
+ * tune/speed.h: Prototypes for mulmid-related functions.
+ (struct speed_params): Increased max number of sources to 5.
+ (SPEED_ROUTINE_MPN_BINARY_ERR_N_CALL): New macro.
+ (SPEED_ROUTINE_MPN_BINARY_ERR1_N): Likewise.
+ (SPEED_ROUTINE_MPN_BINARY_ERR2_N): Likewise.
+ (SPEED_ROUTINE_MPN_BINARY_ERR3_N): Likewise.
+ (SPEED_ROUTINE_MPN_MULMID): Likewise.
+ (SPEED_ROUTINE_MPN_MULMID_N): Likewise.
+ (SPEED_ROUTINE_MPN_TOOM42_MULMID): Likewise.
+ * tune/common.c (mpn_[add,sub]_err[1,2,3]_n): New functions.
+ (speed_mpn_mulmid_basecase): New function.
+ (speed_mpn_mulmid): New function.
+ (speed_mpn_mulmid_n): New function.
+ (speed_mpn_toom42_mulmid): New function.
+ * tune/speed.c (routine): Added mpn_[add,sub]_err[1,2,3]_n,
+ mpn_mulmid_basecase, mpn_toom42_mulmid, mpn_mulmid_n, and
+ mpn_mulmid.
+ * tune/tuneup.c (mulmid_toom42_threshold): New threshold variable.
+ (tune_mulmid): New function.
+ (all): Call tune_mulmid.
+
+ Testing of mulmid, from David Harvey:
+ * tests/refmpn.c (AORS_ERR1_N): New macro.
+ (refmpn_add_err1_n, refmpn_sub_err1_n): New functions.
+ (AORS_ERR2_N): New macro.
+ (refmpn_add_err2_n, refmpn_sub_err2_n): New functions.
+ (AORS_ERR3_N): New macro.
+ (refmpn_add_err3_n, refmpn_sub_err3_n): New functions.
+ (refmpn_mulmid_basecase): New function.
+ (refmpn_toom42_mulmid): New function, wrapper for
+ refmpn_mulmid_basecase.
+ (refmpn_mulmid_n): Likewise.
+ (refmpn_mulmid): Likewise.
+ * tests/tests.h: Prototypes for new functions.
+ * tests/devel/try.c (NUM_SOURCES): Increased to 5.
+ (struct try_t): Use NUM_SOURCES and NUM_DESTS constants.
+ (SIZE_4, SIZE_6, SIZE_DIFF_PLUS_3, SIZE_ODD): New constants.
+ (OVERLAP_NOT_DST2): New flag.
+ (param_init): New mulmid-related operation types.
+ (mpn_toom42_mulmid_fun): New function.
+ (choice_array): Added mulmid-related entries.
+ (overlap_array): Extended for larger NUM_SOURCES.
+ (OVERLAP_COUNT): Handle OVERLAP_NOT_DST2.
+ (call): Support mulmid-related functions.
+ (pointer_setup): Handle SIZE_4, SIZE_6, and SIZE_DIFF_PLUS_3.
+ (SIZE_ITERATION): Handle SIZE_ODD.
+ (SIZE2_FIRST): Handle SIZE_CEIL_HALF.
+ (SIZE2_LAST): Likewise.
+
+ Implementation of mulmid, from David Harvey:
+ * mpn/generic/add_err1_n.c (mpn_add_err1_n): New file and function.
+ * mpn/generic/add_err2_n.c (mpn_add_err2_n): Likewise.
+ * mpn/generic/add_err3_n.c (mpn_add_err3_n): Likewise.
+ * mpn/generic/sub_err1_n.c (mpn_sub_err1_n): Likewise.
+ * mpn/generic/sub_err2_n.c (mpn_sub_err2_n): Likewise.
+ * mpn/generic/sub_err3_n.c (mpn_sub_err3_n): Likewise.
+ * mpn/generic/mulmid_basecase.c (mpn_mulmid_basecase): Likewise.
+ * mpn/generic/mulmid_n.c (mpn_mulmid_n): Likewise.
+ * mpn/generic/toom42_mulmid.c (mpn_toom42_mulmid): Likewise.
+ * configure.in (gmp_mpn_functions): Added mulmid-related
+ functions.
+ (GMP_MULFUNC_CHOICES): Handle aors_err1_n, aors_err2_n, and
+ aors_err3_n.
+ * gmp-impl.h: Added prototypes for mulmid functions.
+ (MPN_TOOM42_MULMID_MINSIZE): New constant.
+ (MULMID_TOOM42_THRESHOLD): New threshold.
+ (mpn_toom42_mulmid_itch): New macro.
+
2011-10-03 Niels Möller <nisse at lysator.liu.se>
* tune/tune-gcd-p.c (main): Fixed broken loop conditions.
diff -r ecd1229d18ed -r 799b8e61a84e configure.in
--- a/configure.in Mon Oct 03 10:10:19 2011 +0200
+++ b/configure.in Mon Oct 03 13:52:31 2011 +0200
@@ -2527,10 +2527,12 @@
gmp_mpn_functions="$extra_functions \
add add_1 add_n sub sub_1 sub_n addcnd_n subcnd_n neg com \
mul_1 addmul_1 submul_1 \
+ add_err1_n add_err2_n add_err3_n sub_err1_n sub_err2_n sub_err3_n \
lshift rshift dive_1 diveby3 divis divrem divrem_1 divrem_2 \
fib2_ui mod_1 mod_34lsub1 mode1o pre_divrem_1 pre_mod_1 dump \
mod_1_1 mod_1_2 mod_1_3 mod_1_4 lshiftc \
mul mul_fft mul_n sqr mul_basecase sqr_basecase nussbaumer_mul \
+ mulmid_basecase toom42_mulmid mulmid_n mulmid \
random random2 pow_1 \
rootrem sqrtrem get_str set_str scan0 scan1 popcount hamdist cmp \
perfsqr perfpow \
@@ -2571,6 +2573,12 @@
tmp_mulfunc=
case $tmp_fn in
add_n|sub_n) tmp_mulfunc="aors_n" ;;
+ add_err1_n|sub_err1_n)
+ tmp_mulfunc="aors_err1_n" ;;
+ add_err2_n|sub_err2_n)
+ tmp_mulfunc="aors_err2_n" ;;
+ add_err3_n|sub_err3_n)
+ tmp_mulfunc="aors_err3_n" ;;
addmul_1|submul_1) tmp_mulfunc="aorsmul_1" ;;
popcount|hamdist) tmp_mulfunc="popham" ;;
and_n|andn_n|nand_n | ior_n|iorn_n|nior_n | xor_n|xnor_n)
diff -r ecd1229d18ed -r 799b8e61a84e gmp-impl.h
--- a/gmp-impl.h Mon Oct 03 10:10:19 2011 +0200
+++ b/gmp-impl.h Mon Oct 03 13:52:31 2011 +0200
@@ -960,6 +960,24 @@
#define mpn_lshiftc __MPN(lshiftc)
__GMP_DECLSPEC mp_limb_t mpn_lshiftc __GMP_PROTO ((mp_ptr, mp_srcptr, mp_size_t, unsigned int));
+#define mpn_add_err1_n __MPN(add_err1_n)
+__GMP_DECLSPEC mp_limb_t mpn_add_err1_n __GMP_PROTO ((mp_ptr, mp_srcptr, mp_srcptr, mp_ptr, mp_srcptr, mp_size_t, mp_limb_t));
+
+#define mpn_add_err2_n __MPN(add_err2_n)
+__GMP_DECLSPEC mp_limb_t mpn_add_err2_n __GMP_PROTO ((mp_ptr, mp_srcptr, mp_srcptr, mp_ptr, mp_srcptr, mp_srcptr, mp_size_t, mp_limb_t));
+
+#define mpn_add_err3_n __MPN(add_err3_n)
+__GMP_DECLSPEC mp_limb_t mpn_add_err3_n __GMP_PROTO ((mp_ptr, mp_srcptr, mp_srcptr, mp_ptr, mp_srcptr, mp_srcptr, mp_srcptr, mp_size_t, mp_limb_t));
+
+#define mpn_sub_err1_n __MPN(sub_err1_n)
+__GMP_DECLSPEC mp_limb_t mpn_sub_err1_n __GMP_PROTO ((mp_ptr, mp_srcptr, mp_srcptr, mp_ptr, mp_srcptr, mp_size_t, mp_limb_t));
+
+#define mpn_sub_err2_n __MPN(sub_err2_n)
+__GMP_DECLSPEC mp_limb_t mpn_sub_err2_n __GMP_PROTO ((mp_ptr, mp_srcptr, mp_srcptr, mp_ptr, mp_srcptr, mp_srcptr, mp_size_t, mp_limb_t));
+
+#define mpn_sub_err3_n __MPN(sub_err3_n)
+__GMP_DECLSPEC mp_limb_t mpn_sub_err3_n __GMP_PROTO ((mp_ptr, mp_srcptr, mp_srcptr, mp_ptr, mp_srcptr, mp_srcptr, mp_srcptr, mp_size_t, mp_limb_t));
+
#define mpn_add_n_sub_n __MPN(add_n_sub_n)
__GMP_DECLSPEC mp_limb_t mpn_add_n_sub_n __GMP_PROTO ((mp_ptr, mp_ptr, mp_srcptr, mp_srcptr, mp_size_t));
@@ -1031,6 +1049,15 @@
__GMP_DECLSPEC void mpn_sqr_basecase __GMP_PROTO ((mp_ptr, mp_srcptr, mp_size_t));
#endif
+#define mpn_mulmid_basecase __MPN(mulmid_basecase)
+__GMP_DECLSPEC void mpn_mulmid_basecase __GMP_PROTO ((mp_ptr, mp_srcptr, mp_size_t, mp_srcptr, mp_size_t));
+
+#define mpn_mulmid_n __MPN(mulmid_n)
+__GMP_DECLSPEC void mpn_mulmid_n __GMP_PROTO ((mp_ptr, mp_srcptr, mp_srcptr, mp_size_t));
+
+#define mpn_mulmid __MPN(mulmid)
+__GMP_DECLSPEC void mpn_mulmid __GMP_PROTO ((mp_ptr, mp_srcptr, mp_size_t, mp_srcptr, mp_size_t));
+
#define mpn_submul_1c __MPN(submul_1c)
__GMP_DECLSPEC mp_limb_t mpn_submul_1c __GMP_PROTO ((mp_ptr, mp_srcptr, mp_size_t, mp_limb_t, mp_limb_t));
@@ -1185,6 +1212,8 @@
#define MPN_TOOM53_MUL_MINSIZE 49 /* ??? */
#define MPN_TOOM63_MUL_MINSIZE 49
+#define MPN_TOOM42_MULMID_MINSIZE 4
+
#define mpn_sqr_diagonal __MPN(sqr_diagonal)
__GMP_DECLSPEC void mpn_sqr_diagonal __GMP_PROTO ((mp_ptr, mp_srcptr, mp_size_t));
@@ -1283,6 +1312,9 @@
#define mpn_toom8_sqr __MPN(toom8_sqr)
__GMP_DECLSPEC void mpn_toom8_sqr __GMP_PROTO ((mp_ptr, mp_srcptr, mp_size_t, mp_ptr));
+#define mpn_toom42_mulmid __MPN(toom42_mulmid)
+__GMP_DECLSPEC void mpn_toom42_mulmid __GMP_PROTO ((mp_ptr, mp_srcptr, mp_srcptr, mp_size_t, mp_ptr));
+
#define mpn_fft_best_k __MPN(fft_best_k)
__GMP_DECLSPEC int mpn_fft_best_k __GMP_PROTO ((mp_size_t, int)) ATTRIBUTE_CONST;
@@ -1907,6 +1939,10 @@
#define SQR_TOOM3_THRESHOLD_LIMIT SQR_TOOM3_THRESHOLD
#endif
+#ifndef MULMID_TOOM42_THRESHOLD
+#define MULMID_TOOM42_THRESHOLD MUL_TOOM22_THRESHOLD
+#endif
+
#ifndef DC_DIV_QR_THRESHOLD
#define DC_DIV_QR_THRESHOLD 50
#endif
@@ -4535,6 +4571,10 @@
#define MULLO_MUL_N_THRESHOLD mullo_mul_n_threshold
extern mp_size_t mullo_mul_n_threshold;
+#undef MULMID_TOOM42_THRESHOLD
+#define MULMID_TOOM42_THRESHOLD mulmid_toom42_threshold
+extern mp_size_t mulmid_toom42_threshold;
+
#undef DIV_QR_2_PI2_THRESHOLD
#define DIV_QR_2_PI2_THRESHOLD div_qr_2_pi2_threshold
extern mp_size_t div_qr_2_pi2_threshold;
@@ -4838,6 +4878,11 @@
return 9 * n + 3;
}
+/* let S(n) = space required for input size n,
+ then S(n) = 3 floor(n/2) + 1 + S(floor(n/2)). */
+#define mpn_toom42_mulmid_itch(n) \
+ (3 * (n) + GMP_NUMB_BITS)
+
#if 0
#define mpn_fft_mul mpn_mul_fft_full
#else
diff -r ecd1229d18ed -r 799b8e61a84e mpn/asm-defs.m4
--- a/mpn/asm-defs.m4 Mon Oct 03 10:10:19 2011 +0200
+++ b/mpn/asm-defs.m4 Mon Oct 03 13:52:31 2011 +0200
@@ -1054,6 +1054,18 @@
m4_not_for_expansion(`OPERATION_add_n')
m4_not_for_expansion(`OPERATION_sub_n')
+dnl aors_err1_n
+m4_not_for_expansion(`OPERATION_add_err1_n')
+m4_not_for_expansion(`OPERATION_sub_err1_n')
+
+dnl aors_err2_n
+m4_not_for_expansion(`OPERATION_add_err2_n')
+m4_not_for_expansion(`OPERATION_sub_err2_n')
+
+dnl aors_err3_n
+m4_not_for_expansion(`OPERATION_add_err3_n')
+m4_not_for_expansion(`OPERATION_sub_err3_n')
+
dnl aorsmul_1
m4_not_for_expansion(`OPERATION_addmul_1')
m4_not_for_expansion(`OPERATION_submul_1')
@@ -1306,6 +1318,9 @@
define_mpn(add)
define_mpn(add_1)
+define_mpn(add_err1_n)
+define_mpn(add_err2_n)
+define_mpn(add_err3_n)
define_mpn(add_n)
define_mpn(add_nc)
define_mpn(addcnd_n)
@@ -1400,6 +1415,7 @@
define_mpn(mul_basecase)
define_mpn(mul_n)
define_mpn(mullo_basecase)
+define_mpn(mulmid_basecase)
define_mpn(perfect_square_p)
define_mpn(popcount)
define_mpn(preinv_divrem_1)
@@ -1448,6 +1464,9 @@
define_mpn(sqrtrem)
define_mpn(sub)
define_mpn(sub_1)
+define_mpn(sub_err1_n)
+define_mpn(sub_err2_n)
+define_mpn(sub_err3_n)
define_mpn(sub_n)
define_mpn(sub_nc)
define_mpn(submul_1)
diff -r ecd1229d18ed -r 799b8e61a84e mpn/generic/add_err1_n.c
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mpn/generic/add_err1_n.c Mon Oct 03 13:52:31 2011 +0200
@@ -0,0 +1,90 @@
+/* mpn_add_err1_n -- add_n with one error term
+
+ Contributed by David Harvey.
+
+ THE FUNCTION IN THIS FILE IS INTERNAL WITH A MUTABLE INTERFACE. IT IS ONLY
+ SAFE TO REACH IT THROUGH DOCUMENTED INTERFACES. IN FACT, IT IS ALMOST
+ GUARANTEED THAT IT'LL CHANGE OR DISAPPEAR IN A FUTURE GNU MP RELEASE.
+
+Copyright 2011 Free Software Foundation, Inc.
+
+This file is part of the GNU MP Library.
+
+The GNU MP Library is free software; you can redistribute it and/or modify
+it under the terms of the GNU Lesser General Public License as published by
+the Free Software Foundation; either version 3 of the License, or (at your
+option) any later version.
+
+The GNU MP Library is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
+License for more details.
+
+You should have received a copy of the GNU Lesser General Public License
+along with the GNU MP Library. If not, see http://www.gnu.org/licenses/. */
+
+#include "gmp.h"
+#include "gmp-impl.h"
+
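
Note on the mpn_toom42_mulmid_itch comment in the gmp-impl.h hunk above: the comment gives the scratch recurrence S(n) = 3 floor(n/2) + 1 + S(floor(n/2)), while the macro uses the closed form 3 * (n) + GMP_NUMB_BITS. The following is a minimal sketch, not code from the changeset, that checks the closed form dominates the recurrence. It assumes 64-bit limbs (NUMB_BITS stands in for GMP_NUMB_BITS) and assumes the recursion needs no scratch once n falls below MPN_TOOM42_MULMID_MINSIZE; the helper names scratch_needed, itch_bound, and bound_holds_up_to are hypothetical.

```c
/* Assumed values, for illustration only. */
#define NUMB_BITS 64  /* stand-in for GMP_NUMB_BITS on a 64-bit limb build */
#define MINSIZE 4     /* MPN_TOOM42_MULMID_MINSIZE as defined in gmp-impl.h */

/* Scratch recurrence from the comment:
   S(n) = 3*floor(n/2) + 1 + S(floor(n/2)).
   Assumption: S(n) = 0 once n drops below MINSIZE. */
static unsigned long
scratch_needed (unsigned long n)
{
  if (n < MINSIZE)
    return 0;
  return 3 * (n / 2) + 1 + scratch_needed (n / 2);
}

/* Closed-form bound in the shape used by mpn_toom42_mulmid_itch. */
static unsigned long
itch_bound (unsigned long n)
{
  return 3 * n + NUMB_BITS;
}

/* Returns 1 if the closed form covers the recurrence for every n in
   [1, limit], and 0 at the first counterexample. */
static int
bound_holds_up_to (unsigned long limit)
{
  unsigned long n;
  for (n = 1; n <= limit; n++)
    if (scratch_needed (n) > itch_bound (n))
      return 0;
  return 1;
}
```

For example, scratch_needed (8) is 3*4 + 1 + scratch_needed (4) = 13 + 7 = 20, against a bound of 3*8 + 64 = 88. The 3*floor(n/2) terms sum to less than 3n, and each recursion level adds one more limb; the number of levels is at most log2(n), which is below the limb bit count, presumably the reason GMP_NUMB_BITS serves as the additive slack.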