[Gmp-commit] /home/hgfiles/gmp: 6 new changesets
mercurial at gmplib.org
mercurial at gmplib.org
Wed Dec 30 00:06:47 CET 2009
details: /home/hgfiles/gmp/rev/78339b4e5dde
changeset: 13256:78339b4e5dde
user: Torbjorn Granlund <tege at gmplib.org>
date: Tue Dec 29 23:34:20 2009 +0100
description:
Call mpn_sqr directly.
details: /home/hgfiles/gmp/rev/7d1247f62460
changeset: 13257:7d1247f62460
user: Torbjorn Granlund <tege at gmplib.org>
date: Tue Dec 29 23:37:20 2009 +0100
description:
(mpn_mu_div_qr2): Remove code for dn==1.
details: /home/hgfiles/gmp/rev/851465cf853f
changeset: 13258:851465cf853f
user: Torbjorn Granlund <tege at gmplib.org>
date: Tue Dec 29 23:38:39 2009 +0100
description:
Use larger test operands.
details: /home/hgfiles/gmp/rev/44ad09cd53a5
changeset: 13259:44ad09cd53a5
user: Torbjorn Granlund <tege at gmplib.org>
date: Tue Dec 29 23:51:17 2009 +0100
description:
Call mpn_mu_bdiv_qr.
details: /home/hgfiles/gmp/rev/a67d233b1fd0
changeset: 13260:a67d233b1fd0
user: Torbjorn Granlund <tege at gmplib.org>
date: Tue Dec 29 23:54:13 2009 +0100
description:
Tune MUPI_DIV_QR_THRESHOLD.
details: /home/hgfiles/gmp/rev/ffa277747587
changeset: 13261:ffa277747587
user: Torbjorn Granlund <tege at gmplib.org>
date: Wed Dec 30 00:06:40 2009 +0100
description:
List more changes.
diffstat:
ChangeLog | 13 ++++++
NEWS | 96 ++++++++++++++++++++++++++++--------------------
gmp-impl.h | 4 ++
mpn/generic/mu_div_qr.c | 13 +-----
mpn/generic/tdiv_qr.c | 31 ++++++++++-----
mpz/mul.c | 40 ++++++++++++-------
tests/mpz/t-tdiv.c | 2 +-
tune/common.c | 5 ++
tune/speed.h | 49 ++++++++++++++++++++++++-
tune/tuneup.c | 11 +++++
10 files changed, 186 insertions(+), 78 deletions(-)
diffs (truncated from 480 to 300 lines):
diff -r 8b96e43ff299 -r ffa277747587 ChangeLog
--- a/ChangeLog Tue Dec 29 14:41:44 2009 +0100
+++ b/ChangeLog Wed Dec 30 00:06:40 2009 +0100
@@ -1,5 +1,18 @@
2009-12-29 Torbjorn Granlund <tege at gmplib.org>
+ * tune/tuneup.c (tune_mu_div): Tune MUPI_DIV_QR_THRESHOLD.
+ * tune/speed.h (struct speed_params): Allow 3 source operands.
+ (SPEED_ROUTINE_MPN_MUPI_DIV_QR): New macro.
+ * tune/common.c (speed_mpn_mupi_div_qr): New function.
+
+ * mpn/generic/tdiv_qr.c: Call mpn_mu_bdiv_qr.
+
+ * tests/mpz/t-tdiv.c: Use larger test operands.
+
+ * mpn/generic/mu_div_qr.c (mpn_mu_div_qr2): Remove code for dn==1.
+
+ * mpz/mul.c: Call mpn_sqr directly. Use PTR,SIZ,ALLOC.
+
* tune/tuneup.c (tune_mu_div): Set min_size to 6, DC functions require
this.
diff -r 8b96e43ff299 -r ffa277747587 NEWS
--- a/NEWS Tue Dec 29 14:41:44 2009 +0100
+++ b/NEWS Wed Dec 30 00:06:40 2009 +0100
@@ -1,56 +1,79 @@
Copyright 1996, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008,
2009 Free Software Foundation, Inc.
-This file is part of the GNU MP Library.
+Verbatim copying and distribution of this entire article is permitted
+in any medium, provided this notice is preserved.
-The GNU MP Library is free software; you can redistribute it and/or modify
-it under the terms of the GNU Lesser General Public License as published by
-the Free Software Foundation; either version 3 of the License, or (at your
-option) any later version.
-The GNU MP Library is distributed in the hope that it will be useful, but
-WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
-or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
-License for more details.
+Changes between GMP version 4.3.X and 4.4.0.
-You should have received a copy of the GNU Lesser General Public License
-along with the GNU MP Library. If not, see http://www.gnu.org/licenses/.
+ BUGS FIXED
+ * None (contains the same fixes as release 4.3.2).
+ SPEEDUPS
+ * Multiplication has been overhauled:
+ (1) Multiplication of larger same size operands has been improved with
+ the addition of new 2 Toom functions and a new internal function
+ mpn_mulmod_bnm1 (computing U * V mod (B^n-1), B being the word base.
+ (2) Multiplication of different size operands has been improved with the
+ addition of many new Toom function, and by selecting underlying
+ function better from the main multiplicy functions.
-Changes between GMP version 4.3.X and 4.4.0
+ * Division has been overhauled, for both small and large operands:
+ (1) Plain "schoolbook" division is reimplemented using faster quotient
+ approximation.
+ (2) Division Q = N/D, R = N mod D where both the quotient and remainder
+ are needed now runs in time O(M(log(N))). This is an improvement of
+ a factor log(log(N))
+ (3) Division where just the quotient needed is now O(M(log(Q))) on
+ average.
- Bugs:
- * None.
-
- Speedups:
- * Multiplication has been overhauled, and the speed has been substantially
- improved for most operand sizes.
- * A new algorithm makes mpz_perfect_power_p asymptotically faster.
* The function mpz_powm is now faster for all sizes. Its complexity has
gone from O(M(n)log(n)m) to O(M(n)m) where n is the size of the modulo
argument and m is the size of the exponent. It is also radically
faster for even modulus, since it now partially factors such modulus
- and performs two smaller expod operations, then CRT.
+ and performs two smaller expod operations, then uses CRT.
+
* The internal support for multiplication yielding just the lower n limbs
- has been improved by using Mulder's algorithm.
- * Division N/D where both the quotient and remainder is needed now runs
- in time O(M(log(N))). This is asymptotically optimal, as far as it is
- known.
- * Division where just the quotient is asked for is now O(M(log(Q))) on
- average, where Q is the quotient.
+ has been improved by using Mulders' algorithm.
- Features:
- * New mpn functions: mpn_sqr, mpn_and_n, mpn_ior_n, mpn_xor_n,
- mpn_nand_n, mpn_nior_n, mpn_xnor_n, mpn_andn_n, mpn_iorn_n.
- * Support for fat binaries for 64-bit x86 processors.
- * New type, mp_bitcnt_t for bignum bit counts.
+ * Computation of inverses, both plain 1/N and 1/N mod B^n have been
+ improved by using well-tuned Newton iterations, and wrap-around
+ multiplication using mpn_mulmod_bnm1.
+
+ * Exact division Q = N/D by means of mpz_divexact has been improved for
+ all sizes, and now runs in time O(M(log(N))).
+
+ * A new algorithm makes mpz_perfect_power_p asymptotically faster.
+
+ * The function mpz_remove uses a much faster algorithm, is better tuned,
+ and also benefits from the division improvements.
+
+ * Plus hundreds of smaller improvements and tweaks!
+
+ FEATURES
+ * New mpz function: mpz_powm_sec for side-channel quiet modexp
+ computations.
+
+ * New mpn functions: mpn_sqr, mpn_and_n, mpn_ior_n, mpn_xor_n, mpn_nand_n,
+ mpn_nior_n, mpn_xnor_n, mpn_andn_n, mpn_iorn_n, mpn_com, mpn_neg,
+ mpn_copyi, mpn_copyd, mpn_zero.
+
+ * Support for fat binaries for 64-bit x86 processors has been added.
+
+ * A new type, mp_bitcnt_t for bignum bit counts, has been introduced.
+
* The cofactors of mpz_gcdext and mpn_gcdext are now more strictly
- normalised, returning to how GMP 4.2 worked.
+ normalised, returning to how GMP 4.2 worked. (Note that also release
+ 4.3.2 has this change.)
- Misc:
+ MISC
* The mpn_bdivmod function has been removed. We do not consider this an
incompatible change, since the function was marked as preliminary.
+ * The teststuite has been enhanced in various ways.
+
+
Changes between GMP version 4.3.1 and 4.3.2
Bugs:
@@ -485,11 +508,3 @@
* mpz division functions round differently.
* mpz mod functions now really compute mod.
* mpz_powm and mpz_powm_ui now really use mod for reduction.
-
-
-
-----------------
-Local variables:
-mode: text
-fill-column: 76
-End:
diff -r 8b96e43ff299 -r ffa277747587 gmp-impl.h
--- a/gmp-impl.h Tue Dec 29 14:41:44 2009 +0100
+++ b/gmp-impl.h Wed Dec 30 00:06:40 2009 +0100
@@ -1798,6 +1798,10 @@
#define MU_DIV_QR_THRESHOLD 2000
#endif
+#ifndef MUPI_DIV_QR_THRESHOLD
+#define MUPI_DIV_QR_THRESHOLD 200
+#endif
+
#ifndef MU_BDIV_Q_THRESHOLD
#define MU_BDIV_Q_THRESHOLD 2000
#endif
diff -r 8b96e43ff299 -r ffa277747587 mpn/generic/mu_div_qr.c
--- a/mpn/generic/mu_div_qr.c Tue Dec 29 14:41:44 2009 +0100
+++ b/mpn/generic/mu_div_qr.c Wed Dec 30 00:06:40 2009 +0100
@@ -84,7 +84,7 @@
really make sense? It seem like the quotient between dn and qn might be
what we really should be checking. */
#ifndef MU_DIV_QR_SKEW_THRESHOLD
-#define MU_DIV_QR_SKEW_THRESHOLD 100 /* FIXME: somewhat arbitrary */
+#define MU_DIV_QR_SKEW_THRESHOLD 100
#endif
#ifdef CHECK /* FIXME: Enable in minithres */
@@ -170,15 +170,6 @@
mp_limb_t cy, qh;
mp_ptr ip, tp;
- /* FIXME: We should probably not handle tiny operands, but do it for now. */
- if (dn == 1)
- {
- ASSERT_ALWAYS (dn > 1);
- rp[0] = mpn_divrem_1 (scratch, 0L, np, nn, dp[0]);
- MPN_COPY (qp, scratch, nn - 1);
- return scratch[nn - 1];
- }
-
qn = nn - dn;
/* Compute the inverse size. */
@@ -295,7 +286,7 @@
by the upper part of the partial remainder R. */
mpn_mul_n (tp, rp + dn - in, ip, in); /* mulhi */
cy = mpn_add_n (qp, tp + in, rp + dn - in, in); /* I's msb implicit */
- ASSERT_ALWAYS (cy == 0); /* FIXME */
+ ASSERT_ALWAYS (cy == 0);
/* Compute the product of the quotient block and the divisor D, to be
subtracted from the partial remainder combined with new limbs from the
diff -r 8b96e43ff299 -r ffa277747587 mpn/generic/tdiv_qr.c
--- a/mpn/generic/tdiv_qr.c Tue Dec 29 14:41:44 2009 +0100
+++ b/mpn/generic/tdiv_qr.c Wed Dec 30 00:06:40 2009 +0100
@@ -12,7 +12,7 @@
The time complexity of this is O(qn*qn+M(dn,qn)), where M(m,n) is the time
complexity of multiplication.
-Copyright 1997, 2000, 2001, 2002, 2005 Free Software Foundation, Inc.
+Copyright 1997, 2000, 2001, 2002, 2005, 2009 Free Software Foundation, Inc.
This file is part of the GNU MP Library.
@@ -38,13 +38,8 @@
mpn_tdiv_qr (mp_ptr qp, mp_ptr rp, mp_size_t qxn,
mp_srcptr np, mp_size_t nn, mp_srcptr dp, mp_size_t dn)
{
- /* FIXME:
- 1. qxn
- 2. pass allocated storage in additional parameter?
- */
ASSERT_ALWAYS (qxn == 0);
- ASSERT (qxn >= 0);
ASSERT (nn >= 0);
ASSERT (dn >= 0);
ASSERT (dn == 0 || dp[dn - 1] != 0);
@@ -137,10 +132,18 @@
}
invert_pi1 (dinv, d2p[dn - 1], d2p[dn - 2]);
- if (dn < DC_DIV_QR_THRESHOLD)
+ if (BELOW_THRESHOLD (dn, DC_DIV_QR_THRESHOLD))
mpn_sbpi1_div_qr (qp, n2p, nn, d2p, dn, dinv.inv32);
+ else if (BELOW_THRESHOLD (dn, MU_DIV_QR_THRESHOLD))
+ mpn_dcpi1_div_qr (qp, n2p, nn, d2p, dn, &dinv);
else
- mpn_dcpi1_div_qr (qp, n2p, nn, d2p, dn, &dinv);
+ {
+ mp_size_t itch = mpn_mu_div_qr_itch (nn, dn, 0);
+ mp_ptr scratch = TMP_ALLOC_LIMBS (itch);
+ rp = n2p + nn - dn;
+ mpn_mu_div_qr (qp, rp, n2p, nn, d2p, dn, scratch);
+ MPN_COPY (n2p, rp, dn);
+ }
if (cnt != 0)
mpn_rshift (rp, n2p, dn, cnt);
@@ -258,10 +261,18 @@
{
gmp_pi1_t dinv;
invert_pi1 (dinv, d2p[qn - 1], d2p[qn - 2]);
- if (qn < DC_DIV_QR_THRESHOLD)
+ if (BELOW_THRESHOLD (qn, DC_DIV_QR_THRESHOLD))
mpn_sbpi1_div_qr (qp, n2p, 2 * qn, d2p, qn, dinv.inv32);
+ else if (BELOW_THRESHOLD (qn, MU_DIV_QR_THRESHOLD))
+ mpn_dcpi1_div_qr (qp, n2p, 2 * qn, d2p, qn, &dinv);
else
- mpn_dcpi1_div_qr (qp, n2p, 2 * qn, d2p, qn, &dinv);
+ {
+ mp_size_t itch = mpn_mu_div_qr_itch (2 * qn, qn, 0);
+ mp_ptr scratch = TMP_ALLOC_LIMBS (itch);
+ rp = n2p + 2 * qn - qn;
+ mpn_mu_div_qr (qp, rp, n2p, 2 * qn, d2p, qn, scratch);
+ MPN_COPY (n2p, rp, qn);
+ }
}
rn = qn;
diff -r 8b96e43ff299 -r ffa277747587 mpz/mul.c
--- a/mpz/mul.c Tue Dec 29 14:41:44 2009 +0100
+++ b/mpz/mul.c Wed Dec 30 00:06:40 2009 +0100
@@ -1,7 +1,7 @@
/* mpz_mul -- Multiply two integers.
-Copyright 1991, 1993, 1994, 1996, 2000, 2001, 2005 Free Software Foundation,
-Inc.
+Copyright 1991, 1993, 1994, 1996, 2000, 2001, 2005, 2009 Free Software
+Foundation, Inc.
This file is part of the GNU MP Library.
@@ -33,8 +33,8 @@
mult (mpz_srcptr u, mpz_srcptr v, mpz_ptr w)
#endif /* BERKELEY_MP */
{
- mp_size_t usize = u->_mp_size;
- mp_size_t vsize = v->_mp_size;
+ mp_size_t usize = SIZ(u);
+ mp_size_t vsize = SIZ(v);
mp_size_t wsize;
mp_size_t sign_product;
mp_ptr up, vp;
@@ -92,25 +92,25 @@
TMP_MARK;
free_me = NULL;
- up = u->_mp_d;
- vp = v->_mp_d;
- wp = w->_mp_d;
More information about the gmp-commit
mailing list