[Gmp-commit] /home/hgfiles/gmp: 6 new changesets

Wed Dec 30 00:06:47 CET 2009

details:   /home/hgfiles/gmp/rev/78339b4e5dde
changeset: 13256:78339b4e5dde
user:      Torbjorn Granlund <tege at gmplib.org>
date:      Tue Dec 29 23:34:20 2009 +0100
description:
Call mpn_sqr directly.

details:   /home/hgfiles/gmp/rev/7d1247f62460
changeset: 13257:7d1247f62460
user:      Torbjorn Granlund <tege at gmplib.org>
date:      Tue Dec 29 23:37:20 2009 +0100
description:
(mpn_mu_div_qr2): Remove code for dn==1.

details:   /home/hgfiles/gmp/rev/851465cf853f
changeset: 13258:851465cf853f
user:      Torbjorn Granlund <tege at gmplib.org>
date:      Tue Dec 29 23:38:39 2009 +0100
description:
Use larger test operands.

details:   /home/hgfiles/gmp/rev/44ad09cd53a5
changeset: 13259:44ad09cd53a5
user:      Torbjorn Granlund <tege at gmplib.org>
date:      Tue Dec 29 23:51:17 2009 +0100
description:
Call mpn_mu_bdiv_qr.

details:   /home/hgfiles/gmp/rev/a67d233b1fd0
changeset: 13260:a67d233b1fd0
user:      Torbjorn Granlund <tege at gmplib.org>
date:      Tue Dec 29 23:54:13 2009 +0100
description:
Tune MUPI_DIV_QR_THRESHOLD.

details:   /home/hgfiles/gmp/rev/ffa277747587
changeset: 13261:ffa277747587
user:      Torbjorn Granlund <tege at gmplib.org>
date:      Wed Dec 30 00:06:40 2009 +0100
description:
List more changes.

diffstat:

 ChangeLog               |  13 ++++++
 NEWS                    |  96 ++++++++++++++++++++++++++++--------------------
 gmp-impl.h              |   4 ++
 mpn/generic/mu_div_qr.c |  13 +-----
 mpn/generic/tdiv_qr.c   |  31 ++++++++++-----
 mpz/mul.c               |  40 ++++++++++++-------
 tests/mpz/t-tdiv.c      |   2 +-
 tune/common.c           |   5 ++
 tune/speed.h            |  49 ++++++++++++++++++++++++-
 tune/tuneup.c           |  11 +++++
 10 files changed, 186 insertions(+), 78 deletions(-)

diffs (truncated from 480 to 300 lines):

diff -r 8b96e43ff299 -r ffa277747587 ChangeLog

--- a/ChangeLog	Tue Dec 29 14:41:44 2009 +0100
+++ b/ChangeLog	Wed Dec 30 00:06:40 2009 +0100
@@ -1,5 +1,18 @@
 2009-12-29  Torbjorn Granlund  <tege at gmplib.org>
 
+	* tune/tuneup.c (tune_mu_div): Tune MUPI_DIV_QR_THRESHOLD.
+	* tune/speed.h (struct speed_params): Allow 3 source operands.
+	(SPEED_ROUTINE_MPN_MUPI_DIV_QR): New macro.
+	* tune/common.c (speed_mpn_mupi_div_qr): New function.
+
+	* mpn/generic/tdiv_qr.c: Call mpn_mu_bdiv_qr.
+
+	* tests/mpz/t-tdiv.c: Use larger test operands.
+
+	* mpn/generic/mu_div_qr.c (mpn_mu_div_qr2): Remove code for dn==1.
+
+	* mpz/mul.c: Call mpn_sqr directly.  Use PTR,SIZ,ALLOC.
+
 	* tune/tuneup.c (tune_mu_div): Set min_size to 6, DC functions require
 	this.
 
diff -r 8b96e43ff299 -r ffa277747587 NEWS
--- a/NEWS	Tue Dec 29 14:41:44 2009 +0100
+++ b/NEWS	Wed Dec 30 00:06:40 2009 +0100
@@ -1,56 +1,79 @@
 Copyright 1996, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008,
 2009 Free Software Foundation, Inc.
 
-This file is part of the GNU MP Library.
+Verbatim copying and distribution of this entire article is permitted
+in any medium, provided this notice is preserved.
 
-The GNU MP Library is free software; you can redistribute it and/or modify
-it under the terms of the GNU Lesser General Public License as published by
-the Free Software Foundation; either version 3 of the License, or (at your
-option) any later version.
 
-The GNU MP Library is distributed in the hope that it will be useful, but
-WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
-or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
-License for more details.
+Changes between GMP version 4.3.X and 4.4.0.
 
-You should have received a copy of the GNU Lesser General Public License
-along with the GNU MP Library.  If not, see http://www.gnu.org/licenses/.
+  BUGS FIXED
+  * None (contains the same fixes as release 4.3.2).
 
+  SPEEDUPS
+  * Multiplication has been overhauled:
+    (1) Multiplication of larger same size operands has been improved with
+        the addition of new 2 Toom functions and a new internal function
+        mpn_mulmod_bnm1 (computing U * V mod (B^n-1), B being the word base.
+    (2) Multiplication of different size operands has been improved with the
+        addition of many new Toom function, and by selecting underlying
+        function better from the main multiplicy functions.
 
-Changes between GMP version 4.3.X and 4.4.0
+  * Division has been overhauled, for both small and large operands:
+    (1) Plain "schoolbook" division is reimplemented using faster quotient
+        approximation.
+    (2) Division Q = N/D, R = N mod D where both the quotient and remainder
+        are needed now runs in time O(M(log(N))).  This is an improvement of
+        a factor log(log(N))
+    (3) Division where just the quotient needed is now O(M(log(Q))) on
+        average.
 
-  Bugs:
-  * None.
-
-  Speedups:
-  * Multiplication has been overhauled, and the speed has been substantially
-    improved for most operand sizes.
-  * A new algorithm makes mpz_perfect_power_p asymptotically faster.
   * The function mpz_powm is now faster for all sizes.  Its complexity has
     gone from O(M(n)log(n)m) to O(M(n)m) where n is the size of the modulo
     argument and m is the size of the exponent.  It is also radically
     faster for even modulus, since it now partially factors such modulus
-    and performs two smaller expod operations, then CRT.
+    and performs two smaller expod operations, then uses CRT.
+
   * The internal support for multiplication yielding just the lower n limbs
-    has been improved by using Mulder's algorithm.
-  * Division N/D where both the quotient and remainder is needed now runs
-    in time O(M(log(N))).  This is asymptotically optimal, as far as it is
-    known.
-  * Division where just the quotient is asked for is now O(M(log(Q))) on
-    average, where Q is the quotient.
+    has been improved by using Mulders' algorithm.
 
-  Features:
-  * New mpn functions: mpn_sqr, mpn_and_n, mpn_ior_n, mpn_xor_n,
-    mpn_nand_n, mpn_nior_n, mpn_xnor_n, mpn_andn_n, mpn_iorn_n.
-  * Support for fat binaries for 64-bit x86 processors.
-  * New type, mp_bitcnt_t for bignum bit counts.
+  * Computation of inverses, both plain 1/N and 1/N mod B^n have been
+    improved by using well-tuned Newton iterations, and wrap-around
+    multiplication using mpn_mulmod_bnm1.
+
+  * Exact division Q = N/D by means of mpz_divexact has been improved for
+    all sizes, and now runs in time O(M(log(N))).
+
+  * A new algorithm makes mpz_perfect_power_p asymptotically faster.
+
+  * The function mpz_remove uses a much faster algorithm, is better tuned,
+    and also benefits from the division improvements.
+
+  * Plus hundreds of smaller improvements and tweaks!
+
+  FEATURES
+  * New mpz function: mpz_powm_sec for side-channel quiet modexp
+    computations.
+
+  * New mpn functions: mpn_sqr, mpn_and_n, mpn_ior_n, mpn_xor_n, mpn_nand_n,
+    mpn_nior_n, mpn_xnor_n, mpn_andn_n, mpn_iorn_n, mpn_com, mpn_neg,
+    mpn_copyi, mpn_copyd, mpn_zero.
+
+  * Support for fat binaries for 64-bit x86 processors has been added.
+
+  * A new type, mp_bitcnt_t for bignum bit counts, has been introduced.
+
   * The cofactors of mpz_gcdext and mpn_gcdext are now more strictly
-    normalised, returning to how GMP 4.2 worked.
+    normalised, returning to how GMP 4.2 worked.  (Note that also release
+    4.3.2 has this change.)
 
-  Misc:
+  MISC
   * The mpn_bdivmod function has been removed.  We do not consider this an
     incompatible change, since the function was marked as preliminary.
 
+  * The teststuite has been enhanced in various ways.
+
+
 Changes between GMP version 4.3.1 and 4.3.2
 
   Bugs:
@@ -485,11 +508,3 @@
 * mpz division functions round differently.
 * mpz mod functions now really compute mod.
 * mpz_powm and mpz_powm_ui now really use mod for reduction.
-
-
-
-----------------
-Local variables:
-mode: text
-fill-column: 76
-End:
diff -r 8b96e43ff299 -r ffa277747587 gmp-impl.h
--- a/gmp-impl.h	Tue Dec 29 14:41:44 2009 +0100
+++ b/gmp-impl.h	Wed Dec 30 00:06:40 2009 +0100
@@ -1798,6 +1798,10 @@
 #define MU_DIV_QR_THRESHOLD            2000
 #endif
 
+#ifndef MUPI_DIV_QR_THRESHOLD
+#define MUPI_DIV_QR_THRESHOLD           200
+#endif
+
 #ifndef MU_BDIV_Q_THRESHOLD
 #define MU_BDIV_Q_THRESHOLD            2000
 #endif
diff -r 8b96e43ff299 -r ffa277747587 mpn/generic/mu_div_qr.c
--- a/mpn/generic/mu_div_qr.c	Tue Dec 29 14:41:44 2009 +0100
+++ b/mpn/generic/mu_div_qr.c	Wed Dec 30 00:06:40 2009 +0100
@@ -84,7 +84,7 @@
    really make sense?  It seem like the quotient between dn and qn might be
    what we really should be checking.  */
 #ifndef MU_DIV_QR_SKEW_THRESHOLD
-#define MU_DIV_QR_SKEW_THRESHOLD 100	/* FIXME: somewhat arbitrary */
+#define MU_DIV_QR_SKEW_THRESHOLD 100
 #endif
 
 #ifdef CHECK				/* FIXME: Enable in minithres */
@@ -170,15 +170,6 @@
   mp_limb_t cy, qh;
   mp_ptr ip, tp;
 
-  /* FIXME: We should probably not handle tiny operands, but do it for now.  */
-  if (dn == 1)
-    {
-      ASSERT_ALWAYS (dn > 1);
-      rp[0] = mpn_divrem_1 (scratch, 0L, np, nn, dp[0]);
-      MPN_COPY (qp, scratch, nn - 1);
-      return scratch[nn - 1];
-    }
-
   qn = nn - dn;
 
   /* Compute the inverse size.  */
@@ -295,7 +286,7 @@
 	 by the upper part of the partial remainder R.  */
       mpn_mul_n (tp, rp + dn - in, ip, in);		/* mulhi  */
       cy = mpn_add_n (qp, tp + in, rp + dn - in, in);	/* I's msb implicit */
-      ASSERT_ALWAYS (cy == 0);			/* FIXME */
+      ASSERT_ALWAYS (cy == 0);
 
       /* Compute the product of the quotient block and the divisor D, to be
 	 subtracted from the partial remainder combined with new limbs from the
diff -r 8b96e43ff299 -r ffa277747587 mpn/generic/tdiv_qr.c
--- a/mpn/generic/tdiv_qr.c	Tue Dec 29 14:41:44 2009 +0100
+++ b/mpn/generic/tdiv_qr.c	Wed Dec 30 00:06:40 2009 +0100
@@ -12,7 +12,7 @@
    The time complexity of this is O(qn*qn+M(dn,qn)), where M(m,n) is the time
    complexity of multiplication.
 
-Copyright 1997, 2000, 2001, 2002, 2005 Free Software Foundation, Inc.
+Copyright 1997, 2000, 2001, 2002, 2005, 2009 Free Software Foundation, Inc.
 
 This file is part of the GNU MP Library.
 
@@ -38,13 +38,8 @@
 mpn_tdiv_qr (mp_ptr qp, mp_ptr rp, mp_size_t qxn,
 	     mp_srcptr np, mp_size_t nn, mp_srcptr dp, mp_size_t dn)
 {
-  /* FIXME:
-     1. qxn
-     2. pass allocated storage in additional parameter?
-  */
   ASSERT_ALWAYS (qxn == 0);
 
-  ASSERT (qxn >= 0);
   ASSERT (nn >= 0);
   ASSERT (dn >= 0);
   ASSERT (dn == 0 || dp[dn - 1] != 0);
@@ -137,10 +132,18 @@
 	      }
 
 	    invert_pi1 (dinv, d2p[dn - 1], d2p[dn - 2]);
-	    if (dn < DC_DIV_QR_THRESHOLD)
+	    if (BELOW_THRESHOLD (dn, DC_DIV_QR_THRESHOLD))
 	      mpn_sbpi1_div_qr (qp, n2p, nn, d2p, dn, dinv.inv32);
+	    else if (BELOW_THRESHOLD (dn, MU_DIV_QR_THRESHOLD))
+	      mpn_dcpi1_div_qr (qp, n2p, nn, d2p, dn, &dinv);
 	    else
-	      mpn_dcpi1_div_qr (qp, n2p, nn, d2p, dn, &dinv);
+	      {
+		mp_size_t itch = mpn_mu_div_qr_itch (nn, dn, 0);
+		mp_ptr scratch = TMP_ALLOC_LIMBS (itch);
+		rp = n2p + nn - dn;
+		mpn_mu_div_qr (qp, rp, n2p, nn, d2p, dn, scratch);
+		MPN_COPY (n2p, rp, dn);
+	      }
 
 	    if (cnt != 0)
 	      mpn_rshift (rp, n2p, dn, cnt);
@@ -258,10 +261,18 @@
 	      {
 		gmp_pi1_t dinv;
 		invert_pi1 (dinv, d2p[qn - 1], d2p[qn - 2]);
-		if (qn < DC_DIV_QR_THRESHOLD)
+		if (BELOW_THRESHOLD (qn, DC_DIV_QR_THRESHOLD))
 		  mpn_sbpi1_div_qr (qp, n2p, 2 * qn, d2p, qn, dinv.inv32);
+		else if (BELOW_THRESHOLD (qn, MU_DIV_QR_THRESHOLD))
+		  mpn_dcpi1_div_qr (qp, n2p, 2 * qn, d2p, qn, &dinv);
 		else
-		  mpn_dcpi1_div_qr (qp, n2p, 2 * qn, d2p, qn, &dinv);
+		  {
+		    mp_size_t itch = mpn_mu_div_qr_itch (2 * qn, qn, 0);
+		    mp_ptr scratch = TMP_ALLOC_LIMBS (itch);
+		    rp = n2p + 2 * qn - qn;
+		    mpn_mu_div_qr (qp, rp, n2p, 2 * qn, d2p, qn, scratch);
+		    MPN_COPY (n2p, rp, qn);
+		  }
 	      }
 
 	    rn = qn;
diff -r 8b96e43ff299 -r ffa277747587 mpz/mul.c
--- a/mpz/mul.c	Tue Dec 29 14:41:44 2009 +0100
+++ b/mpz/mul.c	Wed Dec 30 00:06:40 2009 +0100
@@ -1,7 +1,7 @@
 /* mpz_mul -- Multiply two integers.
 
-Copyright 1991, 1993, 1994, 1996, 2000, 2001, 2005 Free Software Foundation,
-Inc.
+Copyright 1991, 1993, 1994, 1996, 2000, 2001, 2005, 2009 Free Software
+Foundation, Inc.
 
 This file is part of the GNU MP Library.
 
@@ -33,8 +33,8 @@
 mult (mpz_srcptr u, mpz_srcptr v, mpz_ptr w)
 #endif /* BERKELEY_MP */
 {
-  mp_size_t usize = u->_mp_size;
-  mp_size_t vsize = v->_mp_size;
+  mp_size_t usize = SIZ(u);
+  mp_size_t vsize = SIZ(v);
   mp_size_t wsize;
   mp_size_t sign_product;
   mp_ptr up, vp;
@@ -92,25 +92,25 @@
 
   TMP_MARK;
   free_me = NULL;
-  up = u->_mp_d;
-  vp = v->_mp_d;
-  wp = w->_mp_d;