[Gmp-commit] /home/hgfiles/gmp: 6 new changesets

Thu Dec 3 16:43:23 CET 2009

details:   /home/hgfiles/gmp/rev/fcf35df93482
changeset: 12965:fcf35df93482
user:      Niels M?ller <nisse at lysator.liu.se>
date:      Wed Dec 02 15:40:34 2009 +0100
description:
Signed cofactors from gcdext_1. Updated gcdext_lehmer.

details:   /home/hgfiles/gmp/rev/b23660384f7c
changeset: 12966:b23660384f7c
user:      Niels M?ller <nisse at lysator.liu.se>
date:      Wed Dec 02 16:53:33 2009 +0100
description:
Require gcdext to return canonical cofactors.

details:   /home/hgfiles/gmp/rev/bdbe1b36c5d9
changeset: 12967:bdbe1b36c5d9
user:      Niels M?ller <nisse at lysator.liu.se>
date:      Thu Dec 03 12:10:38 2009 +0100
description:
Better cofactor selection when we find that A == B.

details:   /home/hgfiles/gmp/rev/12704b10a1b8
changeset: 12968:12704b10a1b8
user:      Niels M?ller <nisse at lysator.liu.se>
date:      Thu Dec 03 12:33:03 2009 +0100
description:
Better cofactor selection in outer gcdext loop.

details:   /home/hgfiles/gmp/rev/97fe8f1135eb
changeset: 12969:97fe8f1135eb
user:      Niels M?ller <nisse at lysator.liu.se>
date:      Thu Dec 03 12:43:13 2009 +0100
description:
Comment and (c) year updates.

details:   /home/hgfiles/gmp/rev/eb98bbab31f4
changeset: 12970:eb98bbab31f4
user:      Niels M?ller <nisse at lysator.liu.se>
date:      Thu Dec 03 16:43:17 2009 +0100
description:
Merged code for better canonicalization of gcdext cofactors.

diffstat:

 ChangeLog                        |   46 ++++++-
 doc/gmp.texi                     |   35 +++-
 gmp-h.in                         |    2 +-
 gmp-impl.h                       |    8 +
 mpn/generic/gcdext.c             |   36 ++--
 mpn/generic/gcdext_1.c           |  276 ++-------------------------------------
 mpn/generic/gcdext_lehmer.c      |   72 +++++++--
 mpn/generic/gcdext_subdiv_step.c |   29 +++-
 tests/devel/try.c                |   21 +++
 tests/mpz/t-gcd.c                |   42 ++++-
 tests/refmpn.c                   |   20 ++-
 tests/tests.h                    |    2 +
 tune/Makefile.am                 |    2 +-
 tune/common.c                    |   22 +++
 tune/speed.h                     |   49 ++++++-
 tune/tuneup.c                    |   36 +++-
 16 files changed, 360 insertions(+), 338 deletions(-)

diffs (truncated from 1082 to 300 lines):

diff -r ee9ad3199449 -r eb98bbab31f4 ChangeLog

--- a/ChangeLog	Wed Dec 02 13:21:01 2009 +0100
+++ b/ChangeLog	Thu Dec 03 16:43:17 2009 +0100
@@ -1,14 +1,54 @@
+2009-12-03  Torbjorn Granlund  <tege at gmplib.org>
+
+	* tune/tuneup.c: Tune DC_DIVAPPR_Q_THRESHOLD.  Rewrite
+	DC_DIV_QR_THRESHOLD tuning code.
+	(tune_dc_div): Rewrite.
+	* tune/speed.h (SPEED_ROUTINE_MPN_PI1_DIV): New macro.
+	* tune/common.c (speed_mpn_sbpi1_div_qr, speed_mpn_dcpi1_div_qr,
+	speed_mpn_sbpi1_divappr_q, speed_mpn_sbpi1_bdiv_qr): New functions.
+	* gmp-impl.h: Provide declarations for corresponding threshold vars.
+	* tune/Makefile.am (TUNE_MPN_SRCS_BASIC): Add dcpi1_divappr_q.c.
+
+	* tune/tuneup.c (tune_binvert): Up max_size.
+
+2009-12-02  Marco Bodrato <bodrato at mail.dm.unipi.it>
+
+	* tests/devel/try.c: Test mpn_rsblsh2_n and mpn_rsblsh_n.
+        * tests/refmpn.c (refmpn_rsblsh_n, refmpn_rsblsh2_n): New functions.
+	(refmpn_rsblsh1_n): Use generic refmpn_rsblsh_n.
+        * tests/tests.h: Declare new functions.
+
+2009-12-03  Niels Möller  <nisse at lysator.liu.se>
+
+	* mpn/generic/gcdext_subdiv_step.c (mpn_gcdext_subdiv_step):
+	Select the right cofactor in the cases A == B or A == 2B.
+
+	* mpn/generic/gcdext_lehmer.c (mpn_gcdext_lehmer_n): Deleted
+	handling of ap[0] == 0 and bp[0] == 0; these cases don't happen.
+	Select the right cofactor in the case ap[0] == bp[0].
+	* mpn/generic/gcdext.c (mpn_gcdext): Analogous changes.
+
+2009-12-02  Niels Möller  <nisse at lysator.liu.se>
+
+	* gmp-h.in (mpn_gcdext_1): Updated prototype.
+	* mpn/generic/gcdext_lehmer.c (mpn_gcdext_lehmer_n): Updated for
+	signed cofactors from gcdext_1.
+	* mpn/generic/gcdext_1.c (mpn_gcdext_1): Use Euclid's algorithm,
+	and return signed cofactors.
+
 2009-12-02  Torbjorn Granlund  <tege at gmplib.org>
 
+	* doc/gmp.texi (Low-level Functions): Document mpn_sqr_n.
+
 	* tune/speed.c (routine): Add mpn_binvert.
 
 	* tune/tuneup.c: Tune BINV_NEWTON_THRESHOLD.
 	(tune_binvert): New function.
-	* tune/speed.h (SPEED_ROUTINE_MPN_BINVERT): New macros.
+	* tune/speed.h (SPEED_ROUTINE_MPN_BINVERT): New macro.
 	* tune/common.c (speed_mpn_binvert): New function.
 	* gmp-impl.h: Provide declarations for corresponding threshold vars.
 	* tune/Makefile.am (TUNE_MPN_SRCS_BASIC): Add binvert.c.
-	
+
 	* tune/tuneup.c: Tune DC_BDIV_QR_THRESHOLD and DC_BDIV_Q_THRESHOLD.
 	(tune_dc_bdiv): New function.
 	(tune_dc_div): New name for tune_dc.
@@ -19,7 +59,7 @@
 	* gmp-impl.h: Provide declarations for corresponding threshold vars.
 	* tune/Makefile.am (TUNE_MPN_SRCS_BASIC): Add dcpi1_bdiv_qr.c and
 	dcpi1_bdiv_q.c.
-	
+
 2009-12-01  Marco Bodrato <bodrato at mail.dm.unipi.it>
 
 	* mpn/generic/toom53_mul.c: Removed double computation of vinf.
diff -r ee9ad3199449 -r eb98bbab31f4 doc/gmp.texi
--- a/doc/gmp.texi	Wed Dec 02 13:21:01 2009 +0100
+++ b/doc/gmp.texi	Thu Dec 03 16:43:17 2009 +0100
@@ -5156,6 +5156,30 @@
 The destination has to have space for 2*@var{n} limbs, even if the product's
 most significant limb is zero.  No overlap is permitted between the
 destination and either source.
+
+If the two input operands are the same, use @code{mpn_sqr_n}.
+ at end deftypefun
+
+ at deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
+Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
+(@var{s1n}+ at var{s2n})-limb result to @var{rp}.  Return the most significant
+limb of the result.
+
+The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
+product's most significant limb is zero.  No overlap is permitted between the
+destination and either source.
+
+This function requires that @var{s1n} is greater than or equal to
+ at var{s2n}.
+ at end deftypefun
+
+ at deftypefun void mpn_sqr_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
+Compute the square of @{@var{s1p}, @var{n}@} and write the 2*@var{n}-limb
+result to @var{rp}.
+
+The destination has to have space for 2*@var{n} limbs, even if the result's
+most significant limb is zero.  No overlap is permitted between the
+destination and the source.
 @end deftypefun
 
 @deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
@@ -5194,17 +5218,6 @@
 in assembly for most CPUs.
 @end deftypefun
 
- at deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
-Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
-result to @var{rp}.  Return the most significant limb of the result.
-
-The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
-result might be one limb smaller.
-
-This function requires that @var{s1n} is greater than or equal to
- at var{s2n}.  The destination must be distinct from both input operands.
- at end deftypefun
-
 @deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
 Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
 at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
diff -r ee9ad3199449 -r eb98bbab31f4 gmp-h.in
--- a/gmp-h.in	Wed Dec 02 13:21:01 2009 +0100
+++ b/gmp-h.in	Thu Dec 03 16:43:17 2009 +0100
@@ -1542,7 +1542,7 @@
 __GMP_DECLSPEC mp_limb_t mpn_gcd_1 __GMP_PROTO ((mp_srcptr, mp_size_t, mp_limb_t)) __GMP_ATTRIBUTE_PURE;
 
 #define mpn_gcdext_1 __MPN(gcdext_1)
-__GMP_DECLSPEC mp_limb_t mpn_gcdext_1 __GMP_PROTO ((mp_ptr, mp_ptr, mp_limb_t, mp_limb_t));
+__GMP_DECLSPEC mp_limb_t mpn_gcdext_1 __GMP_PROTO ((mp_limb_signed_t *, mp_limb_signed_t *, mp_limb_t, mp_limb_t));
 
 #define mpn_gcdext __MPN(gcdext)
 __GMP_DECLSPEC mp_size_t mpn_gcdext __GMP_PROTO ((mp_ptr, mp_ptr, mp_size_t *, mp_ptr, mp_size_t, mp_ptr, mp_size_t));
diff -r ee9ad3199449 -r eb98bbab31f4 gmp-impl.h
--- a/gmp-impl.h	Wed Dec 02 13:21:01 2009 +0100
+++ b/gmp-impl.h	Thu Dec 03 16:43:17 2009 +0100
@@ -4146,6 +4146,14 @@
 #define DC_DIV_QR_THRESHOLD          dc_div_qr_threshold
 extern mp_size_t                     dc_div_qr_threshold;
 
+#undef  DC_DIVAPPR_Q_THRESHOLD
+#define DC_DIVAPPR_Q_THRESHOLD       dc_divappr_q_threshold
+extern mp_size_t                     dc_divappr_q_threshold;
+
+#undef  DC_DIV_Q_THRESHOLD
+#define DC_DIV_Q_THRESHOLD           dc_div_q_threshold
+extern mp_size_t                     dc_div_q_threshold;
+
 #undef  DC_BDIV_Q_THRESHOLD
 #define DC_BDIV_Q_THRESHOLD          dc_bdiv_q_threshold
 extern mp_size_t                     dc_bdiv_q_threshold;
diff -r ee9ad3199449 -r eb98bbab31f4 mpn/generic/gcdext.c
--- a/mpn/generic/gcdext.c	Wed Dec 02 13:21:01 2009 +0100
+++ b/mpn/generic/gcdext.c	Thu Dec 03 16:43:17 2009 +0100
@@ -1,6 +1,6 @@
 /* mpn_gcdext -- Extended Greatest Common Divisor.
 
-Copyright 1996, 1998, 2000, 2001, 2002, 2003, 2004, 2005, 2008 Free Software
+Copyright 1996, 1998, 2000, 2001, 2002, 2003, 2004, 2005, 2008, 2009 Free Software
 Foundation, Inc.
 
 This file is part of the GNU MP Library.
@@ -388,25 +388,29 @@
 	}
     }
 
-  if (mpn_zero_p (ap, n))
+  if (UNLIKELY (mpn_cmp (ap, bp, n) == 0))
     {
-      MPN_COPY (gp, bp, n);
-      MPN_NORMALIZE_NOT_ZERO (u0, un);
-      MPN_COPY (up, u0, un);
-      *usizep = -un;
+      /* Must return the smallest cofactor, +u1 or -u0 */
+      int c;
 
-      TMP_FREE;
-      return n;
-    }
-  else if (mpn_zero_p (bp, n))
-    {
       MPN_COPY (gp, ap, n);
-      MPN_NORMALIZE_NOT_ZERO (u1, un);
-      MPN_COPY (up, u1, un);
-      *usizep = un;
 
-      TMP_FREE;
-      return n;
+      MPN_CMP (c, u0, u1, un);
+      ASSERT (c != 0);
+      if (c < 0)
+	{
+	  MPN_NORMALIZE (u0, un);
+	  MPN_COPY (up, u0, un);
+	  *usizep = -un;
+	}
+      else
+	{
+	  MPN_NORMALIZE_NOT_ZERO (u1, un);
+	  MPN_COPY (up, u1, un);
+	  *usizep = un;
+	}
+
+      return n;      
     }
   else if (mpn_zero_p (u0, un))
     {
diff -r ee9ad3199449 -r eb98bbab31f4 mpn/generic/gcdext_1.c
--- a/mpn/generic/gcdext_1.c	Wed Dec 02 13:21:01 2009 +0100
+++ b/mpn/generic/gcdext_1.c	Thu Dec 03 16:43:17 2009 +0100
@@ -18,230 +18,34 @@
 You should have received a copy of the GNU Lesser General Public License
 along with the GNU MP Library.  If not, see http://www.gnu.org/licenses/.  */
 
-/* Default to binary gcdext_1, since it is best on most current machines.
-   We should teach tuneup to choose the right gcdext_1.  */
-#define GCDEXT_1_USE_BINARY 1
-
 #include "gmp.h"
 #include "gmp-impl.h"
 #include "longlong.h"
 
-#ifndef NULL
-# define NULL ((void *) 0)
-#endif
-
 /* FIXME: Takes two single-word limbs. It could be extended to a
  * function that accepts a bignum for the first input, and only
  * returns the first co-factor. */
 
-/* Returns g, u and v such that g = u A - v B. There are three
-   different cases for the result:
-
-     g = u A - v B, 0 < u < b, 0 < v < a
-     g = A          u = 1, v = 0
-     g = B          u = B, v = A - 1
-
-   We always return with 0 < u <= b, 0 <= v < a.
-*/
-#if GCDEXT_1_USE_BINARY
-
-static mp_limb_t
-gcdext_1_odd (mp_limb_t *up, mp_limb_t *vp, mp_limb_t a, mp_limb_t b)
-{
-  mp_limb_t u0;
-  mp_limb_t v0;
-  mp_limb_t v1;
-  mp_limb_t u1;
-
-  mp_limb_t B = b;
-  mp_limb_t A = a;
-
-  /* Through out this function maintain
-
-     a = u0 A - v0 B
-     b = u1 A - v1 B
-
-     where A and B are odd. */
-
-  u0 = 1; v0 = 0;
-  u1 = b; v1 = a-1;
-
-  if (A == 1)
-    {
-      *up = u0; *vp = v0;
-      return 1;
-    }
-  else if (B == 1)
-    {
-      *up = u1; *vp = v1;
-      return 1;
-    }
-
-  while (a != b)
-    {
-      mp_limb_t mask;
-
-      ASSERT (a % 2 == 1);
-      ASSERT (b % 2 == 1);
-
-      ASSERT (0 < u0); ASSERT (u0 <= B);
-      ASSERT (0 < u1); ASSERT (u1 <= B);
-
-      ASSERT (0 <= v0); ASSERT (v0 < A);
-      ASSERT (0 <= v1); ASSERT (v1 < A);
-
-      if (a > b)
-	{
-	  MP_LIMB_T_SWAP (a, b);
-	  MP_LIMB_T_SWAP (u0, u1);
-	  MP_LIMB_T_SWAP (v0, v1);
-	}
-
-      ASSERT (a < b);
-
-      /* Makes b even */
-      b -= a;
-
-      mask = - (mp_limb_t) (u1 < u0);
-      u1 += B & mask;