[Gmp-commit] /home/hgfiles/gmp: 6 new changesets
mercurial at gmplib.org
mercurial at gmplib.org
Thu Dec 3 16:43:23 CET 2009
details: /home/hgfiles/gmp/rev/fcf35df93482
changeset: 12965:fcf35df93482
user: Niels M?ller <nisse at lysator.liu.se>
date: Wed Dec 02 15:40:34 2009 +0100
description:
Signed cofactors from gcdext_1. Updated gcdext_lehmer.
details: /home/hgfiles/gmp/rev/b23660384f7c
changeset: 12966:b23660384f7c
user: Niels M?ller <nisse at lysator.liu.se>
date: Wed Dec 02 16:53:33 2009 +0100
description:
Require gcdext to return canonical cofactors.
details: /home/hgfiles/gmp/rev/bdbe1b36c5d9
changeset: 12967:bdbe1b36c5d9
user: Niels M?ller <nisse at lysator.liu.se>
date: Thu Dec 03 12:10:38 2009 +0100
description:
Better cofactor selection when we find that A == B.
details: /home/hgfiles/gmp/rev/12704b10a1b8
changeset: 12968:12704b10a1b8
user: Niels M?ller <nisse at lysator.liu.se>
date: Thu Dec 03 12:33:03 2009 +0100
description:
Better cofactor selection in outer gcdext loop.
details: /home/hgfiles/gmp/rev/97fe8f1135eb
changeset: 12969:97fe8f1135eb
user: Niels M?ller <nisse at lysator.liu.se>
date: Thu Dec 03 12:43:13 2009 +0100
description:
Comment and (c) year updates.
details: /home/hgfiles/gmp/rev/eb98bbab31f4
changeset: 12970:eb98bbab31f4
user: Niels M?ller <nisse at lysator.liu.se>
date: Thu Dec 03 16:43:17 2009 +0100
description:
Merged code for better canonicalization of gcdext cofactors.
diffstat:
ChangeLog | 46 ++++++-
doc/gmp.texi | 35 +++-
gmp-h.in | 2 +-
gmp-impl.h | 8 +
mpn/generic/gcdext.c | 36 ++--
mpn/generic/gcdext_1.c | 276 ++-------------------------------------
mpn/generic/gcdext_lehmer.c | 72 +++++++--
mpn/generic/gcdext_subdiv_step.c | 29 +++-
tests/devel/try.c | 21 +++
tests/mpz/t-gcd.c | 42 ++++-
tests/refmpn.c | 20 ++-
tests/tests.h | 2 +
tune/Makefile.am | 2 +-
tune/common.c | 22 +++
tune/speed.h | 49 ++++++-
tune/tuneup.c | 36 +++-
16 files changed, 360 insertions(+), 338 deletions(-)
diffs (truncated from 1082 to 300 lines):
diff -r ee9ad3199449 -r eb98bbab31f4 ChangeLog
--- a/ChangeLog Wed Dec 02 13:21:01 2009 +0100
+++ b/ChangeLog Thu Dec 03 16:43:17 2009 +0100
@@ -1,14 +1,54 @@
+2009-12-03 Torbjorn Granlund <tege at gmplib.org>
+
+ * tune/tuneup.c: Tune DC_DIVAPPR_Q_THRESHOLD. Rewrite
+ DC_DIV_QR_THRESHOLD tuning code.
+ (tune_dc_div): Rewrite.
+ * tune/speed.h (SPEED_ROUTINE_MPN_PI1_DIV): New macro.
+ * tune/common.c (speed_mpn_sbpi1_div_qr, speed_mpn_dcpi1_div_qr,
+ speed_mpn_sbpi1_divappr_q, speed_mpn_sbpi1_bdiv_qr): New functions.
+ * gmp-impl.h: Provide declarations for corresponding threshold vars.
+ * tune/Makefile.am (TUNE_MPN_SRCS_BASIC): Add dcpi1_divappr_q.c.
+
+ * tune/tuneup.c (tune_binvert): Up max_size.
+
+2009-12-02 Marco Bodrato <bodrato at mail.dm.unipi.it>
+
+ * tests/devel/try.c: Test mpn_rsblsh2_n and mpn_rsblsh_n.
+ * tests/refmpn.c (refmpn_rsblsh_n, refmpn_rsblsh2_n): New functions.
+ (refmpn_rsblsh1_n): Use generic refmpn_rsblsh_n.
+ * tests/tests.h: Declare new functions.
+
+2009-12-03 Niels Möller <nisse at lysator.liu.se>
+
+ * mpn/generic/gcdext_subdiv_step.c (mpn_gcdext_subdiv_step):
+ Select the right cofactor in the cases A == B or A == 2B.
+
+ * mpn/generic/gcdext_lehmer.c (mpn_gcdext_lehmer_n): Deleted
+ handling of ap[0] == 0 and bp[0] == 0; these cases don't happen.
+ Select the right cofactor in the case ap[0] == bp[0].
+ * mpn/generic/gcdext.c (mpn_gcdext): Analogous changes.
+
+2009-12-02 Niels Möller <nisse at lysator.liu.se>
+
+ * gmp-h.in (mpn_gcdext_1): Updated prototype.
+ * mpn/generic/gcdext_lehmer.c (mpn_gcdext_lehmer_n): Updated for
+ signed cofactors from gcdext_1.
+ * mpn/generic/gcdext_1.c (mpn_gcdext_1): Use Euclid's algorithm,
+ and return signed cofactors.
+
2009-12-02 Torbjorn Granlund <tege at gmplib.org>
+ * doc/gmp.texi (Low-level Functions): Document mpn_sqr_n.
+
* tune/speed.c (routine): Add mpn_binvert.
* tune/tuneup.c: Tune BINV_NEWTON_THRESHOLD.
(tune_binvert): New function.
- * tune/speed.h (SPEED_ROUTINE_MPN_BINVERT): New macros.
+ * tune/speed.h (SPEED_ROUTINE_MPN_BINVERT): New macro.
* tune/common.c (speed_mpn_binvert): New function.
* gmp-impl.h: Provide declarations for corresponding threshold vars.
* tune/Makefile.am (TUNE_MPN_SRCS_BASIC): Add binvert.c.
-
+
* tune/tuneup.c: Tune DC_BDIV_QR_THRESHOLD and DC_BDIV_Q_THRESHOLD.
(tune_dc_bdiv): New function.
(tune_dc_div): New name for tune_dc.
@@ -19,7 +59,7 @@
* gmp-impl.h: Provide declarations for corresponding threshold vars.
* tune/Makefile.am (TUNE_MPN_SRCS_BASIC): Add dcpi1_bdiv_qr.c and
dcpi1_bdiv_q.c.
-
+
2009-12-01 Marco Bodrato <bodrato at mail.dm.unipi.it>
* mpn/generic/toom53_mul.c: Removed double computation of vinf.
diff -r ee9ad3199449 -r eb98bbab31f4 doc/gmp.texi
--- a/doc/gmp.texi Wed Dec 02 13:21:01 2009 +0100
+++ b/doc/gmp.texi Thu Dec 03 16:43:17 2009 +0100
@@ -5156,6 +5156,30 @@
The destination has to have space for 2*@var{n} limbs, even if the product's
most significant limb is zero. No overlap is permitted between the
destination and either source.
+
+If the two input operands are the same, use @code{mpn_sqr_n}.
+ at end deftypefun
+
+ at deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
+Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
+(@var{s1n}+ at var{s2n})-limb result to @var{rp}. Return the most significant
+limb of the result.
+
+The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
+product's most significant limb is zero. No overlap is permitted between the
+destination and either source.
+
+This function requires that @var{s1n} is greater than or equal to
+ at var{s2n}.
+ at end deftypefun
+
+ at deftypefun void mpn_sqr_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n})
+Compute the square of @{@var{s1p}, @var{n}@} and write the 2*@var{n}-limb
+result to @var{rp}.
+
+The destination has to have space for 2*@var{n} limbs, even if the result's
+most significant limb is zero. No overlap is permitted between the
+destination and the source.
@end deftypefun
@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb})
@@ -5194,17 +5218,6 @@
in assembly for most CPUs.
@end deftypefun
- at deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n})
-Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the
-result to @var{rp}. Return the most significant limb of the result.
-
-The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the
-result might be one limb smaller.
-
-This function requires that @var{s1n} is greater than or equal to
- at var{s2n}. The destination must be distinct from both input operands.
- at end deftypefun
-
@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn})
Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient
at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp},
diff -r ee9ad3199449 -r eb98bbab31f4 gmp-h.in
--- a/gmp-h.in Wed Dec 02 13:21:01 2009 +0100
+++ b/gmp-h.in Thu Dec 03 16:43:17 2009 +0100
@@ -1542,7 +1542,7 @@
__GMP_DECLSPEC mp_limb_t mpn_gcd_1 __GMP_PROTO ((mp_srcptr, mp_size_t, mp_limb_t)) __GMP_ATTRIBUTE_PURE;
#define mpn_gcdext_1 __MPN(gcdext_1)
-__GMP_DECLSPEC mp_limb_t mpn_gcdext_1 __GMP_PROTO ((mp_ptr, mp_ptr, mp_limb_t, mp_limb_t));
+__GMP_DECLSPEC mp_limb_t mpn_gcdext_1 __GMP_PROTO ((mp_limb_signed_t *, mp_limb_signed_t *, mp_limb_t, mp_limb_t));
#define mpn_gcdext __MPN(gcdext)
__GMP_DECLSPEC mp_size_t mpn_gcdext __GMP_PROTO ((mp_ptr, mp_ptr, mp_size_t *, mp_ptr, mp_size_t, mp_ptr, mp_size_t));
diff -r ee9ad3199449 -r eb98bbab31f4 gmp-impl.h
--- a/gmp-impl.h Wed Dec 02 13:21:01 2009 +0100
+++ b/gmp-impl.h Thu Dec 03 16:43:17 2009 +0100
@@ -4146,6 +4146,14 @@
#define DC_DIV_QR_THRESHOLD dc_div_qr_threshold
extern mp_size_t dc_div_qr_threshold;
+#undef DC_DIVAPPR_Q_THRESHOLD
+#define DC_DIVAPPR_Q_THRESHOLD dc_divappr_q_threshold
+extern mp_size_t dc_divappr_q_threshold;
+
+#undef DC_DIV_Q_THRESHOLD
+#define DC_DIV_Q_THRESHOLD dc_div_q_threshold
+extern mp_size_t dc_div_q_threshold;
+
#undef DC_BDIV_Q_THRESHOLD
#define DC_BDIV_Q_THRESHOLD dc_bdiv_q_threshold
extern mp_size_t dc_bdiv_q_threshold;
diff -r ee9ad3199449 -r eb98bbab31f4 mpn/generic/gcdext.c
--- a/mpn/generic/gcdext.c Wed Dec 02 13:21:01 2009 +0100
+++ b/mpn/generic/gcdext.c Thu Dec 03 16:43:17 2009 +0100
@@ -1,6 +1,6 @@
/* mpn_gcdext -- Extended Greatest Common Divisor.
-Copyright 1996, 1998, 2000, 2001, 2002, 2003, 2004, 2005, 2008 Free Software
+Copyright 1996, 1998, 2000, 2001, 2002, 2003, 2004, 2005, 2008, 2009 Free Software
Foundation, Inc.
This file is part of the GNU MP Library.
@@ -388,25 +388,29 @@
}
}
- if (mpn_zero_p (ap, n))
+ if (UNLIKELY (mpn_cmp (ap, bp, n) == 0))
{
- MPN_COPY (gp, bp, n);
- MPN_NORMALIZE_NOT_ZERO (u0, un);
- MPN_COPY (up, u0, un);
- *usizep = -un;
+ /* Must return the smallest cofactor, +u1 or -u0 */
+ int c;
- TMP_FREE;
- return n;
- }
- else if (mpn_zero_p (bp, n))
- {
MPN_COPY (gp, ap, n);
- MPN_NORMALIZE_NOT_ZERO (u1, un);
- MPN_COPY (up, u1, un);
- *usizep = un;
- TMP_FREE;
- return n;
+ MPN_CMP (c, u0, u1, un);
+ ASSERT (c != 0);
+ if (c < 0)
+ {
+ MPN_NORMALIZE (u0, un);
+ MPN_COPY (up, u0, un);
+ *usizep = -un;
+ }
+ else
+ {
+ MPN_NORMALIZE_NOT_ZERO (u1, un);
+ MPN_COPY (up, u1, un);
+ *usizep = un;
+ }
+
+ return n;
}
else if (mpn_zero_p (u0, un))
{
diff -r ee9ad3199449 -r eb98bbab31f4 mpn/generic/gcdext_1.c
--- a/mpn/generic/gcdext_1.c Wed Dec 02 13:21:01 2009 +0100
+++ b/mpn/generic/gcdext_1.c Thu Dec 03 16:43:17 2009 +0100
@@ -18,230 +18,34 @@
You should have received a copy of the GNU Lesser General Public License
along with the GNU MP Library. If not, see http://www.gnu.org/licenses/. */
-/* Default to binary gcdext_1, since it is best on most current machines.
- We should teach tuneup to choose the right gcdext_1. */
-#define GCDEXT_1_USE_BINARY 1
-
#include "gmp.h"
#include "gmp-impl.h"
#include "longlong.h"
-#ifndef NULL
-# define NULL ((void *) 0)
-#endif
-
/* FIXME: Takes two single-word limbs. It could be extended to a
* function that accepts a bignum for the first input, and only
* returns the first co-factor. */
-/* Returns g, u and v such that g = u A - v B. There are three
- different cases for the result:
-
- g = u A - v B, 0 < u < b, 0 < v < a
- g = A u = 1, v = 0
- g = B u = B, v = A - 1
-
- We always return with 0 < u <= b, 0 <= v < a.
-*/
-#if GCDEXT_1_USE_BINARY
-
-static mp_limb_t
-gcdext_1_odd (mp_limb_t *up, mp_limb_t *vp, mp_limb_t a, mp_limb_t b)
-{
- mp_limb_t u0;
- mp_limb_t v0;
- mp_limb_t v1;
- mp_limb_t u1;
-
- mp_limb_t B = b;
- mp_limb_t A = a;
-
- /* Through out this function maintain
-
- a = u0 A - v0 B
- b = u1 A - v1 B
-
- where A and B are odd. */
-
- u0 = 1; v0 = 0;
- u1 = b; v1 = a-1;
-
- if (A == 1)
- {
- *up = u0; *vp = v0;
- return 1;
- }
- else if (B == 1)
- {
- *up = u1; *vp = v1;
- return 1;
- }
-
- while (a != b)
- {
- mp_limb_t mask;
-
- ASSERT (a % 2 == 1);
- ASSERT (b % 2 == 1);
-
- ASSERT (0 < u0); ASSERT (u0 <= B);
- ASSERT (0 < u1); ASSERT (u1 <= B);
-
- ASSERT (0 <= v0); ASSERT (v0 < A);
- ASSERT (0 <= v1); ASSERT (v1 < A);
-
- if (a > b)
- {
- MP_LIMB_T_SWAP (a, b);
- MP_LIMB_T_SWAP (u0, u1);
- MP_LIMB_T_SWAP (v0, v1);
- }
-
- ASSERT (a < b);
-
- /* Makes b even */
- b -= a;
-
- mask = - (mp_limb_t) (u1 < u0);
- u1 += B & mask;
More information about the gmp-commit
mailing list