[Gmp-commit] /home/hgfiles/gmp: 9 new changesets

Sun Dec 6 07:29:41 CET 2009

details:   /home/hgfiles/gmp/rev/a7746bd33fcb
changeset: 12985:a7746bd33fcb
user:      Marco Bodrato <bodrato at mail.dm.unipi.it>
date:      Fri Dec 04 14:39:53 2009 +0100
description:
Change return value of toom_eval_ functions to 0 or ~0.

details:   /home/hgfiles/gmp/rev/1d026ed9f9bf
changeset: 12986:1d026ed9f9bf
user:      Marco Bodrato <bodrato at mail.dm.unipi.it>
date:      Fri Dec 04 14:51:22 2009 +0100
description:
Sign of ev(-1) passed as a flag.

details:   /home/hgfiles/gmp/rev/96283f0f5cde
changeset: 12987:96283f0f5cde
user:      Marco Bodrato <bodrato at mail.dm.unipi.it>
date:      Fri Dec 04 16:24:18 2009 +0100
description:
Removed some branches in Toom evaluations.

details:   /home/hgfiles/gmp/rev/4a96c687d809
changeset: 12988:4a96c687d809
user:      Marco Bodrato <bodrato at mail.dm.unipi.it>
date:      Fri Dec 04 22:10:37 2009 +0100
description:
Use toom_eval_ in toom52.

details:   /home/hgfiles/gmp/rev/cab4f99e5787
changeset: 12989:cab4f99e5787
user:      Marco Bodrato <bodrato at mail.dm.unipi.it>
date:      Sat Dec 05 00:07:53 2009 +0100
description:
Don't require C99.

details:   /home/hgfiles/gmp/rev/e2f13ae38a67
changeset: 12990:e2f13ae38a67
user:      Marco Bodrato <bodrato at mail.dm.unipi.it>
date:      Sat Dec 05 08:32:00 2009 +0100
description:
Rewrote mpn_toom_eval_pm2 to use addlsh2, with logic suggested by Niels.

details:   /home/hgfiles/gmp/rev/2ae28590bc4f
changeset: 12991:2ae28590bc4f
user:      Marco Bodrato <bodrato at mail.dm.unipi.it>
date:      Sat Dec 05 10:59:20 2009 +0100
description:
Assert and comment updated

details:   /home/hgfiles/gmp/rev/7117e89d941e
changeset: 12992:7117e89d941e
user:      Marco Bodrato <bodrato at mail.dm.unipi.it>
date:      Sun Dec 06 07:15:24 2009 +0100
description:
Added comments.

details:   /home/hgfiles/gmp/rev/c183cbef46c6
changeset: 12993:c183cbef46c6
user:      Marco Bodrato <bodrato at mail.dm.unipi.it>
date:      Sun Dec 06 07:28:53 2009 +0100
description:
Merge

diffstat:

 ChangeLog                           |   57 +++++++++++++++++-
 configure.in                        |   25 ++++++--
 mpn/generic/mod_1_1.c               |    2 +-
 mpn/generic/powm.c                  |   15 ++---
 mpn/generic/powm_sec.c              |  124 ++++++++++++++++++++++++++++++++--------
 mpn/generic/redc_1.c                |    7 +-
 mpn/generic/redc_1_sec.c            |   45 +++++++++++++++
 mpn/generic/redc_2.c                |    1 +
 mpn/generic/sbpi1_bdiv_q.c          |   39 ++++++------
 mpn/generic/toom33_mul.c            |    4 +-
 mpn/generic/toom3_sqr.c             |    2 +-
 mpn/generic/toom42_mul.c            |    4 +-
 mpn/generic/toom43_mul.c            |    8 +-
 mpn/generic/toom44_mul.c            |   20 ++----
 mpn/generic/toom4_sqr.c             |    3 +-
 mpn/generic/toom52_mul.c            |   72 ++----------------------
 mpn/generic/toom53_mul.c            |   18 ++---
 mpn/generic/toom62_mul.c            |   15 ++---
 mpn/generic/toom_eval_dgr3_pm1.c    |    3 +-
 mpn/generic/toom_eval_dgr3_pm2.c    |    3 +-
 mpn/generic/toom_eval_pm1.c         |    2 +-
 mpn/generic/toom_eval_pm2.c         |  104 ++++++++++++++++++----------------
 mpn/generic/toom_eval_pm2exp.c      |    2 +-
 mpn/generic/toom_interpolate_5pts.c |   18 +++--
 mpn/generic/toom_interpolate_7pts.c |    3 +-
 tune/speed.h                        |    4 +
 tune/tuneup.c                       |    6 +-
 27 files changed, 365 insertions(+), 241 deletions(-)

diffs (truncated from 1212 to 300 lines):

diff -r a60282a2d7c3 -r c183cbef46c6 ChangeLog

--- a/ChangeLog	Thu Dec 03 22:50:45 2009 +0100
+++ b/ChangeLog	Sun Dec 06 07:28:53 2009 +0100
@@ -1,10 +1,57 @@
-2009-12-03  Torbjorn Granlund  <tege at gmplib.org>
-
-	* configure.in: Move intptr_t test into common AC_CHECK_TYPES.
+2009-12-05  Marco Bodrato <bodrato at mail.dm.unipi.it>
+
+	* mpn/generic/toom_eval_dgr3_pm1.c: Change return value: 0 or ~0.
+	* mpn/generic/toom_eval_dgr3_pm2.c: Likewise.
+	* mpn/generic/toom_eval_pm1.c: Likewise.
+	* mpn/generic/toom_eval_pm2exp.c: Likewise.
+	* mpn/generic/toom_eval_pm2.c: Rewrite to use mpn_addlsh2_n.
+
+	* mpn/generic/toom_interpolate_5pts.c: Param sa is a flag, not a sign.
+
+	* mpn/generic/toom33_mul.c: Adapt to changes above.
+	* mpn/generic/toom3_sqr.c: Likewise.
+	* mpn/generic/toom42_mul.c: Likewise.
+	* mpn/generic/toom43_mul.c: Reduce branches.
+	* mpn/generic/toom44_mul.c: Likewise.
+	* mpn/generic/toom53_mul.c: Likewise.
+	* mpn/generic/toom62_mul.c: Likewise.
+
+	* mpn/generic/toom52_mul.c: Use toom_eval_ functions.
+
+	* mpn/generic/toom4_sqr.c: Avoid C99 construct.
+	* mpn/generic/toom_interpolate_7pts.c: Likewise.
+
+2009-12-05  Torbjorn Granlund  <tege at gmplib.org>
+
+	* mpn/generic/redc_1_sec.c: New file.
+	* mpn/generic/powm_sec.c: Use redc_1_sec.  Use dummy full subtract
+	instead of mpn_cmp since the latter leaks to the side channel.
+	(mpn_local_sqr_n): New function, with associated macros.
+	(mpn_powm_sec): Use mpn_local_sqr_n.
+
+	* configure.in (HAVE_NATIVE): Add missing functions, then sort.
+
+2009-12-04  Torbjorn Granlund  <tege at gmplib.org>
+
+	* tune/tuneup.c (tune_dc_div): Up min_size to 6.
+	(tune_mod_1): Set MOD_1_1_THRESHOLD min_size to 2.
+
+	* tune/speed.h: Negate "binvert"-type inverses, as required.
+
+	* mpn/generic/redc_1.c: Add ASSERTs.
+	* mpn/generic/redc_2.c: Likewise.
+
+	* mpn/generic/sbpi1_bdiv_q.c: Simplify loops, indexing.
+
+2009-12-03  Yann Droneaud  <yann at droneaud.fr>
 
 	* acinclude.m4 ([long long reliability test 1]): Add a "static" for C99
 	inline semantics compatibility.
 
+2009-12-03  Torbjorn Granlund  <tege at gmplib.org>
+
+	* configure.in: Move intptr_t test into common AC_CHECK_TYPES.
+
 	* mpn/generic/gcdext.c: Add a TMP_FREE.
 
 2009-12-03  Niels Möller  <nisse at lysator.liu.se>
@@ -30,9 +77,9 @@
 2009-12-02  Marco Bodrato <bodrato at mail.dm.unipi.it>
 
 	* tests/devel/try.c: Test mpn_rsblsh2_n and mpn_rsblsh_n.
-        * tests/refmpn.c (refmpn_rsblsh_n, refmpn_rsblsh2_n): New functions.
+	* tests/refmpn.c (refmpn_rsblsh_n, refmpn_rsblsh2_n): New functions.
 	(refmpn_rsblsh1_n): Use generic refmpn_rsblsh_n.
-        * tests/tests.h: Declare new functions.
+	* tests/tests.h: Declare new functions.
 
 2009-12-03  Niels Möller  <nisse at lysator.liu.se>
 
diff -r a60282a2d7c3 -r c183cbef46c6 configure.in
--- a/configure.in	Thu Dec 03 22:50:45 2009 +0100
+++ b/configure.in	Sun Dec 06 07:28:53 2009 +0100
@@ -2949,10 +2949,12 @@
 [/* Define to 1 each of the following for which a native (ie. CPU specific)
     implementation of the corresponding routine exists.  */
 #undef HAVE_NATIVE_mpn_add_n
+#undef HAVE_NATIVE_mpn_add_n_sub_n
 #undef HAVE_NATIVE_mpn_add_nc
-#undef HAVE_NATIVE_mpn_addlsh_n
+#undef HAVE_NATIVE_mpn_addaddmul_1msb0
 #undef HAVE_NATIVE_mpn_addlsh1_n
 #undef HAVE_NATIVE_mpn_addlsh2_n
+#undef HAVE_NATIVE_mpn_addlsh_n
 #undef HAVE_NATIVE_mpn_addmul_1c
 #undef HAVE_NATIVE_mpn_addmul_2
 #undef HAVE_NATIVE_mpn_addmul_3
@@ -2961,8 +2963,6 @@
 #undef HAVE_NATIVE_mpn_addmul_6
 #undef HAVE_NATIVE_mpn_addmul_7
 #undef HAVE_NATIVE_mpn_addmul_8
-#undef HAVE_NATIVE_mpn_add_n_sub_n
-#undef HAVE_NATIVE_mpn_addaddmul_1msb0
 #undef HAVE_NATIVE_mpn_and_n
 #undef HAVE_NATIVE_mpn_andn_n
 #undef HAVE_NATIVE_mpn_bdiv_dbm1c
@@ -2977,41 +2977,52 @@
 #undef HAVE_NATIVE_mpn_divrem_1c
 #undef HAVE_NATIVE_mpn_divrem_2
 #undef HAVE_NATIVE_mpn_gcd_1
+#undef HAVE_NATIVE_mpn_hamdist
 #undef HAVE_NATIVE_mpn_invert_limb
 #undef HAVE_NATIVE_mpn_ior_n
 #undef HAVE_NATIVE_mpn_iorn_n
+#undef HAVE_NATIVE_mpn_lshift
 #undef HAVE_NATIVE_mpn_lshiftc
+#undef HAVE_NATIVE_mpn_lshsub_n
 #undef HAVE_NATIVE_mpn_mod_1
+#undef HAVE_NATIVE_mpn_mod_1_1p
 #undef HAVE_NATIVE_mpn_mod_1c
+#undef HAVE_NATIVE_mpn_mod_1s_2p
+#undef HAVE_NATIVE_mpn_mod_1s_4p
+#undef HAVE_NATIVE_mpn_mod_34lsub1
 #undef HAVE_NATIVE_mpn_modexact_1_odd
 #undef HAVE_NATIVE_mpn_modexact_1c_odd
+#undef HAVE_NATIVE_mpn_mul_1
 #undef HAVE_NATIVE_mpn_mul_1c
 #undef HAVE_NATIVE_mpn_mul_2
 #undef HAVE_NATIVE_mpn_mul_3
 #undef HAVE_NATIVE_mpn_mul_4
+#undef HAVE_NATIVE_mpn_mul_basecase
 #undef HAVE_NATIVE_mpn_nand_n
 #undef HAVE_NATIVE_mpn_nior_n
+#undef HAVE_NATIVE_mpn_popcount
 #undef HAVE_NATIVE_mpn_preinv_divrem_1
 #undef HAVE_NATIVE_mpn_preinv_mod_1
 #undef HAVE_NATIVE_mpn_redc_1
 #undef HAVE_NATIVE_mpn_redc_2
-#undef HAVE_NATIVE_mpn_rsblsh_n
 #undef HAVE_NATIVE_mpn_rsblsh1_n
 #undef HAVE_NATIVE_mpn_rsblsh2_n
+#undef HAVE_NATIVE_mpn_rsblsh_n
 #undef HAVE_NATIVE_mpn_rsh1add_n
 #undef HAVE_NATIVE_mpn_rsh1sub_n
+#undef HAVE_NATIVE_mpn_rshift
 #undef HAVE_NATIVE_mpn_sqr_basecase
 #undef HAVE_NATIVE_mpn_sqr_diagonal
 #undef HAVE_NATIVE_mpn_sub_n
 #undef HAVE_NATIVE_mpn_sub_nc
-#undef HAVE_NATIVE_mpn_sublsh_n
 #undef HAVE_NATIVE_mpn_sublsh1_n
 #undef HAVE_NATIVE_mpn_sublsh2_n
+#undef HAVE_NATIVE_mpn_sublsh_n
 #undef HAVE_NATIVE_mpn_submul_1c
+#undef HAVE_NATIVE_mpn_udiv_qrnnd
+#undef HAVE_NATIVE_mpn_udiv_qrnnd_r
 #undef HAVE_NATIVE_mpn_umul_ppmm
 #undef HAVE_NATIVE_mpn_umul_ppmm_r
-#undef HAVE_NATIVE_mpn_udiv_qrnnd
-#undef HAVE_NATIVE_mpn_udiv_qrnnd_r
 #undef HAVE_NATIVE_mpn_xor_n
 #undef HAVE_NATIVE_mpn_xnor_n])
 
diff -r a60282a2d7c3 -r c183cbef46c6 mpn/generic/mod_1_1.c
--- a/mpn/generic/mod_1_1.c	Thu Dec 03 22:50:45 2009 +0100
+++ b/mpn/generic/mod_1_1.c	Sun Dec 06 07:28:53 2009 +0100
@@ -63,7 +63,7 @@
   int cnt;
   mp_limb_t mask;
 
-  ASSERT (n >= 2);
+  ASSERT (n >= 2);		/* fix tuneup.c if this is changed */
 
   B1modb = bmodb[2];
   B2modb = bmodb[3];
diff -r a60282a2d7c3 -r c183cbef46c6 mpn/generic/powm.c
--- a/mpn/generic/powm.c	Thu Dec 03 22:50:45 2009 +0100
+++ b/mpn/generic/powm.c	Sun Dec 06 07:28:53 2009 +0100
@@ -29,14 +29,11 @@
 
   1. W <- U
 
-  2. While W^2 < M (and there are more bits in E)
-       W <- power left-to-right base-2 without reduction
+  2. T <- (B^n * U) mod M                Convert to REDC form
 
-  3. T <- (B^n * U) mod M                Convert to REDC form
+  3. Compute table U^1, U^3, U^5... of E-dependent size
 
-  4. Compute power table of E-dependent size
-
-  5. While there are more bits in E
+  4. While there are more bits in E
        W <- power left-to-right base-k
 
 
@@ -50,7 +47,7 @@
 
    * Choose window size without looping.  (Superoptimize or think(tm).)
 
-   * How do we handle small bases?
+   * Handle small bases with initial, reduction-free exponentiation.
 
    * Call new division functions, not mpn_tdiv_qr.
 
@@ -66,8 +63,8 @@
    * When U (the base) is small, we should start the exponentiation with plain
      operations, then convert that partial result to REDC form.
 
-   * But when U is just one limb, should be treated without the k-ary tricks.
-     We should keep a factor of B^n in W, but use U' = BU as base.  After
+   * When U is just one limb, should it be handled without the k-ary tricks?
+     We could keep a factor of B^n in W, but use U' = BU as base.  After
      multiplying by this (pseudo two-limb) number, we need to multiply by 1/B
      mod M.
 */
diff -r a60282a2d7c3 -r c183cbef46c6 mpn/generic/powm_sec.c
--- a/mpn/generic/powm_sec.c	Thu Dec 03 22:50:45 2009 +0100
+++ b/mpn/generic/powm_sec.c	Sun Dec 06 07:28:53 2009 +0100
@@ -1,4 +1,5 @@
-/* mpn_powm_sec -- Compute R = U^E mod M.  Safe variant, not leaking time info.
+/* mpn_powm_sec -- Compute R = U^E mod M.  Sacure variant, side-channel silent
+   under the assupmtion that the multiply instruction is side channel silent.
 
 Copyright 2007, 2008, 2009 Free Software Foundation, Inc.
 
@@ -19,19 +20,14 @@
 
 
 /*
-  BASIC ALGORITHM, Compute b^e mod n, where n is odd.
+  BASIC ALGORITHM, Compute U^E mod M, where M < B^n is odd.
 
-  1. w <- b
+  1. T <- (B^n * U) mod M                Convert to REDC form
 
-  2. While w^2 < n (and there are more bits in e)
-       w <- power left-to-right base-2 without reduction
+  2. Compute table U^0, U^1, U^2... of E-dependent size
 
-  3. t <- (B^n * b) / n                Convert to REDC form
-
-  4. Compute power table of e-dependent size
-
-  5. While there are more bits in e
-       w <- power left-to-right base-k with reduction
+  3. While there are more bits in E
+       W <- power left-to-right base-k
 
 
   TODO:
@@ -44,15 +40,7 @@
 
    * Choose window size without looping.  (Superoptimize or think(tm).)
 
-   * Make it sub-quadratic.
-
    * Call new division functions, not mpn_tdiv_qr.
-
-   * Is redc obsolete with improved SB division?
-
-   * Consider special code for one-limb M.
-
-   * Handle even M (in mpz_powm_sec) with two modexps and CRT.
 */
 
 #include "gmp.h"
@@ -62,6 +50,89 @@
 #define WANT_CACHE_SECURITY 1
 
 
+/* Define our own mpn squaring function.  We do this since we cannot use a
+   native mpn_sqr_basecase over TUNE_SQR_TOOM2_MAX, or a non-native one over
+   SQR_TOOM2_THRESHOLD.  This is so because of fixed size stack allocations
+   made inside mpn_sqr_basecase.  */
+
+#if HAVE_NATIVE_mpn_sqr_diagonal
+#define MPN_SQR_DIAGONAL(rp, up, n)					\
+  mpn_sqr_diagonal (rp, up, n)
+#else
+#define MPN_SQR_DIAGONAL(rp, up, n)					\
+  do {									\
+    mp_size_t _i;							\
+    for (_i = 0; _i < (n); _i++)					\
+      {									\
+	mp_limb_t ul, lpl;						\
+	ul = (up)[_i];							\
+	umul_ppmm ((rp)[2 * _i + 1], lpl, ul, ul << GMP_NAIL_BITS);	\
+	(rp)[2 * _i] = lpl >> GMP_NAIL_BITS;				\
+      }									\
+  } while (0)
+#endif
+
+#if HAVE_NATIVE_mpn_sqr_basecase
+#define BASECASE_LIMIT TUNE_SQR_TOOM2_MAX
+#else
+#define BASECASE_LIMIT SQR_TOOM2_THRESHOLD
+#endif
+
+static void
+mpn_local_sqr_n (mp_ptr rp, mp_srcptr up, mp_size_t n)
+{
+  mp_size_t i;
+