Shared toom evaluation functions

Torbjorn Granlund tg at
Sat Nov 14 19:18:19 CET 2009

nisse at (Niels Möller) writes:

  bodrato at writes:
  > My mpn_toom_ev_lsh can be used for your mpn_toom_eval_pm2, but it can
  > evaluate also _pm4, or _pm8 as needed by higher degree Toom.

  Makes sense! I thought _pm2 only needed mpn_addlsh1_n (or falls back
  to separate shift and add), but it actually uses the more general
  mpn_addlsh_n. (Don't know which platforms actually have these

Several have mpn_addlsh1_n, and they run up to 2x faster than separate
lshift and add_n.  (Same goes for sub.)  No machine, or almost none, has
mpn_addlsh_n, since it has proven tricky to make fast.
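
For readers unfamiliar with the operation, here is a minimal portable C
sketch of what "add with left shift by one" computes (rp = up + 2*vp),
written once as separate shift and add passes and once as a single fused
loop.  The names limb_t, ref_lshift1, ref_add_n, addlsh1_two_pass and
addlsh1_fused are hypothetical stand-ins chosen for illustration; they are
not GMP's internal functions or signatures, and the sketch assumes 64-bit
limbs.  The fused loop only shows where the saved memory pass comes from;
the actual speedup depends on hand-written assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* hypothetical stand-in for a 64-bit limb */

/* rp[] = up[] << 1; returns the bit shifted out of the top limb. */
static limb_t ref_lshift1(limb_t *rp, const limb_t *up, size_t n)
{
    limb_t carry = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        limb_t u = up[i];
        rp[i] = (u << 1) | carry;
        carry = u >> 63;
    }
    return carry;
}

/* rp[] = up[] + vp[]; returns the carry out (0 or 1). */
static limb_t ref_add_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    limb_t carry = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        limb_t s = up[i] + vp[i];
        limb_t c1 = s < up[i];       /* overflow of up[i] + vp[i] */
        limb_t r = s + carry;
        limb_t c2 = r < s;           /* overflow of adding the carry in */
        rp[i] = r;
        carry = c1 | c2;
    }
    return carry;
}

/* Fallback: shift into scratch tp[], then add.  Two passes over memory.
   Returns the carry out of up + 2*vp, which is in the range 0..2. */
static limb_t addlsh1_two_pass(limb_t *rp, const limb_t *up, const limb_t *vp,
                               limb_t *tp, size_t n)
{
    limb_t cy = ref_lshift1(tp, vp, n);
    return cy + ref_add_n(rp, up, tp, n);
}

/* Fused: shift and add in one loop, no scratch space.  Same result. */
static limb_t addlsh1_fused(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    limb_t shift_in = 0, carry = 0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        limb_t v = vp[i];
        limb_t t = (v << 1) | shift_in;  /* limb i of 2*vp */
        shift_in = v >> 63;
        limb_t s = up[i] + t;
        limb_t c1 = s < t;
        limb_t r = s + carry;
        limb_t c2 = r < s;
        rp[i] = r;
        carry = c1 | c2;
    }
    return shift_in + carry;
}
```

Both variants return a carry in the range 0..2, since up + 2*vp can exceed
the n-limb range by up to two units of the next limb.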

We should use mpn_addlsh1_n in more places I think, even for the s,t
related computations, such as pm2 in toom72.  That will be a bit tricky,
and will require a compare that cannot use mpn_cmp (but typically these
compares need to look at just one limb pair).

(I am enabling a missed trivial case of mpn_addlsh1_n in toom52_mul.)

  PS. Speaking of combination functions available only on some
  platforms, mpn_add_n_sub_n code seems to not be well tested, the
  toom52 in the tree contains the following
  #if HAVE_NATIVE_mpn_add_n_sub_n
    if (mpn_cmp (a0a2, a1a3, n+1) < 0)
      {
        mpn_add_n_sub_n (as2, asm2, a1a3, a0a2, n+1);
        flags ^= toom6_vm1_neg;
      }
    else
      mpn_add_n_sub_n (as2, asm2, a0a2, a1a3, n+1);
  #else
    mpn_add_n (as2, a0a2, a1a3, n+1);
    if (mpn_cmp (a0a2, a1a3, n+1) < 0)
      {
        mpn_sub_n (asm2, a1a3, a0a2, n+1);
        flags ^= toom6_vm2_neg;
      }
    else
      mpn_sub_n (asm2, a0a2, a1a3, n+1);
  #endif

This seems to have been forgotten.  I fixed it now.
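
The pattern in the quoted fragment (a sum plus an absolute difference,
with a flag recording the sign of the difference) can be sketched in
portable C as follows.  This is an illustration under stated assumptions,
not the code in the tree: limb_t, ref_cmp, ref_add_n, ref_sub_n and
sum_and_abs_diff are hypothetical names, limbs are assumed to be 64 bits,
the outputs must not overlap the inputs, and the operands are assumed to
have enough headroom that the carry out of the sum can be ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;  /* hypothetical stand-in for a 64-bit limb */

/* Compare two n-limb numbers, most significant limb first. */
static int ref_cmp(const limb_t *up, const limb_t *vp, size_t n)
{
    while (n-- > 0) {
        if (up[n] != vp[n])
            return up[n] > vp[n] ? 1 : -1;
    }
    return 0;
}

/* rp[] = up[] + vp[]; returns the carry out (0 or 1). */
static limb_t ref_add_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t s = up[i] + vp[i];
        limb_t c1 = s < up[i];
        limb_t r = s + carry;
        limb_t c2 = r < s;
        rp[i] = r;
        carry = c1 | c2;
    }
    return carry;
}

/* rp[] = up[] - vp[]; returns the borrow out (0 or 1). */
static limb_t ref_sub_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    limb_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t d = up[i] - vp[i];
        limb_t b1 = up[i] < vp[i];
        limb_t r = d - borrow;
        limb_t b2 = d < borrow;
        rp[i] = r;
        borrow = b1 | b2;
    }
    return borrow;
}

/* sp[] = up[] + vp[] and dp[] = |up[] - vp[]|.
   Returns 1 if up < vp (the true difference is negative), else 0.
   sp and dp must not overlap up or vp in this sketch. */
static int sum_and_abs_diff(limb_t *sp, limb_t *dp,
                            const limb_t *up, const limb_t *vp, size_t n)
{
    int neg = ref_cmp(up, vp, n) < 0;
    assert(n > 0);
    (void) ref_add_n(sp, up, vp, n);   /* carry ignored: headroom assumed */
    if (neg)
        (void) ref_sub_n(dp, vp, up, n);
    else
        (void) ref_sub_n(dp, up, vp, n);
    return neg;
}
```

A caller would toggle its sign flag only when the function returns 1, so
that both the native and the fallback paths record the same sign for the
same inputs.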


More information about the gmp-devel mailing list