Shared toom evaluation functions

Thu Oct 29 16:54:26 CET 2009

bodrato at mail.dm.unipi.it writes:

> My mpn_toom_ev_lsh can be used for your mpn_toom_eval_pm2, but it can
> evaluate also _pm4, or _pm8 as needed by higher degree Toom.

Makes sense! I thought _pm2 only needed mpn_addlsh1_n (or fell back
to separate shift and add), but it actually uses the more general
mpn_addlsh_n. (I don't know which platforms actually have these
functions.)
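
To make the shift-and-add structure concrete, here is a toy sketch on
single machine words rather than limb vectors. It is not the mpn code:
the name eval_pm_lsh and its interface are made up for illustration,
and it assumes a degree 3 polynomial with 32-bit coefficients and a
small shift count s (say s <= 8) so nothing overflows 64 bits.

```c
#include <stdint.h>

/* Toy model: evaluate p(x) = c[0] + c[1] x + c[2] x^2 + c[3] x^3 at
   x = +2^s and x = -2^s.  Stores p(2^s) in *pos and |p(-2^s)| in *neg,
   and returns 1 if p(-2^s) is negative, 0 otherwise.
   Hypothetical helper, assumes s is small enough to avoid overflow. */
static int
eval_pm_lsh (uint64_t *pos, uint64_t *neg, const uint32_t c[4], unsigned s)
{
  /* Even part: c0 + c2 * 2^(2s), a single add-with-left-shift. */
  uint64_t even = (uint64_t) c[0] + ((uint64_t) c[2] << (2 * s));
  /* Odd part: (c1 + c3 * 2^(2s)) * 2^s. */
  uint64_t odd = ((uint64_t) c[1] + ((uint64_t) c[3] << (2 * s))) << s;
  int negative = even < odd;

  *pos = even + odd;                            /* p(+2^s) */
  *neg = negative ? odd - even : even - odd;    /* |p(-2^s)| */
  return negative;
}
```

With s = 1 this models _pm2, with s = 2 it models _pm4, and so on; only
the shift count changes, which is the point of having one general
function.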

It would be nice to be able to use the same function for ±2 and ±1/2,
since it's just a question of reversing the order of the polynomial
coefficients. Passing in a negated limb count n should *almost* work,
but it will break when handling the first or last coefficient, which
may be of smaller size.
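
The reversal idea can be checked on a toy model: for a polynomial p of
degree d, 2^d p(1/2) equals the value at 2 of the polynomial with the
coefficients in reverse order. The sketch below uses single words and a
signed stride in place of a negated limb count. The function name is
made up, and the sketch deliberately ignores the hard part mentioned
above, since every coefficient here has the same size.

```c
#include <stdint.h>

/* Toy model: Horner evaluation at x = 2 of the degree d polynomial
   whose i-th coefficient is c[i * stride].  With stride = 1 and c
   pointing at the first coefficient this gives p(2).  With stride = -1
   and c pointing at the last coefficient it walks the coefficients
   backwards and gives 2^d * p(1/2).  Hypothetical helper. */
static uint64_t
eval_at_2_strided (const uint32_t *c, int d, int stride)
{
  uint64_t r = 0;
  for (int i = d; i >= 0; i--)
    r = 2 * r + c[i * stride];
  return r;
}
```

For coefficients {1, 2, 3, 4}, a call with (c, 3, 1) yields p(2) and a
call with (c + 3, 3, -1) yields 8 p(1/2), using the same loop in both
cases.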

One reasonably simple way might be to use a common evaluation function
for the full-size coefficients, and then handle the final small
coefficient separately. (In my tree, I don't have any toom variants
that use the pair ±1/2, so I haven't thought very deeply about this.)

/Niels

PS. Speaking of combination functions available only on some
platforms, the mpn_add_n_sub_n code seems not to be well tested. The
toom52 in the tree contains the following:

#if HAVE_NATIVE_mpn_add_n_sub_n
  if (mpn_cmp (a0a2, a1a3, n+1) < 0)
    {
      mpn_add_n_sub_n (as2, asm2, a1a3, a0a2, n+1);
      flags ^= toom6_vm1_neg;
    }
  else
    {
      mpn_add_n_sub_n (as2, asm2, a0a2, a1a3, n+1);
    }
#else
  mpn_add_n (as2, a0a2, a1a3, n+1);
  if (mpn_cmp (a0a2, a1a3, n+1) < 0)
    {
      mpn_sub_n (asm2, a1a3, a0a2, n+1);
      flags ^= toom6_vm2_neg;
    }
  else
    {
      mpn_sub_n (asm2, a0a2, a1a3, n+1);
    }
#endif

which toggles the wrong flag bit (toom6_vm1_neg instead of
toom6_vm2_neg) when HAVE_NATIVE_mpn_add_n_sub_n is true.
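
One way to make this kind of slip harder is to do the comparison and
the flag toggle in one place, shared by both code paths. Below is a toy
sketch on single words, not a patch for toom52: the function name and
the flag values are assumptions made up for the example, and only the
structure is the point.

```c
#include <stdint.h>

/* Hypothetical flag bits, standing in for the toom6 flags. */
enum { vm1_neg = 1, vm2_neg = 2 };

/* Toy model: compute even + odd and |even - odd|, and toggle vm2_neg
   in flags when even - odd is negative.  The toggle happens exactly
   once, so a fused add/sub path and a separate add + sub path cannot
   disagree about which bit to flip.  Returns the updated flags. */
static int
sum_and_absdiff (uint64_t *sum, uint64_t *diff,
                 uint64_t even, uint64_t odd, int flags)
{
  *sum = even + odd;
  if (even < odd)
    {
      *diff = odd - even;
      flags ^= vm2_neg;
    }
  else
    {
      *diff = even - odd;
    }
  return flags;
}
```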