Shared toom evaluation functions
Niels Möller
nisse at lysator.liu.se
Thu Oct 29 16:54:26 CET 2009
bodrato at mail.dm.unipi.it writes:
> My mpn_toom_ev_lsh can be used for your mpn_toom_eval_pm2, but it can
> evaluate also _pm4, or _pm8 as needed by higher degree Toom.
Makes sense! I thought _pm2 only needed mpn_addlsh1_n (or falls back
to separate shift and add), but it actually uses the more general
mpn_addlsh_n. (I don't know which platforms actually have these
functions.)
It would be nice to be able to use the same function for ±2 and ±1/2,
since it's just a question of reversing the order of the polynomial
coefficients. Passing in a negated limb count n should *almost* work,
but it will break when handling the first or last coefficient, which
may be of smaller size.
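To make the reversal concrete, here is a minimal sketch using plain
64-bit integers in place of mpn limb vectors. The names eval_pm2 and
eval_pm_half are hypothetical and not functions in the tree. For a
polynomial P of degree k, evaluating the reversed coefficients at ±2
gives 2^k P(±1/2), except that the value at -2 picks up an extra
factor (-1)^k.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluate P(x) = c[0] + c[1] x + ... + c[k] x^k at x = +2 and x = -2.
   Even and odd terms are summed separately, as in the toom code. */
static void
eval_pm2 (const int64_t *c, int k, int64_t *plus, int64_t *minus)
{
  int64_t even = 0, odd = 0;
  int i;

  assert (k >= 0 && k < 16);
  for (i = 0; i <= k; i++)
    {
      int64_t term = c[i] * ((int64_t) 1 << i);
      if (i & 1)
        odd += term;
      else
        even += term;
    }
  *plus = even + odd;
  *minus = even - odd;
}

/* Compute 2^k P(+1/2) and 2^k P(-1/2) by reversing the coefficients
   and reusing eval_pm2. */
static void
eval_pm_half (const int64_t *c, int k, int64_t *plus, int64_t *minus)
{
  int64_t rev[16];
  int i;

  assert (k >= 0 && k < 16);
  for (i = 0; i <= k; i++)
    rev[i] = c[k - i];
  eval_pm2 (rev, k, plus, minus);
  /* Q(-2) = (-1)^k 2^k P(-1/2), so fix the sign for odd k. */
  if (k & 1)
    *minus = -*minus;
}
```

With real limb vectors the sign cannot simply be negated like this; it
would be tracked in a flag bit, and the coefficient sizes are not all
equal, which is where the trouble with the short coefficient comes in.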
One reasonably simple way might be to use a common evaluation function
for the full-size coefficients, and then handle the final small
coefficient separately. (In my tree, I don't have any toom variants
that use the pair ±1/2, so I haven't thought very deeply about this).
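A rough sketch of that structure, evaluating at +2 only and assuming
32-bit limbs: the loop covers the k full-size coefficients, and the
top coefficient of s <= n limbs gets its own call. The names limb_t,
addlsh and eval_at_2 are made up for illustration; addlsh stands in
for mpn_addlsh_n but does not follow its real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;        /* stand-in for mp_limb_t */

/* rp[0..n-1] += xp[0..xn-1] << shift, for xn <= n and shift < 32.
   Returns the carry out of the top limb. */
static limb_t
addlsh (limb_t *rp, size_t n, const limb_t *xp, size_t xn, unsigned shift)
{
  uint64_t carry = 0;
  size_t i;

  assert (xn <= n && shift < 32);
  for (i = 0; i < n; i++)
    {
      uint64_t x = i < xn ? (uint64_t) xp[i] << shift : 0;
      uint64_t sum = (uint64_t) rp[i] + (x & 0xffffffffu) + carry;
      rp[i] = (limb_t) sum;
      carry = (sum >> 32) + (x >> 32);
    }
  return (limb_t) carry;
}

/* ap holds coefficients a_0 .. a_k; a_0 .. a_{k-1} have n limbs each,
   a_k has s <= n limbs.  Writes n+1 limbs to rp:
     reverse == 0:  sum a_i 2^i      = P(2)
     reverse != 0:  sum a_i 2^(k-i)  = 2^k P(1/2)  */
static void
eval_at_2 (limb_t *rp, const limb_t *ap, size_t n, size_t s,
           unsigned k, int reverse)
{
  size_t j;
  unsigned i;

  assert (n > 0 && s > 0 && s <= n && k < 32);
  for (j = 0; j <= n; j++)
    rp[j] = 0;

  /* Common loop: full-size coefficients only. */
  for (i = 0; i < k; i++)
    rp[n] += addlsh (rp, n, ap + (size_t) i * n, n, reverse ? k - i : i);

  /* The final, possibly shorter, coefficient is handled separately. */
  rp[n] += addlsh (rp, n, ap + (size_t) k * n, s, reverse ? 0 : k);
}
```

Only the shift applied to each coefficient differs between the two
orders, so the short coefficient never has to be treated as if it had
n limbs.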
/Niels
PS. Speaking of combination functions available only on some
platforms, the mpn_add_n_sub_n code seems not to be well tested. The
toom52 in the tree contains the following:
#if HAVE_NATIVE_mpn_add_n_sub_n
  if (mpn_cmp (a0a2, a1a3, n+1) < 0)
    {
      mpn_add_n_sub_n (as2, asm2, a1a3, a0a2, n+1);
      flags ^= toom6_vm1_neg;
    }
  else
    {
      mpn_add_n_sub_n (as2, asm2, a0a2, a1a3, n+1);
    }
#else
  mpn_add_n (as2, a0a2, a1a3, n+1);
  if (mpn_cmp (a0a2, a1a3, n+1) < 0)
    {
      mpn_sub_n (asm2, a1a3, a0a2, n+1);
      flags ^= toom6_vm2_neg;
    }
  else
    {
      mpn_sub_n (asm2, a0a2, a1a3, n+1);
    }
#endif
which toggles the wrong flag bit (toom6_vm1_neg rather than
toom6_vm2_neg) when HAVE_NATIVE_mpn_add_n_sub_n is true.
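For reference, the intended behaviour of either branch can be modelled
with single 64-bit words instead of (n+1)-limb vectors. This is only a
sketch: the function name eval_pm2_sign is hypothetical, and the enum
values are an assumption, not copied from the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed flag values, for illustration only. */
enum toom6_flags
{
  toom6_all_pos = 0,
  toom6_vm1_neg = 1,
  toom6_vm2_neg = 2
};

/* Sets *as2 = a0a2 + a1a3 and *asm2 = |a0a2 - a1a3|, and toggles
   toom6_vm2_neg when the difference is negative.  Returns the
   updated flags. */
static int
eval_pm2_sign (uint64_t a0a2, uint64_t a1a3,
               uint64_t *as2, uint64_t *asm2, int flags)
{
  *as2 = a0a2 + a1a3;
  if (a0a2 < a1a3)
    {
      *asm2 = a1a3 - a0a2;
      flags ^= toom6_vm2_neg;   /* the value at -2 is negative */
    }
  else
    {
      *asm2 = a0a2 - a1a3;
    }
  return flags;
}
```

Both the native and the fallback branch should end up with the same
flags as this model for the same inputs.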