Shared toom evaluation functions

Fri Oct 30 08:26:05 CET 2009

Ciao,

>> My mpn_toom_ev_lsh can be used for your mpn_toom_eval_pm2, but it can
>> evaluate also _pm4, or _pm8 as needed by higer degree Toom.
>
> Makes sense! I thought _pm2 only needed mpn_addlsh1_n (or falls back
> to separate shift and add), but it actually uses the more general

... and this is why addmul_by3 will not be used by _pm3, _by9 is needed.
I looked at the patch, your code is cleaner than mine, but your _pm1 does
not handle the degree==3 case, so that you need a function "ad hoc".

> It would be nice to ba able to use the same function for ±2 and ±1/2,
[...]
> but it will break when handling the first or last coefficient which
> may be of smaller size.

For higher degree, we will probably always need both ±2 and ±1/2. One can
prefer a single function evaluating all the four values and possibly
exploit the symmetry (the coefficient in the middle, if it exists, can be
shifted only once). Anyway the smaller coefficient will always be tricky.

> PS. Speaking of combination functions available only on some
> platforms, mpn_add_n_sub_n code seems to not be well tested, the
> toom52 in the tree contains the following

Probably toom52 itself was not very deeply tested... anyway your patch
improve the readability of Toom evaluation and should reduce this kind of
errors.

> #if HAVE_NATIVE_mpn_add_n_sub_n
>   if (mpn_cmp (a0a2, a1a3, n+1) < 0)
>     {
>       mpn_add_n_sub_n (as2, asm2, a1a3, a0a2, n+1);
>       flags ^= toom6_vm1_neg;
>     }
>   else
>     {
>       mpn_add_n_sub_n (as2, asm2, a0a2, a1a3, n+1);
>     }
> #else
>   mpn_add_n (as2, a0a2, a1a3, n+1);
>   if (mpn_cmp (a0a2, a1a3, n+1) < 0)
>     {
>       mpn_sub_n (asm2, a1a3, a0a2, n+1);
>       flags ^= toom6_vm2_neg;
>     }
>   else
>     {
>       mpn_sub_n (asm2, a0a2, a1a3, n+1);
>     }
> #endif

The structure above is very common, and it is shared by all toom_eval_pmX,
I called it abs_sub_add_n in my toom-tools.h, I wonder if it is possible
to write a macro, not to repeat the same writing again and again...
Another macro possibly useful is the function abs_sub_n(C,A,B,n), it
computes {C,n} <- |{A,n}-{B,n}| and returns the sign. This is useful
everywhere in toom evaluations, but we find it defined also in
matrix22_mul.c. My definition of it, in toom-tools.h is slightly different
(the implementation, not the result), can we unify them?

Regards,
Marco

-- 
http://bodrato.it/