Shared toom evaluation functions

Fri Oct 30 10:20:21 CET 2009

bodrato at mail.dm.unipi.it writes:

> I looked at the patch, your code is cleaner than mine, but your _pm1 does
> not handle the degree==3 case, so that you need a function "ad hoc".

There's already a special function for that, toom_eval_dgr3_pm1
(checked in a couple of months ago).

The theory was that for degree 3 (toom4N), inputs are still small
enough that the cost of generality is noticable (whether that's really
true, I don't know), and for degree 1 and 2, we should even avoid
function call overhead in the evaluation.

> For higher degree, we will probably always need both ±2 and ±1/2. One can
> prefer a single function evaluating all the four values and possibly
> exploit the symmetry (the coefficient in the middle, if it exists, can be
> shifted only once).

Makes sense to me. And whether or not reusing the middle coefficient
is useful depends on the existence of addlsh_n and it's speed relative
to add and lshift.

> Probably toom52 itself was not very deeply tested...

I added testcases for it before I started hacking, and those tests
have been exercised on all the machines where Torbjörn makes nightly
builds. So if this bug wasn't detected there, I guess it's because
there's no native implementation of add_n_sub_n for any machine...

> The structure above is very common, and it is shared by all toom_eval_pmX,
> I called it abs_sub_add_n in my toom-tools.h, I wonder if it is possible
> to write a macro, not to repeat the same writing again and again...

Should it be a macro, inline function, or "real" function? The
smallest function where it's useful is toom32.

> Another macro possibly useful is the function abs_sub_n(C,A,B,n), it
> computes {C,n} <- |{A,n}-{B,n}| and returns the sign. This is useful
> everywhere in toom evaluations,

Is it still useful in toom, if we have abs_sub_add_n? I don't think we
ever evaluate in a negative point without also evaluating in the
positive point? Of course, it could be used as a subroutine to
abs_sub_add_n, in the case that add_n_sub_n is not used.

Should abs_sub_n be a function or macro? It's useful already in toom22
(and its use in matrix22_mul is for larger sizes).

Regards,
/Niels