# Shared toom evaluation functions

bodrato at mail.dm.unipi.it
Fri Oct 30 12:50:43 CET 2009

>> not handle the degree==3 case, so that you need a function "ad hoc".

> The theory was that for degree 3 (toom4N), inputs are still small

You are right, but I'm working on toomN4... where N>4 :-D
Even if the inputs are sometimes small (TOOM8_THRS/8 <
(TOOM6_THRS-1)/4 on my Pentium-M), I prefer a general evaluation function.
I have blocks like the following:

/* $\pm1$ */
sign = mpn_toom_ev_pm1 (v2, v0, ap, p, n, s, pp) ^
       mpn_toom_ev_pm1 (v3, v1, bp, q, n, t, pp);
TOOM6H_MUL_N_REC(pp, v0, v1, n + 1, wse); /* A(-1)*B(-1) */
TOOM6H_MUL_N_REC(r3, v2, v3, n + 1, wse); /* A(1)*B(1) */
toom_couple_handling(r3, 2 * n + 1, pp, sign, n, 0, 0);

Each block evaluates, computes the products, and does an "early recomposition". The
range of accepted values for variable "p" (resp. "q") determines the range
of possible unbalance.
Removing degree 3 from _ev_pmX means removing toom84 and toom94 from
toom6h, or 12x4 and 13x4 from toom8h.
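
To make the evaluate / multiply / recompose pattern concrete, here is a toy sketch on machine integers rather than n-limb blocks. It is not the GMP code: toy_ev_pm1 and toy_couple are hypothetical names, and the recomposition step only illustrates the idea (splitting the product into its even and odd parts), which I assume is roughly what the couple handling does before shifting and adding into the result.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, NOT the GMP code: coefficients are machine integers instead
   of n-limb blocks.  Evaluates A(x) = a[0] + a[1] x + ... + a[deg] x^deg
   at x = +1 and x = -1.  Stores A(1) in *plus and |A(-1)| in *minus, and
   returns 1 if A(-1) is negative, 0 otherwise. */
static int
toy_ev_pm1 (uint64_t *plus, uint64_t *minus, const uint32_t *a, unsigned deg)
{
  uint64_t even = 0, odd = 0;
  unsigned i;

  for (i = 0; i <= deg; i++)
    {
      if (i & 1)
        odd += a[i];    /* coefficients of odd powers */
      else
        even += a[i];   /* coefficients of even powers */
    }

  *plus = even + odd;
  if (even >= odd)
    {
      *minus = even - odd;
      return 0;
    }
  *minus = odd - even;
  return 1;
}

/* Toy recomposition: given C(1), |C(-1)| and the sign of C(-1), recover
   the sum of the even coefficients and the sum of the odd coefficients
   of C.  Assumes all coefficients of C are non-negative. */
static void
toy_couple (uint64_t *even, uint64_t *odd,
            uint64_t c_plus, uint64_t c_minus_abs, int sign)
{
  assert (c_plus >= c_minus_abs);
  if (sign)
    {
      *even = (c_plus - c_minus_abs) / 2;
      *odd = (c_plus + c_minus_abs) / 2;
    }
  else
    {
      *even = (c_plus + c_minus_abs) / 2;
      *odd = (c_plus - c_minus_abs) / 2;
    }
}
```

The sign of the product at -1 is the XOR of the two signs returned by the evaluations, which is why the block above combines the two calls with ^.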

> I added testcases for it before I started hacking, and those tests

One question: the tests in tests/mpn/toom-shared.h do compare the result
of mpn_toomMN_mul with results from mpn_mul. This is exactly what I
usually do, but it is safe only if toomMN is NOT integrated in mpn_mul!
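
One way around this would be to compare against a reference that can never dispatch to the function under test. A minimal sketch, assuming 32-bit limbs stored least significant first; ref_mul_basecase, mul_fn and check_against_ref are hypothetical names, not the ones used in the GMP test suite:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Schoolbook product, independent of any Toom code:
   rp[0..an+bn-1] = ap[0..an-1] * bp[0..bn-1]. */
static void
ref_mul_basecase (uint32_t *rp, const uint32_t *ap, size_t an,
                  const uint32_t *bp, size_t bn)
{
  size_t i, j;

  memset (rp, 0, (an + bn) * sizeof (uint32_t));
  for (j = 0; j < bn; j++)
    {
      uint64_t carry = 0;
      for (i = 0; i < an; i++)
        {
          /* Cannot overflow: (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
          uint64_t t = (uint64_t) ap[i] * bp[j] + rp[i + j] + carry;
          rp[i + j] = (uint32_t) t;
          carry = t >> 32;
        }
      rp[j + an] = (uint32_t) carry;
    }
}

/* Signature shared by the reference and by the candidate under test. */
typedef void (*mul_fn) (uint32_t *, const uint32_t *, size_t,
                        const uint32_t *, size_t);

/* Returns 1 if the candidate agrees with the schoolbook reference. */
static int
check_against_ref (mul_fn candidate, const uint32_t *ap, size_t an,
                   const uint32_t *bp, size_t bn)
{
  uint32_t ref[64], got[64];

  assert (an + bn <= 64);
  ref_mul_basecase (ref, ap, an, bp, bn);
  candidate (got, ap, an, bp, bn);
  return memcmp (ref, got, (an + bn) * sizeof (uint32_t)) == 0;
}
```

The reference is quadratic, so it is only practical for test-sized operands, but it stays valid whatever the top-level multiplication ends up calling.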

> there's no native implementation of add_n_sub_n for any machine...

Funny!
15      15     604
It is referenced in 15 files and it recently changed its name
( http://gmplib.org:8000/gmp/rev/674bf2af029b ), but it doesn't exist :-D

>> I called it abs_sub_add_n in my toom-tools.h, I wonder if it is possible
>> to write a macro, not to repeat the same writing again and again...

> Should it be a macro, inline function, or "real" function? The
> smallest function where it's useful is toom32.

I don't know... suggestions?
It's only a few lines of code (once the #if has decided which lines are
needed), maybe a macro.

> Should abs_sub_n be a function or macro? It's useful already in toom22
> (and its use in matrix22_mul is for larger sizes).

mpn_cmp basically is a macro, isn't it? My implementation of abs_sub_n is
a small evolution of the same code. I vote for a macro.
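
To illustrate what "a small evolution of the same code" could look like, here is a sketch written as a plain function so that it is easy to test. It is not my actual implementation and not GMP code: toy_abs_sub_n is a hypothetical name, and it assumes 32-bit limbs stored least significant first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rp[0..n-1] = |ap - bp| over n limbs.
   Returns 1 if ap < bp, 0 otherwise. */
static int
toy_abs_sub_n (uint32_t *rp, const uint32_t *ap, const uint32_t *bp,
               size_t n)
{
  const uint32_t *hi = ap, *lo = bp;   /* hi >= lo after the compare */
  uint32_t borrow = 0;
  int sign = 0;
  size_t i = n;

  /* Compare from the top limb down, as a comparison routine would. */
  while (i-- > 0)
    {
      if (ap[i] != bp[i])
        {
          if (ap[i] < bp[i])
            {
              hi = bp;
              lo = ap;
              sign = 1;
            }
          break;
        }
    }

  /* Subtract the smaller operand from the larger one, limb by limb. */
  for (i = 0; i < n; i++)
    {
      uint32_t d = hi[i] - lo[i];
      uint32_t b1 = hi[i] < lo[i];
      rp[i] = d - borrow;
      borrow = b1 | (d < borrow);
    }
  assert (borrow == 0);   /* holds because hi >= lo */
  return sign;
}
```

The compare loop is the only addition over a plain subtraction, which is the argument for keeping it small enough to be a macro or an inline function.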

Regards,
Marco

--
http://bodrato.it/