Shared toom evaluation functions

bodrato at mail.dm.unipi.it
Thu Nov 19 15:07:02 CET 2009

>> You wrote two versions: with HAVE_NATIVE_mpn_addlsh_n or without.
>> Only the third not yet written needs this (re)organization ;-)
>
> I'm thinking that one should share source code and structure for both

I tried it the other way; take a look and see whether you like the code:

--8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<--
diff -r d0751023327b mpn/generic/toom_eval_pm2.c
--- a/mpn/generic/toom_eval_pm2.c	Thu Nov 19 12:59:19 2009 +0100
+++ b/mpn/generic/toom_eval_pm2.c	Thu Nov 19 15:03:11 2009 +0100
@@ -32,7 +32,7 @@
mpn_toom_eval_pm2 (mp_ptr xp2, mp_ptr xm2, unsigned k,
mp_srcptr xp, mp_size_t n, mp_size_t hn, mp_ptr tp)
{
-  unsigned i;
+  int i;
int neg;
mp_limb_t cy;
@@ -46,6 +46,21 @@
/* The degree k is also the number of full-size coefficients, so
* that last coefficient, of size hn, starts at xp + k*n. */

+  if (k & 1) MP_PTR_SWAP(xp2, tp);
+  xp2[n] = mpn_addlsh2_n (xp2, xp + (k-2) * n, xp + k * n, hn);
+  if (hn != n)
+    xp2[n] = mpn_add_1 (xp2 + hn, xp + (k-2) * n + hn, n - hn, xp2[n]);
+  k--;
+  tp[n] = mpn_addlsh2_n (tp, xp + (k-2) * n, xp + k * n, n);
+  k-=3;
+  if (k & 1) MP_PTR_SWAP(xp2, tp);
+  for (i = k & ~1; i>=0; i-=2)
+    xp2[n] = (xp2[n] << 2) + mpn_addlsh2_n (xp2, xp + i * n, xp2, n);
+  for (i = k + (k & 1) - 1; i>0; i-=2)
+    tp[n] = (tp[n] << 2) + mpn_addlsh2_n (tp, xp + i * n, tp, n);
+  mpn_lshift (tp, tp, n+1, 1);
xp2[n] = mpn_addlsh_n (xp2, xp, xp + 2*n, n, 2);
for (i = 4; i < k; i += 2)
@@ -88,6 +103,7 @@
else
mpn_add (xp2, xp2, n+1, xm2, hn+1);

neg = (mpn_cmp (xp2, tp, n + 1) < 0);

--8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<--

Then decide whether you want to include it or would rather keep the structure you proposed.
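
The idea behind the added lines, reduced to a toy: split the polynomial into its even-index and odd-index coefficients, run a Horner recurrence in 4 = 2^2 over each half (one step is coefficient + 4*accumulator, the operation mpn_addlsh2_n performs on limb vectors), double the odd half, then add and subtract. The sketch below uses single-word coefficients, so it has no limb arrays, no carries and no short top coefficient of size hn. The name eval_pm2_words is hypothetical and not part of GMP.

```c
#include <stdint.h>

/* Toy model of evaluation at +2 and -2.
   x[0..k] are the k+1 coefficients of a degree-k polynomial.
   Assumes the values are small enough not to overflow int64_t. */
static void
eval_pm2_words (const int64_t *x, unsigned k, int64_t *p2, int64_t *m2)
{
  int64_t even = 0;   /* becomes sum of x[2j]   * 4^j */
  int64_t odd = 0;    /* becomes sum of x[2j+1] * 4^j */
  int i;

  /* Horner in the variable 4, from the highest coefficient down. */
  for (i = (int) k; i >= 0; i--)
    {
      if (i & 1)
        odd = x[i] + 4 * odd;
      else
        even = x[i] + 4 * even;
    }

  /* The odd half carries one extra factor of 2. */
  odd *= 2;

  *p2 = even + odd;   /* value at +2 */
  *m2 = even - odd;   /* value at -2 */
}
```

For x = {1, 2, 3, 4} and k = 3, that is 1 + 2t + 3t^2 + 4t^3, the even half is 1 + 3*4 = 13 and the odd half is 2*(2 + 4*4) = 36, giving 49 at +2 and -23 at -2. In the patch the two accumulators are xp2 and tp, and the final doubling is the mpn_lshift of tp by one bit.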

Regards,
Marco

--
http://bodrato.it/