# mul_fft, cleaning up some details of the code

Marco Bodrato bodrato at mail.dm.unipi.it
Tue Mar 8 09:29:07 CET 2022

```Ciao Paul,

Il 2022-03-07 10:28 Paul Zimmermann ha scritto:
>> Date: Sun, 06 Mar 2022 11:14:49 +0100
>> From: Marco Bodrato <bodrato at mail.dm.unipi.it>

>> Specifically I'd focus into a suspicious piece of code, shared by both
>> our current code and Vivien's.

>> 	if (cy >= 0)
>> 	  cy = mpn_add_1 (tmp, tmp, Kl, cy);
>> 	else
>> 	  cy = mpn_sub_1 (tmp, tmp, Kl, -cy);

> GMP 6.2.1) and a fix:

>         if (cy >= 0)
>           cy = mpn_add_1 (tmp, tmp, Kl, cy);
> +        /* cy is now the carry at tmp[Kl] */
>         else
> +        {
>           cy = mpn_sub_1 (tmp, tmp, Kl, -cy);
> +          /* cy is now the borrow at tmp[Kl] */
> +          if (cy)
> +            cy = mpn_add_1 (tmp, tmp, Kl, cy);
> +          /* cy is now the carry at tmp[Kl] */
> +        }

Your fix of course is correct, that's the full normalization for the
else branch, but for that branch only.

The patch I pushed yesterday introduce an initial

tmp[Kl] = 0;

and then uses INCR_U or DECR_U

if (cy >= 0)
MPN_INCR_U (tmp, Kl + 1, cy);
else
{
tmp[Kl] = 1;
MPN_DECR_U (tmp, Kl + 1, -cy - 1);
}

The resulting value in tmp may be almost any value with Kl limbs and an

Since you deeply know how this code works, I ask you one more question.
The last lines of the function mpn_fft_mul_2exp_modF (branch m < n)
contain:

/* now subtract cc and rd from r[m..n] */

r[n] = -mpn_sub_1 (r + m, r + m, n - m, cc);
r[n] -= mpn_sub_1 (r + m, r + m, n - m, rd);
if (r[n] & GMP_LIMB_HIGHBIT)
r[n] = mpn_add_1 (r, r, n, CNST_LIMB(1));

This code assumes that there can only be a single borrow. Is it correct?
I'm going to change them into the following:

r[n] = 2;
MPN_DECR_U (r + m, n - m + 1, cc);
MPN_DECR_U (r + m, n - m + 1, rd);
if (UNLIKELY ((r[n] -= 2) != 0))
{
mp_limb_t cy = -r[n];
/* cy should always be 1, but we did not check if the case
m=n-1, r[m]=0, cc+rd>GMP_NUMB_MAX+1, and then cy = 2,
is actually possible. */
r[n] = 0;
MPN_INCR_U (r, n + 1, cy);
}

If we really can assume that cc+rd <= GMP_NUMB_MAX+1, the code could be
simpler:

r[n] = 1;
MPN_DECR_U (r + m, n - m + 1, cc);
MPN_DECR_U (r + m, n - m + 1, rd);
if (UNLIKELY (r[n] == 0))
MPN_INCR_U (r, n + 1, 1);
else
r[n] = 0;

Ĝis,
m

--
http://bodrato.it/papers/
```