mul_fft, cleaning up some details of the code

Tue Mar 8 12:56:02 CET 2022

       Hi Marco,

> Since you deeply know how this code works, I ask you one more question.
> The last lines of the function mpn_fft_mul_2exp_modF (branch m < n) 
> contain:
> 
>    /* now subtract cc and rd from r[m..n] */
> 
>    r[n] = -mpn_sub_1 (r + m, r + m, n - m, cc);
>    r[n] -= mpn_sub_1 (r + m, r + m, n - m, rd);
>    if (r[n] & GMP_LIMB_HIGHBIT)
>      r[n] = mpn_add_1 (r, r, n, CNST_LIMB(1));
> 
> This code assumes that there can only be a single borrow. Is it correct?
> I'm going to change them into the following:
> 
>    r[n] = 2;
>    MPN_DECR_U (r + m, n - m + 1, cc);
>    MPN_DECR_U (r + m, n - m + 1, rd);
>    if (UNLIKELY ((r[n] -= 2) != 0))
>      {
>        mp_limb_t cy = -r[n];
>        /* cy should always be 1, but we did not check if the case
>           m=n-1, r[m]=0, cc+rd>GMP_NUMB_MAX+1, and then cy = 2,
>           is actually possible. */
>        r[n] = 0;
>        MPN_INCR_U (r, n + 1, cy);
>      }
> 
> If we really can assume that cc+rd <= GMP_NUMB_MAX+1, the code could be 
> simpler:
> 
>    r[n] = 1;
>    MPN_DECR_U (r + m, n - m + 1, cc);
>    MPN_DECR_U (r + m, n - m + 1, rd);
>    if (UNLIKELY (r[n] == 0))
>      MPN_INCR_U (r, n + 1, 1);
>    else
>      r[n] = 0;

yes there can only be a single borrow, since since cc is 0 or 1,
if r[n] is non zero after r[n] = -mpn_sub_1 (r + m, r + m, n - m, cc),
this can only happen for cc=1 and {r+m,n-m} = 000...000 before the
mpn_sub_1 call, and {r+m,n-m} = 111...111 after, then the second
mpn_sub_1 call cannot produce a borrow.

Best regards,
Paul