mul_fft, cleaning up some details of the code

Marco Bodrato bodrato at mail.dm.unipi.it
Tue Mar 8 09:29:07 CET 2022


Ciao Paul,

Il 2022-03-07 10:28 Paul Zimmermann ha scritto:
>> Date: Sun, 06 Mar 2022 11:14:49 +0100
>> From: Marco Bodrato <bodrato at mail.dm.unipi.it>

>> Specifically I'd focus into a suspicious piece of code, shared by both
>> our current code and Vivien's.

>> 	if (cy >= 0)
>> 	  cy = mpn_add_1 (tmp, tmp, Kl, cy);
>> 	else
>> 	  cy = mpn_sub_1 (tmp, tmp, Kl, -cy);

> (resp. added) at tmp[0]. Here are some comments added to the code (from
> GMP 6.2.1) and a fix:

>         if (cy >= 0)
>           cy = mpn_add_1 (tmp, tmp, Kl, cy);
> +        /* cy is now the carry at tmp[Kl] */
>         else
> +        {
>           cy = mpn_sub_1 (tmp, tmp, Kl, -cy);
> +          /* cy is now the borrow at tmp[Kl] */
> +          if (cy)
> +            cy = mpn_add_1 (tmp, tmp, Kl, cy);
> +          /* cy is now the carry at tmp[Kl] */
> +        }

Your fix of course is correct, that's the full normalization for the 
else branch, but for that branch only.

The patch I pushed yesterday introduce an initial

     tmp[Kl] = 0;

and then uses INCR_U or DECR_U

	if (cy >= 0)
	  MPN_INCR_U (tmp, Kl + 1, cy);
	else
	  {
	    tmp[Kl] = 1;
	    MPN_DECR_U (tmp, Kl + 1, -cy - 1);
	  }

The resulting value in tmp may be almost any value with Kl limbs and an 
additional high 0 or 1.



Since you deeply know how this code works, I ask you one more question.
The last lines of the function mpn_fft_mul_2exp_modF (branch m < n) 
contain:

   /* now subtract cc and rd from r[m..n] */

   r[n] = -mpn_sub_1 (r + m, r + m, n - m, cc);
   r[n] -= mpn_sub_1 (r + m, r + m, n - m, rd);
   if (r[n] & GMP_LIMB_HIGHBIT)
     r[n] = mpn_add_1 (r, r, n, CNST_LIMB(1));

This code assumes that there can only be a single borrow. Is it correct?
I'm going to change them into the following:

   r[n] = 2;
   MPN_DECR_U (r + m, n - m + 1, cc);
   MPN_DECR_U (r + m, n - m + 1, rd);
   if (UNLIKELY ((r[n] -= 2) != 0))
     {
       mp_limb_t cy = -r[n];
       /* cy should always be 1, but we did not check if the case
          m=n-1, r[m]=0, cc+rd>GMP_NUMB_MAX+1, and then cy = 2,
          is actually possible. */
       r[n] = 0;
       MPN_INCR_U (r, n + 1, cy);
     }

If we really can assume that cc+rd <= GMP_NUMB_MAX+1, the code could be 
simpler:

   r[n] = 1;
   MPN_DECR_U (r + m, n - m + 1, cc);
   MPN_DECR_U (r + m, n - m + 1, rd);
   if (UNLIKELY (r[n] == 0))
     MPN_INCR_U (r, n + 1, 1);
   else
     r[n] = 0;

Ĝis,
m

-- 
http://bodrato.it/papers/


More information about the gmp-devel mailing list