mul_fft, cleaning up some details of the code
Marco Bodrato
bodrato at mail.dm.unipi.it
Tue Mar 8 09:29:07 CET 2022
Ciao Paul,
Il 2022-03-07 10:28 Paul Zimmermann ha scritto:
>> Date: Sun, 06 Mar 2022 11:14:49 +0100
>> From: Marco Bodrato <bodrato at mail.dm.unipi.it>
>> Specifically I'd focus into a suspicious piece of code, shared by both
>> our current code and Vivien's.
>> if (cy >= 0)
>> cy = mpn_add_1 (tmp, tmp, Kl, cy);
>> else
>> cy = mpn_sub_1 (tmp, tmp, Kl, -cy);
> (resp. added) at tmp[0]. Here are some comments added to the code (from
> GMP 6.2.1) and a fix:
> if (cy >= 0)
> cy = mpn_add_1 (tmp, tmp, Kl, cy);
> + /* cy is now the carry at tmp[Kl] */
> else
> + {
> cy = mpn_sub_1 (tmp, tmp, Kl, -cy);
> + /* cy is now the borrow at tmp[Kl] */
> + if (cy)
> + cy = mpn_add_1 (tmp, tmp, Kl, cy);
> + /* cy is now the carry at tmp[Kl] */
> + }
Your fix of course is correct, that's the full normalization for the
else branch, but for that branch only.
The patch I pushed yesterday introduce an initial
tmp[Kl] = 0;
and then uses INCR_U or DECR_U
if (cy >= 0)
MPN_INCR_U (tmp, Kl + 1, cy);
else
{
tmp[Kl] = 1;
MPN_DECR_U (tmp, Kl + 1, -cy - 1);
}
The resulting value in tmp may be almost any value with Kl limbs and an
additional high 0 or 1.
Since you deeply know how this code works, I ask you one more question.
The last lines of the function mpn_fft_mul_2exp_modF (branch m < n)
contain:
/* now subtract cc and rd from r[m..n] */
r[n] = -mpn_sub_1 (r + m, r + m, n - m, cc);
r[n] -= mpn_sub_1 (r + m, r + m, n - m, rd);
if (r[n] & GMP_LIMB_HIGHBIT)
r[n] = mpn_add_1 (r, r, n, CNST_LIMB(1));
This code assumes that there can only be a single borrow. Is it correct?
I'm going to change them into the following:
r[n] = 2;
MPN_DECR_U (r + m, n - m + 1, cc);
MPN_DECR_U (r + m, n - m + 1, rd);
if (UNLIKELY ((r[n] -= 2) != 0))
{
mp_limb_t cy = -r[n];
/* cy should always be 1, but we did not check if the case
m=n-1, r[m]=0, cc+rd>GMP_NUMB_MAX+1, and then cy = 2,
is actually possible. */
r[n] = 0;
MPN_INCR_U (r, n + 1, cy);
}
If we really can assume that cc+rd <= GMP_NUMB_MAX+1, the code could be
simpler:
r[n] = 1;
MPN_DECR_U (r + m, n - m + 1, cc);
MPN_DECR_U (r + m, n - m + 1, rd);
if (UNLIKELY (r[n] == 0))
MPN_INCR_U (r, n + 1, 1);
else
r[n] = 0;
Ĝis,
m
--
http://bodrato.it/papers/
More information about the gmp-devel
mailing list