# Fast constant-time gcd computation and modular inversion

Niels Möller nisse at lysator.liu.se
Tue May 24 17:46:15 CEST 2022

nisse at lysator.liu.se (Niels Möller) writes:

> The new algorithm in the above paper uses a small auxillary state
> variable d (or \delta in the paper) to decide the update in the case
> both a, b are odd. If I write the iteration step using the same notation
> as above, it's
>
>   if a even:
>     a <-- a / 2, d <-- 1 + d
>
>   else if d <= 0
>     a <-- (a + b) / 2, d <-- 1 + d
>
>   else // d > 0
>     [a, b] <-- [(b - a)/2, a], d <-- 1 - d
>
> So conditions depend only on the sign of the auxillary d, and the least
> significant bit of a.

[...]

> In which way the decision logic based on sign of d guarantees progress
> is not clear to me (I haven't tried to read all the proofs), but I guess
> d should be understood as an approximation of bitsize(b) - bitsize(a)

The role of d is explained by the proof of G.1 (page 55 in the paper).
It can be understood as just a shift count.

To reduce the risk that I mess up, let's use the paper's notation. Then
the iteration operates on (d, f, g), f always odd.

if g even:
g <-- g / 2, d <-- 1 + d

else if d <= 0:
g <-- (g + f) / 2, d <-- 1 + d

else // d > 0
[f, g] <-- [g, (g - f)/2], d <-- 1 - d

In the paper's notation, say f is odd, g is even, g = 2^e g' with g odd,
e >= 1, and we start the algorithm with

(d_0, f_0, g_0) = (1, f, g/2)

In the first e-1 steps, we take out the powers of 2 from g, and get

(d_{e-1}, f_{e-1}, g_{e-1}) = (e, f, g')

Then we get to the both odd state and d > 0, so we get

(d_e, f_e, g_e) = (1-e, g', (g' - f) / 2)

Now, d_e <= 0, and it remains so for the next e-1 steps. These steps, we
keep reducing the (g' - f) term, adding g' whenever we have an odd value.
After a total of 2e steps, we're back at d = 1, and

(d_{2e}, f_{2e}, g_{2e}) = (1, g', (q g' - f)/2^e)

where q is an odd number in the range {1, 3, 5, ..., 2^{k+1} - 1).

So result after these 2e steps is that we're back at the initial value d
= 1. And the update of f and g is very similar to the 2-adic division
step in the Stehlé-Zimmermann algorithm (book-keeping of the powers of
two slightly different, there's a sign change, and quotient interval
isn't symmetric around zero).

By decomposing the 2-adic division into these 2e steps, the steps can be
implemented in a constant time fashion. And one gets a lot of flexibility
on how to split off the least significant part to work on.

Bounds on worst case (which determines the iteration count when using it
for constant-time inverse) is based on computer aided proofs. The paper
also claims that this choice of quotient interval rather than the
symmetric interval {1 - 2^e, 3 - e^1, ..., -1, 1, ... 2^e-1} of
Stehlé-Zimmermann gives a better worst-case bound, and that worst-case
analysis in that paper isn't quite right.

Regards,
/Niels

--
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.