Side-channel silent division

Niels Möller nisse at
Wed Nov 14 20:45:55 CET 2012

Torbjorn Granlund <tg at> writes:

> Do you mean to compute precisely the above value, not some very close
> limb+bit approximation (looking at, say the two most significant limbs
> of D)?

I was thinking exactly. At least, it has to be precise enough so that R
= B^{n+1} - v D fits in n limbs. It will be a single-limb invert, a
mul_1, and a small number of adjustment steps. It's probably best to do
a 3/2 inverse first, and then there should be at most one adjustment to
make it a correct (n+1)/n inverse.

> Something is strange there.  It seems that
>     u_{n+1}*B^{n+1} + U' = u_{n+1} v D + u_{n+1} R + U' 
> makes more sense.

Right. Sorry for writing so sloppily.

> OK, I think I see what you're doing.  You're allowing some greater
> partial remainders, and will then (typically) generate a limb+bit
> quotient.

The way I see it, the partial remainder is computed independently of the
computation of the quotient limbs. And in each iteration, we get a
two-limb piece of the quotient, and those then need to be added together
with overlap.

> Are you thinking of replacing REDC with the above trick in, say, powm?

I'd expect redc to be *sligthly* faster, at least for general D (e.g,
special code for the special case that the highest limb of D is
GMP_LIMB_MAX might well beat redc).

> I am mainly aiming at the a * B^n mod m operation for converting to
> Montgomery residues, and of CRT.

I think the typical application will do that several times, with the
same m. So it's invariance at application level, not necessarily within
any single mpn function. (If we push in this direction, I guess we
should really design an "arithmetic mod m"-interface. Which is
mostly independent to side-channel silence).

Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.

More information about the gmp-devel mailing list