Side-channel silent division
Torbjorn Granlund
tg at gmplib.org
Sun Nov 11 23:24:49 CET 2012
I intend to add a few more side-channel functions, from the powm_sec we
have already. The goal is to allow side-channel silent arithmetic
crypto use of GMP, including CRT, plain inverses and modular inverses.
I have now hacked a reasonably fast schoolbook division function. It is
roughly 2x slower than plain schoolbook division, since the innerloop
uses two submul_1 calls per quotient limb.
In the way this is written, it should be very resillient to how the
inverse in dinv is computed. I have not hacked side-channel silent
invers code yet.
Any opinions on this approach? I realise that it would be possible to
quitly develop more than GMP_LIMB_BITS/2 quotient bit per submul_1, such
as perhaps GMP_LIMB_BITS-1. I haven't tried that.
mp_limb_t
mpn_sbpi1_div_qr_sec (mp_ptr qp,
mp_ptr np, mp_size_t nn,
mp_srcptr dp, mp_size_t dn,
mp_limb_t dinv)
{
mp_limb_t qh, nh, cy, q1h, q0h, dummy, h;
mp_size_t i;
mp_ptr np_org, hp, qlp, qhp;
TMP_DECL;
ASSERT (dn > 2);
ASSERT (nn > dn);
ASSERT ((dp[dn-1] & GMP_NUMB_HIGHBIT) != 0);
TMP_MARK;
/* Create a divisor copy shifted half a limb. */
hp = TMP_ALLOC_LIMBS (dn + 1);
cy = mpn_lshift (hp, dp, dn, GMP_LIMB_BITS / 2);
hp[dn] = dp[dn - 1] >> GMP_LIMB_BITS / 2;
qlp = TMP_ALLOC_LIMBS (nn - dn + 1);
qhp = TMP_ALLOC_LIMBS (nn - dn + 1);
np_org = np;
np += nn;
qh = mpn_sub_n (np - dn, np - dn, dp, dn) >= 0;
mpn_addcnd_n (np - dn, np - dn, dp, dn, qh);
np--;
nh = 0;
/* Main loop. Develop one full limb per iteration, but do it in two steps in
order to avoid conditionals. Quotient bits will be either correct or
underestimates. When a quotient is underestimated, the next quotient will
compensate, since quotients are to be added at consecutive weight distance
GMP_LIMB_BITS/2. We make two quotient arrays, about GMP_LIMB_BITS/2
each. The arrays are added late after the loop. Separate arrays avoids
data-dependent carry propagation. */
for (i = nn - dn - 1; i >= 0; i--)
{
nh = (nh << GMP_LIMB_BITS/2) + (np[0] >> GMP_LIMB_BITS/2);
umul_ppmm (q1h, dummy, nh, dinv);
q1h += nh;
cy = mpn_submul_1 (np - dn, hp, dn + 1, q1h);
nh = np[0];
umul_ppmm (q0h, dummy, nh, dinv);
q0h += nh;
cy = mpn_submul_1 (np - dn, dp, dn, q0h);
nh -= cy;
np[0] = nh;
np--;
qlp[i] = q0h;
qhp[i] = q1h;
}
/* 1st adjustment depends on extra high remainder limb. */
h = np_org[dn]; ASSERT_ALWAYS (h <= 1);
qlp[0] += h;
h -= mpn_subcnd_n (np_org, np_org, dp, dn, h);
/* 2ns adjustment depends on remainder/divisor comparision as well as whether
extra remainder was nullified by previous subtract. */
cy = mpn_sub_n (np_org, np_org, dp, dn);
cy = cy == h;
mpn_addcnd_n (np_org, np_org, dp, dn, 1 - cy);
qlp[0] += cy;
/* Combine quotient halves into final quotient. */
mpn_lshift (qhp, qhp, nn - dn, GMP_LIMB_BITS/2);
mpn_add_n (qp, qhp, qlp, nn - dn);
TMP_FREE;
return qh;
}
--
Torbjörn
More information about the gmp-devel
mailing list