nisse at lysator.liu.se
Wed Nov 25 14:15:23 CET 2009
Paul Zimmermann <Paul.Zimmermann at loria.fr> writes:
> I guess you mean "you don't care about the quotient" here, as Torbjörn
> pointed out.
> yes it makes sense, in the Karatsuba range you get 5/3M(n) instead of 2M(n).
> But it requires a precomputation.
One curious detail: Assume we only precompute the inverse of m. Then
to compute B^3 mod m, the quotient is free (it's the first half of the
inverse), so this is a single 2n x n multiply. If we include the cost
for this, we get a total of three 2n x n and one n x n, which in
schoolbook range is 7 n^2, which still is slightly better than Barrett
division which takes 8 n^2. But it's worse than blockwise Barret which takes
6n^2, and then it's somewhat silly to use Barrett division in the
I also wonder if there's some clever way to compute the inverse and m3
simultaneously (they're quotient and remainder of B^3 / m). The
"obvious" way is newton iteration to get the quotient aka inverse, and
anothere multiplication to get the remainder, but maybe they can be
combined in the final newton step or something like that.
> > 2. Can it be adapted to 2-adic division (bdiv)? What's unclear to me
> > is what the precomputed value m3 corresponds to in a 2-adic
> > setting.
> yes, see Algorithm MontgomerySvoboda2 in  (page 79).
Cool. Then we have a third(!) candidate for improved redc (the other
two being the "bipartite" reduction, and reduction based on 2^n-1).
We can also conclude that there seems do be a place for div_r and
bdiv_r, at least in the case that the the modulo is invariant and
*maybe* in general for large enough numbers.
More information about the gmp-devel