Mod operation

Fri Nov 27 09:17:51 CET 2009

nisse at lysator.liu.se (Niels Möller) writes:

> Cool. Then we have a third(!) candidate for improved redc (the other
> two being the "bipartite" reduction, and reduction based on 2^n-1).

One can also combine the idea with bipartite reduction... As before,
consider x0 + x1B + x2 B^2 + x3 B^3 which is to be reduced mod m0 + B
m1, where we don't particularly care if bits are eliminated to the
left or to the right, and where m-dependent values can be precomputed
for free.

First reduce x to size 3n, as

  y0 + y1B + y2 B = x0 + x1 B + x2 B + x3 * m3, with m3 = B^3 mod m

Next, do a bdiv, which for high invariance would be done by
computing q = m0^-1 y0 mod B, and then subtracting q*m (or
dc_bdiv_qr_n, of x0 + x1 B / m0 folowed by a n x n for q * m1).

Then x is reduced to size 2n, eliminating one quarter at each end, and
*without* using any truncating division. The cost is two 2n x n and
one n x n.

I'd expect that one could also cook up an algorithm which combines the
same x0 * m3 recuction with a 2^k-1 reduction for the second step.
I.e., an algorithm which takes x of size 4n and computes x / (B-1) mod
m of size 2n.

Regards,
/Niels