15.2.6 Exact Remainder

If the exact division algorithm is done with a full subtraction at each stage and the dividend isn’t a multiple of the divisor, then low zero limbs are produced but with a remainder in the high limbs. For dividend a, divisor d, quotient q, and b = 2^mp_bits_per_limb, this remainder r is of the form

a = q*d + r*b^n

n represents the number of zero limbs produced by the subtractions, that being the number of limbs produced for q. r will be in the range 0<=r<d and can be viewed as a remainder, but one shifted up by a factor of b^n.

Carrying out full subtractions at each stage means the same number of cross products must be done as a normal division, but there’s still some single limb divisions saved. When d is a single limb some simplifications arise, providing good speedups on a number of processors.

The functions mpn_divexact_by3, mpn_modexact_1_odd and the internal mpn_redc_X functions differ subtly in how they return r, leading to some negations in the above formula, but all are essentially the same.

Clearly r is zero when a is a multiple of d, and this leads to divisibility or congruence tests which are potentially more efficient than a normal division.

The factor of b^n on r can be ignored in a GCD when d is odd, hence the use of mpn_modexact_1_odd by mpn_gcd_1 and mpz_kronecker_ui etc (see Greatest Common Divisor).

Montgomery’s REDC method for modular multiplications uses operands of the form of x*b^-n and y*b^-n and on calculating (x*b^-n)*(y*b^-n) uses the factor of b^n in the exact remainder to reach a product in the same form (x*y)*b^-n (see Modular Powering).

Notice that r generally gives no useful information about the ordinary remainder a mod d since b^n mod d could be anything. If however b^n ≡ 1 mod d, then r is the negative of the ordinary remainder. This occurs whenever d is a factor of b^n-1, as for example with 3 in mpn_divexact_by3. For a 32 or 64 bit limb other such factors include 5, 17 and 257, but no particular use has been found for this.