New mpn_divexact code
tg at swox.com
Mon Jul 31 14:35:00 CEST 2006
I've put improved mpn_divexact code on the the tidbits page
The new code uses a trimmed inverse in all cases, whereas the old
code only did that when log(dividend) <= 2*log(divisor).
In Barrett's algorithm, one computes an inverse equal to the size
of the divisor (or equal to the quotient if that is smaller).
This new code typically computes a smaller inverse, except if the
dividend is many times larger than the divisor. In all cases, it
computes an inverse size that makes the quotient development
steps have similar sizes.
The code for determining the inverse size looks like this:
/* Compute an inverse size that is a nice partition of the quotient
between half the divisor size and the full divisor size. */
b = (qn - 1) / dn + 1; /* ceil(qn/dn), number of blocks */
in = (qn - 1) / b + 1; /* ceil(qn/b) = ceil(qn / ceil(qn/dn)) */
Determining the optimal inverse size is a tricky problem. It
depends on underlying algorithms such as FFT multiplication and
how the inverse is computed.
Determining the cutoff point between basecase divexact and the
sub-quadratic code is another difficult problem. It is actually
not a cutoff point, but a cutoff line, as both the dividend size
and the divisor size comes into play. (Currently it is suboptimally
implemented as a point.)
The next thing to do is implementing Jebelean's bidirectional
modular inversion algorithm. That should save a good deal of the
Feedback and comments are welcome.
More information about the gmp-devel