GMP 4.4 remaining tasks
Torbjorn Granlund
tg at gmplib.org
Tue Dec 8 08:11:07 CET 2009
nisse at lysator.liu.se (Niels Möller) writes:
Torbjorn Granlund <tg at gmplib.org> writes:
> The wrapping multiplication for mpn_invert and mpn_binvert are of the
> kind 2n x n. Using mulmod_bnm1 for that is not that great.
But not so bad, either, is it?
I suppose we could write a special mulmod_bnm1 for the degenerate case
where 2bn < rn.
The multiplication is 2n (input) x n (previous iterate). Product is of
size 3n, where n limbs at the most significant end (for mpn_invert) are
known a priori to be all ones (or, since we don't quite converge from
below due to truncation at this step in the algorithm, they can be
either B^n or B^n-1, so one needs an extra limb to figure out which one
it is). Hence one can use mulmod_bnm1 with rn = 2n + 1, an = 2n, bn = n.
I'd suggest to use also an=2n+1 (whether we use some mulmod or plain
mul).
In the FFT range, wraparound arithmetic should save roughly 30% of the
work compared to a full multiplication, but it's not clear to me what
saving one can expect for smaller sizes.
Here "FFT range" refers to something like 1.5n < MUL_FFT_MODF_THRESHOLD,
I suppose. MUL_FFT_MODF_THRESHOLD is vastly smaller than
MUL_FFT_THRESHOLD. http://gmplib.org/devel/MUL_FFT_MODF_THRESHOLD.html
--
Torbjörn
More information about the gmp-devel
mailing list