GMP 4.4 remaining tasks

Tue Dec 8 08:11:07 CET 2009

nisse at lysator.liu.se (Niels Möller) writes:

  Torbjorn Granlund <tg at gmplib.org> writes:

  > The wrapping multiplications for mpn_invert and mpn_binvert are of the
  > kind 2n x n.  Using mulmod_bnm1 for them is not that great.

  But not so bad, either, is it?

I suppose we could write a special mulmod_bnm1 for the degenerate case
where 2bn < rn.

  The multiplication is 2n (input) x n (previous iterate). Product is of
  size 3n, where n limbs at the most significant end (for mpn_invert) are
  known a priori to be all ones (or, since we don't quite converge from
  below due to truncation at this step in the algorithm, they can be
  either B^n or B^n-1, so one needs an extra limb to figure out which one
  it is). Hence one can use mulmod_bnm1 with rn = 2n + 1, an = 2n, bn = n.

I'd suggest also using an = 2n+1 (whether we use some mulmod or a plain
mul).
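
To make the rn = 2n + 1 argument concrete, here is a rough Python model
of the idea, not GMP code.  The helper names (mulmod_bnm1_model,
recover_product) are made up for illustration, and as a simplification
the approximate inverse x carries its leading one explicitly (so it has
n+1 limbs) and comes from a plain division rather than a Newton step.
The point is only that a product known to lie close to B^(3n) can be
recovered from its residue mod B^rn - 1.

```python
import random

LIMB_BITS = 64
B = 1 << LIMB_BITS  # limb base

def mulmod_bnm1_model(a, b, rn):
    """Toy model of a wraparound product: a*b mod (B^rn - 1)."""
    return (a * b) % (B**rn - 1)

def recover_product(w, rn, approx):
    """Recover P from w = P mod (B^rn - 1), given |P - approx| < (B^rn - 1)/2."""
    m = B**rn - 1
    # Residue of the (unknown) error P - approx, mapped to the
    # representative of smallest absolute value.
    delta = (w - approx) % m
    if delta > m // 2:
        delta -= m
    return approx + delta

n = 4
random.seed(1)

# d: a normalized 2n-limb operand (top bit set).
d = random.getrandbits(2 * n * LIMB_BITS) | (1 << (2 * n * LIMB_BITS - 1))

# x: crude stand-in for the previous iterate, computed from the top n
# limbs of d, with the leading one kept explicit (assumption of this sketch).
d_hi = d >> (n * LIMB_BITS)
x = (B**(2 * n) - 1) // d_hi

# The full product d*x is within 2*B^(2n) of B^(3n), so rn = 2n + 1
# limbs of wraparound residue are enough to pin it down.
w = mulmod_bnm1_model(d, x, 2 * n + 1)
p = recover_product(w, 2 * n + 1, B**(3 * n))
assert p == d * x
```

The extra limb in rn = 2n + 1 is what lets the recovery step tell
apart a product slightly below B^(3n) from one slightly above it.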

  In the FFT range, wraparound arithmetic should save roughly 30% of the
  work compared to a full multiplication, but it's not clear to me what
  saving one can expect for smaller sizes.

Here "FFT range" refers to something like 1.5n > MUL_FFT_MODF_THRESHOLD,
I suppose.  MUL_FFT_MODF_THRESHOLD is vastly smaller than
MUL_FFT_THRESHOLD; see http://gmplib.org/devel/MUL_FFT_MODF_THRESHOLD.html

-- 
Torbjörn