New mpn_divexact code

Thu Aug 17 16:38:40 CEST 2006

Paul.Zimmermann at loria.fr (Paul Zimmermann) writes:

  > How much time does Jebelean's method need in terms of M(n),
  > compared to the traditional Newton inversion?

  If D(qn) is the time needed to divide exactly the least qn words,
  Jebelean's method needs 2*D(qn/2). This should save a factor of 2
  for the basecase, and asymptotically nothing (but the memory usage
  will decrease).

Saving memory is nice too, of course, in particular for huge
operands.

For small operands, we could also try a 2-adic Burnikel-Ziegler.
Niels Möller made some experiements, and he estimated the speedup to
20%.

Of course, Jebelean's trick and 2-adic Burnikel-Ziegler are not
mutually exclusive.

  > My Newton inversion code could surely be improved without involving
  > Jebelean's method.  There are comments in the file (see
  > http://swox.com/gmp/devel/mpn/generic/invert_2exp.c) from you ("Paul
  > Zimmermann spoke"), and comments by me about using short
  > multiplication (mpn_sqrhigh_mn, mpn_mulmid_mn).

  Yes we have to try this. Maybe it will save nothing.

I've now implemented the reduced inversion size algorithm (which I
sometimes refer to as "MUA") combined with the wrap-around FFT trick.
The code is available from the GMP Tidbits page.

I've made extensive experimentation with different inverse sizes, but
thus far not found one method for choosing it that is always best.

For choosing it optimally, one should probably peek ahead on the FFT
sizes that will be needed, and how close they are to the operand
sizes.  I'll postpone this work until we've reached optimality in the
FFT code, though.  :-)

-- 
Torbjörn