Reflections on mulmod_bnm1

Torbjorn Granlund tg at
Tue Dec 8 22:24:19 CET 2009

nisse at (Niels Möller) writes:

  It looked like a good idea at the time. But I agree it's a poor
  interface. mullo (at least some of the implementations) had the same
  issue, and in that case we chose an interface that doesn't require extra
  space at rp, and instead the function has to allocate whatever it needs.
I think mullo could perhaps allow itself to write all an+bn limbs, but
mulmod_bnm1 could be used in ways were such a thing would be very

  > Should we perhaps pass and use tp for the mpn_mul_n result,
  It makes sense to me to at add scratch parameter tp. One might want to
  specify that rp and tp may point to the same area; that will save
  storage for all callers that compute rp into a scratch area anyway.
Good idea.

I leave this to Marco, or until Marco has expressed his opinion.

I checked in a new mpn_mulmod_bnm1_next_size, as well as a new speed
measruments type so that one can plot the jagged direct mulmod_bnm1 and
the next_size rounded mulmod_bnm1:

./speed -Pbar -s1-2000 mpn_mulmod_bnm1 mpn_mulmod_bnm1_next

A gnuplot of bar shows that things can still be improved, in particular
in the FFT range.  But things are much better than before my change!

  > (There is a potential error in this algorithm, which is also present in
  > the current next function: The non-FFT rounding might bring an n smaller
  > than MUL_FFT_MODF_THRESHOLD into the FFT range.  This could then be
  > insufficiently rounded for FFT...)
  Is it easy to check if a given n works with FFT? Then one can just check
  for this, like
            || !valid_fft_size_p(n))
        	mpn_bc_mulmod_bnp1 (xp, ap1, bp1, n);
        else { ... do the fft thing ... }
  (then I guess this may also interact with the choice of the fft "k" in
  some interesting way).
I doubt this is a problem in practice, but we should keep it in mind.
We'll hit an ASSERT_ALWAYS if there is indeed a problem.


More information about the gmp-devel mailing list