Reflections on mulmod_bnm1

Wed Dec 9 08:34:09 CET 2009

Ciao,

> I think mullo could perhaps allow itself to write all an+bn limbs, but

It doesn't currently. But it would simplify the function, particularly
with Mulder's trick, where a bigger product is computed.

>   It makes sense to me to at add scratch parameter tp. One might want to
>   specify that rp and tp may point to the same area; that will save
>   storage for all callers that compute rp into a scratch area anyway.
>
> Good idea.

I do not agree with Torbjorn. It's not a good idea, it's a Great Idea :-D
Maybe we can apply the same strategy to mullow as well?

>   Is it easy to check if a given n works with FFT? Then one can just check
>   for this, like

> I doubt this is a problem in practice, but we should keep it in mind.
> We'll hit an ASSERT_ALWAYS if there is indeed a problem.

This can be a problem in practice! I inserted a very naive code... and
forgot to ask if it is correct.

Currently mpn_mulmod_bnm1 asks for k = mpn_fft_best_k (n,0),
then adjusts the value k = min(k, count_trailing_zeros(n)),
and finally:
 if (k >= FFT_FIRST_K) mpn_mul_fft (r, n, a, an, b, bn, k);

Probably this is not optimal... but is it correct?
The ASSERT_ALWAYS (mpn_fft_next_size (n, k) == n) at the beginning of
mul_fft suggest it is, but please confirm.

If it is correct, then we have an easy way "to check if a given n works
with FFT": (n == mpn_fft_next_size (n, FFT_FIRST_K))

One more warning: in the mul_fft call above it may happen that an < bn.
A fast inspection of the code suggests that this is not a problem with
current implementation... I hope it will not be for any in the future!

Regard,
Marco

-- 
http://bodrato.it/papers/