New mulmod_bknp1

Marco Bodrato bodrato at mail.dm.unipi.it
Tue Feb 22 22:39:52 CET 2022


Ciao,

Il Mar, 22 Febbraio 2022 8:04 pm, Niels Möller ha scritto:
> Marco Bodrato <bodrato at mail.dm.unipi.it> writes:
>> Simply, if a multiplication mod B^{3n}+1 is needed, the code computes
>>  - a product mod B^{n}+1
>>  - a product mod B^{2n}-B^{n}+1
>>  - with CRT, the desired result is obtained.

> Ok, and can the second product be computed more efficiently than full
> product + mod operation?

No. At least, I don't know; and it's currently implemented as a product+mod.

But, if M(n) (the cost of a nxn product) is more than linear in n, then
it's a good idea to split the computation into M(a)+M(b) where a+b=n.

>> This is generalised for multiplications mod B^{kn}+1 for k in

> Intuitively, I'd expect the gain is largest for k = 3 (since for larger
> k, we get even further away from splitting in half). Is that right?

Yes, indeed!

> It seems a bit tricky to take this into account in
> mpn_mulmod_bnm1_next_size, but maybe it's doable?
>
> E.g, in this case we could try a top-level B^66 - 1 product, split in
> B^33+1 and B^33-1; then the former suits your new algorithm well, but
> the former can't be recursively split (at least not on a B boundary). If

I fully agree.

MULMOD_BNM1_THRESHOLD is usually small, and the expected gain on the bnm1
side is larger than on the bnp1 side. We should never trade a 2 factor
with a 3 factor (or 8 with 9...).

But, assume MULMOD_BNM1_THRESHOLD > 15, should we prefer the size 14*32 or
the size 15*32? Maybe the latter!

> And we also have mpn_fft_next_size, but I'm not familiar with how that
> works.

I believe that code is quite simple, it chooses the first multiple of 2^k,
taking k from the "the best k for a given range" table.

But question "how does the new code interacts with the code that generate
the 'best k' table?" seems more complex.

Ĝis,
m



More information about the gmp-devel mailing list