New mulmod_bknp1
Niels Möller
nisse at lysator.liu.se
Tue Feb 22 20:04:39 CET 2022
Marco Bodrato <bodrato at mail.dm.unipi.it> writes:
> Nothing special...
> Simply, if a multiplication mod B^{3n}+1 is needed, the code computes
> - a product mod B^{n}+1
> - a product mod B^{2n}-B^{n}+1
> - with CRT, the desired result is obtained.
Ok, and can the second product be computed more efficiently than full
product + mod operation?
> This is generalised for multiplications mod B^{kn}+1 for k in
> {3,5,7,13,17}.
Intuitively, I'd expect the gain is largest for k = 3 (since for larger
k, we get even further away from splitting in half). Is that right?
> If a product mod B^64-1 is required to mulmod_bnm1, then the function
> computes two multiplications: mod B^32-1, and mod B^32+1.
> Computing a product mod B^33+1 is now faster than computing mod
> B^32+1...
> ...but the product mod B^33+1 is simply not useful, the calling
> function needs the result mod B^32+1!
It seems a bit tricky to take this into account in
mpn_mulmod_bnm1_next_size, but maybe it's doable?
E.g, in this case we could try a top-level B^66 - 1 product, split in
B^33+1 and B^33-1; then the former suits your new algorithm well, but
the former can't be recursively split (at least not on a B boundary). If
we go up to B^72-1, this would split as (B^36+1)(B^18+1)(B^9+1)(B^9-1),
where B^9-1 can't be easily split, but the other factors are of the
requied form B^{3n}+1.
And we also have mpn_fft_next_size, but I'm not familiar with how that
works.
Regards,
/Niels
--
Niels Möller. PGP key CB4962D070D77D7FCB8BA36271D8F1FF368C6677.
Internet email is subject to wholesale government surveillance.
More information about the gmp-devel
mailing list