Use on small mulmod_bnm1 [was: New mulmod_bknp1]
Marco Bodrato
bodrato at mail.dm.unipi.it
Sun May 1 23:09:48 CEST 2022
Ciao Niels,
for some reason I did not receive your answer. I took the text from the
archive.
Il gio, 21 aprile 2022 08:15 am, Niels Möller ha scritto:
> Marco Bodrato <bodrato at mail.dm.unipi.it> writes:
>> Yes, with a larger expected gain for 3, and a smaller one for 13, or 17.
>
> And in your table, almost all use 3, and none use 7, 13 or 17.
>> size -> measured time
>> 1080 -> 6.906e-05 (+ 72, +12%) 2^3*3^3*5
>> 1104 -> 7.294e-05 (+ 24, +5.6%) 2^4*3*23
>
> So 1104 wins over the power of 2, 1024 = 2^10.
Yes, it does. Both on shell and on my laptop, even if in both cases
MUL_FFT_MODF_THRESHOLD is smaller than 512.
>> 1584 -> 0.0001159 (+ 72, +4.2%) 2^4*3^2*11
>> 1600 -> 0.0001217 (+ 16, +5.1%) 2^6*5^2
>
> The only example in the list using 5 as a factor.
There are some more in the full list...
>> 1984 -> 0.0001648 (+ 40, +3.1%) 2^6*31
>
> First example not using any of the odd numbers in the list, so not using
> mulmod_bknp1 at all.
Yes, starting from here, the power of 2 seems dominant.
>> 2048 -> 0.0001676 (+ 64, +1.7%) 2^11
>
> I wonder if this would be beaten by 2064 = 2^4*3*43, or if this power
> of two really is a winner.
It is a winner, I measured a much larger range.
> It's also interesting that all these winners use 2^k with k >= 3, so
> third split in mulmod_bnm1 seems to pay off measurably.
On that machine, in that range.
> So just rounding up to a multiple of 24 = 2^3 * 3 might be a reasonable
> initial strategy (above some threshold of a likely few hundred limbs).
The reasonable gaps between the sizes probably are: +2, +4, +8, +12, +24,
+48, then powers of two.
On shell, something like:
+2 up to 36
+4 up to 108
+12 up to 264
+24 up to 936
and up to 1944... +24 again? Even if we have many +72 (preserving a factor
3^2)...
Then, powers of 2.
Ĝis,
m
More information about the gmp-devel
mailing list