Use on small mulmod_bnm1 [was: New mulmod_bknp1]

Marco Bodrato bodrato at mail.dm.unipi.it
Sun May 1 23:09:48 CEST 2022


Ciao Niels,

for some reason I did not receive your answer. I took the text from the
archive.

Il gio, 21 aprile 2022 08:15 am, Niels Möller ha scritto:
> Marco Bodrato <bodrato at mail.dm.unipi.it> writes:
>> Yes, with a larger expected gain for 3, and a smaller one for 13, or 17.
>
> And in your table, almost all use 3, and none use 7, 13 or 17.

>>  size -> measured time
>>  1080 -> 6.906e-05	(+  72, +12%) 2^3*3^3*5
>>  1104 -> 7.294e-05	(+  24, +5.6%) 2^4*3*23
>
> So 1104 wins over the power of 2, 1024 = 2^10.

Yes, it does. Both on shell and on my laptop, even if in both cases
MUL_FFT_MODF_THRESHOLD is smaller than 512.

>>  1584 -> 0.0001159	(+  72, +4.2%) 2^4*3^2*11
>>  1600 -> 0.0001217	(+  16, +5.1%) 2^6*5^2
>
> The only example in the list using 5 as a factor.

There are some more in the full list...

>>  1984 -> 0.0001648	(+  40, +3.1%) 2^6*31
>
> First example not using any of the odd numbers in the list, so not using
> mulmod_bknp1 at all.

Yes, starting from here, the power of 2 seems dominant.

>>  2048 -> 0.0001676	(+  64, +1.7%) 2^11
>
> I wonder if this would be beaten by 2064 = 2^4*3*43, or if this power
> of two really is a winner.

It is a winner, I measured a much larger range.

> It's also interesting that all these winners use 2^k with k >= 3, so
> third split in mulmod_bnm1 seems to pay off measurably.

On that machine, in that range.

> So just rounding up to a multiple of 24 = 2^3 * 3 might be a reasonable
> initial strategy (above some threshold of a likely few hundred limbs).

The reasonable gaps between the sizes probably are: +2, +4, +8, +12, +24,
+48, then powers of two.

On shell, something like:
+2 up to 36
+4 up to 108
+12 up to 264
+24 up to 936
and up to 1944... +24 again? Even if we have many +72 (preserving a factor
3^2)...
Then, powers of 2.

Ĝis,
m



More information about the gmp-devel mailing list