Use on small mulmod_bnm1 [was: New mulmod_bknp1]

Marco Bodrato bodrato at mail.dm.unipi.it
Wed Apr 20 22:54:53 CEST 2022


Ciao,

Il 2022-04-19 21:03 nisse at lysator.liu.se ha scritto:
> Marco Bodrato <bodrato at mail.dm.unipi.it> writes:
> 
>> In the 128-2048 range (at least on that machine: shell.gmplib.org) the
>> sizes multiple of 12, 24, 48... should be preferred...
> 
> I don't fully understand this, but if I got it right, we want the size 
> n
> to be divisible by 2^k, for mulmod_bnm1 to be able to split a few 
> times.
> But we don't need a very large k, since we have diminishing returns for
> each split.
> 
> And for the new mulmod_bknp1 to fit, we also need n to be divisible by
> one of certain small odd numbers, currently 3, 5, 7, 13, 17.

Yes, with a larger expected gain for 3, and a smaller one for 13, or 17.

> I think it would make sense to first select k (maybe a constant, or
> growing very slowly with n, which might ask for a tuned table), say 2 
> <=
> k <= 5 for the size range of interest. And then round n upwards to the
> closest multiple of one of 2^k * {3, 5, 7, 13, 17}. Hmm, or maybe to
> make it more complex, one of 2^{k,k-1} * {3, 5, 7, 13, 17}, it that
> let's us avoid growth. It would be nice if we could find a set of
> candidates that guarantees that we don't have to increase size more
> than, say, 10%, but not sure if that's possible.

It should be possible to not increase too much.

The following is a list of the best sizes wrt time spent in mulmod_bnm1.
Extracted in the range 1024..2048.
I'm not sure the program I wrote really shows what is needed.
Nevertheless, if the list contains 1008 and 1080, this means that for 
every size in the range (1080..1080) the time for a multiplication using 
that size is larger than the time spent by the last point in the 
interval.

  size -> measured time
  1008 -> 6.156e-05	(+  72, +8.3%) 2^4*3^2*7
  1080 -> 6.906e-05	(+  72, +12%) 2^3*3^3*5
  1104 -> 7.294e-05	(+  24, +5.6%) 2^4*3*23
  1128 -> 7.686e-05	(+  24, +5.4%) 2^3*3*47
  1200 -> 7.986e-05	(+  72, +3.9%) 2^4*3*5^2
  1224 -> 8.28e-05	(+  24, +3.7%) 2^3*3^2*17
  1296 -> 8.602e-05	(+  72, +3.9%) 2^4*3^4
  1320 -> 9.437e-05	(+  24, +9.7%) 2^3*3*5*11
  1368 -> 9.824e-05	(+  48, +4.1%) 2^3*3^2*19
  1392 -> 0.0001022	(+  24, + 4%) 2^4*3*29
  1416 -> 0.0001087	(+  24, +6.4%) 2^3*3*59
  1512 -> 0.0001112	(+  96, +2.3%) 2^3*3^3*7
  1584 -> 0.0001159	(+  72, +4.2%) 2^4*3^2*11
  1600 -> 0.0001217	(+  16, +5.1%) 2^6*5^2
  1680 -> 0.0001273	(+  80, +4.6%) 2^4*3*5*7
  1704 -> 0.0001396	(+  24, +9.7%) 2^3*3*71
  1728 -> 0.00014	(+  24, +0.23%) 2^6*3^3
  1776 -> 0.0001434	(+  48, +2.4%) 2^4*3*37
  1800 -> 0.0001439	(+  24, +0.35%) 2^3*3^2*5^2
  1872 -> 0.0001463	(+  72, +1.7%) 2^4*3^2*13
  1920 -> 0.000158	(+  48, + 8%) 2^7*3*5
  1944 -> 0.0001598	(+  24, +1.2%) 2^3*3^5
  1984 -> 0.0001648	(+  40, +3.1%) 2^6*31
  2048 -> 0.0001676	(+  64, +1.7%) 2^11

Ĝis,
m


More information about the gmp-devel mailing list