Random number generator in mpz_millerrabin
Fri, 13 Dec 2002 01:18:40 +0100
Gary Wong wrote:
> there is a call to gmp_randinit_default (i.e. the Mersenne Twister init
> function), which is then used for only a handful of calls to =
> before the state is discarded again. On my system, this means that a
> call to mpz_millerrabin() takes over 30ms, whereas 16 repititions on a
> 128-bit prime take only 1ms if a weaker 128-bit LCG is used instead.
The problem lies completely within Mersenne Twister's seeding function, =
currently performs a powering modulo 2^19937-20023. The powering function=
been improved as of Dec 07 to avoid division, but it is still slow for =
purposes including this. There are two approaches to the problem: one is =
modify mpz_millerrabin and the other is to modify MT's seeding function. =
both would get best results, of course.
Modifying the API call of mpz_millerrabin is directly discarded. Using a =
variable for the random context would break reentrancy, which is not =
either. It's possible to implement a new function accepting a random =
one of its arguments, which was already discussed some time ago, but =
very good as a solution to this concrete problem as I see it.
One possibility is to initialize the random context inside the function =
known buffer. There are two drawbacks: one is that the numbers generated =
the function will be almost the same in each call, and the other is that
mpz_millerrabin should know about the structure of MT's context. The =
not too big a trouble since that's already happening; the latter can be =
if a new MT initialization function is implemented that accepts a buffer =
argument. This is the approach I personally like the most, but not =
that mpz_millerrabin is making the primality test always with basically =
numbers (bases?). This can't be solved (except by breaking reentrancy) =
just the current functions; a new one is needed.
Another `quick and dirty' possibility is to make mpz_millerrabin use
gmp_randinit_lc_2exp instead of gmp_randinit_default.
Of course, the MT seeding function speed should be improved if possible, =
how? The current initialization function has two properties which are =
desirable in this generator:
1. The mixing of values in the buffer make it unlikely to get two =
which seem correlated if they use different seeds. Correlation among =
obtained with different seeds was a problem in the previous versions of =
the homepage <http://www.math.keio.ac.jp/~matumoto/emt.html> for details.
2. Every value between 0 and 2^19937-20028 inclusive is a valid seed, and=
produces a different sequence. Since the state space of the generator is
2^19937-1 (all combinations of 19937 bits except all-zeros are valid), =
means that almost all (exactly all but 20026) possible sequences can be
These goals are obtained by calculating seed^1074888996 mod =
(with some further adjustments to ensure that the result won't be zero) =
using the result in chunks of 32 bits each to fill the buffer. But that =
operation is too costly in terms of speed.
Initialization with a lower seed space (say seeds in range 0 to 2^54-1) =
possible and perhaps desirable; some good and fast 32-bit PRNGs can be =
drawback is that only a subset of the possible buffer values would then =
available, and it would be harder to give guarantees that no two =
produce the same sequence.
Based on this, my suggestion is to apply all of the following:
- Reduce the seed space in MT to, say, 2^54 seeds (the number is a =
of a concrete algorithm I have in mind able to handle 54 bits for the =
That's a big downgrading of the current seed space and is for this reason=
most arguable point.
- Implement a new gmp_randinit_mt_buff (suggested name) function that =
MT buffer as a parameter.
- Replace the gmp_randinit_default in mpz_millerrabin with a call to
gmp_randinit_mt_buff with a pre-defined buffer. Of course if the default
algorithm is ever changed this call should probably be replaced again.
There's another possibility. As mpz_millerrabin is not calling =
MT initialization function could initialize its state with a predefined =
instead of calling the seeding function internally. In this way the new
gmp_randinit_mt_buff function is avoided and mpz_millerrabin can keep =
gmp_randinit_default which is even better.
Of course I'm open to comments, suggestions, improvements etc.