tg at gmplib.org
Thu Apr 8 14:25:30 CEST 2010
nisse at lysator.liu.se (Niels Möller) writes:
Paul Zimmermann <Paul.Zimmermann at loria.fr> writes:
> one problem with that approach (that I've tried) is that if one needs to
> split the individual primes afterwards, it will add another cost.
How costly is it? Say we have computed ppp as a balanced multiplication
tree from the prime list, and at the top level we have ppp = A B with A
and B coprime. And we have computed g = gcd(N, ppp) > 0, and want to
split g in prime factors.
One, possibly naive, approach is to compute gcd(g, A). Then we can
either rule out the factors of A or B, or g splits, and we can then
continue using a known split of A or B or both. Somewhat like a binary
search over the prime list.
You're here describing an asymptotically good trial division interface.
My suggestion was intended to live in the O(log(n)|S|) domain, where S
is the set of primes.
Most input numbers will presumably be (from our perspective) randomly
distributed. That means that the probability that n is divisible by p
is close to 1/p (at least if n >> p).
We will normally terminate more quickly if we do not do a lot of work
for setting up something clever. Initial division (using mpn_mod_1_4)
of 3 * 5 * ... should therefore always be the first step. Checking the
remainder of that division for its prime factors is very very fast.
After that, or perhaps after some more such attempts with larger primes,
if n is large enough to not live in cache, we should start accumulating
multilimb clumps of primes, then invoke mpn_bdiv_qr. The remainders
returned from mpn_bdiv_qr will then, as Paul points out, need to be
treated recursively (using mpn_mod_1_4).
(I suppose a "vector mpn_mod_1_4" could be an alternative to what I
suggest; it would divide the same number n by several primes clumps P_1,
P_2, P_3... in parallel. That would not be hard to implement, and could
perhaps be the best way around the cache problem.)
More information about the gmp-devel