Trial-division interface

Wed Apr 7 14:36:24 CEST 2010

I've been thinking a little about what an mpz-level interface for trial
division should look like. One fairly natural interface would be
something like

  int
  mpz_small_factor_p(unsigned int *f, const mpz_t n, unsigned int b);

This function should return 1 if n contains any prime factor p with p <=
B (or < B, I'm not sure which is the most common convention for the
definition of "B-smooth").

If f is non-NULL and n contains such a factor, a non-trivial factor of n
is stored in *f.

Now, there are a couple of subtleties.

  * For a non-trivial factor f, should one require that f is prime, or
    that f is the smallest prime factor?

  * The implementation will use a compiled in list of primes and various
    useful related quantities. The size of this list implies a limit on
    how large b are supported. There must be some way for an applicaton
    to find out how large b it can use.

  * For complete factorization, one would need some way to iterate over all
    small prime factors of n.

  * For a large bound b, there should be a way to tell GMP precompute a
    larger prime table, and then use that for trialdivision. Something
    like

      gmp_make_primetable(gmp_prime_table_t primes, unsigned int b);

    where for small b, primes should bbe set to a reference to thesmall
    compiled in table, while for larger b, a new table should be
    allocated and filled out.

  * In case one wants to test primality of a small number n, one wants
    to set b = floor(sqrt(n)). In this case, it would be more convenient
    if one could pass in the square of b.

  * Efficient trial division typically lumps a couple of primes
    together, to compute n mod ppp where ppp is a single-limb product
    p_k p_{k+1} ... p_{k+j} and. Then for the final such procuct (say
    p_k <= b < p_{k+1}), there's O(1) extra cost of checking
    divisibility also by the other factors of ppp. And for typical
    applications, it's useful to do that and return any discovered
    factor. To allow this behaviour, the function would be specified to
    allow some rounding up of the supplied b.

    On the other hand, such rounding can cause trouble for small
    borderline cases. E.g., it's natural to use for the first "lump",
    ppp = 3*5*7*11*13*17*19*23 (32-bit number). Then for both n = 17
    (prime) and n = 21 (composite), we get gcd(ppp, n) = n. So it's not
    sufficient here to compute (ppp, n).

For my current application (Maurer's algorithm), what I would need is

  * a function to answer definitely the question "is n prime" (I put b =5D
    floor(sqrt(n))) for fairly small n, and

  * a function to test if n has small factors, before doing more
    expensive testing. Here it doesn't matter precisely which factors
    are tested for. The choice of b is a question of performance tuning,
    not correctness.

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.