Three technical remarks about GMP functions

Wed Jun 8 00:26:30 CEST 2005

Hi, everybody!

In a recent (Mar 4,2005) message in the topic "Infinite loop in mpz_probab_prime_p", Torbjorn wrote: "It is an interesting problem to look for small divisors; what algorithm and what limits to use." Moreover, in 'old' 2002-messages the same problem received some attention with no conclusive solution (as I understood!). Well, since I'm involved actually with this problem --- namely, how many "TRIAL DIVISIONS" are recommended when looking for small prime factors of a number --- I would like to share here in these GMP discussions some results I'm obtaining, despite I'm using GNU Pascal instead C language (which seems more natural and appreciated by the members of this group). 

1st - I've created a base of primes going from 2 thru 2^32-5 (that is, primes that can be stored in 32-bits words). In order to assure correctness in the database, I used redundantly two subroutines: the UBASIC nxtprm and GMP mpz_nextprime. Both produced exactly the same results: 2^32-5 = 4,294,967,291 is the 203,280,220th prime. I've divided this long series of primes in sets of 2^16 = 65536 primes each, in a total of 3101 full sets and a final 3102th with only 53084 elements. Every set serves as a mini-database for every Stage of my algorithm. So, a significant product of this initial phase of my research was the creation of a list of 3102 entries containing the greatest prime of each set --- named LOS - LimitsOfStages. 

2nd - Next, I have created a table containing the number of occurrences of every GAP existing among all those primes --- named GBP - GapsBetweenPrimes. By noting that every GAP is even (excepting between the primes 2 and 3) and less than or equal to 2*168 (which occurs for the prime 3842611109, for example), I've opted to store the complete series of primes simply storing that GAPs as a single byte per prime. Unnecessary to say that I cogited of other schemes, but this one seems to me the most advantageous under various aspects for the subsequent uses in my algorithm. A similar table of maximal GAPs between consecutive primes <= 7.263*10^13 can be found at page 80 of Riesel's "Prime Numbers and Computer Methods of Factorization" (Birkhauser - 2nd edition - 1994 - Vol.126 of Progress in Mathematics). From it, one can note that the largest gap <= 512 is 500=2*255 for the prime 303371455741. 

Thus, I've created a big database containing 3102 records of 65536 bytes each, and each byte containing the half-gap between primes --- named SOP - SeriesOfPrimes. It permits one to find the KthPrime(K) (1<=K<=203,280,220) in a very fast manner: by simply reading the (K div 65536)th record and adding (K mod 65536) gaps (provided exceptions for K=1 and K=2). Moreover, it permits one to find the NextPrime(N) or the LastPrime(N) which follows or precedes any number N (1<=N<=2^32), in a very fast manner too: by simply checking LOS - LimitsOfStages, reading the respective record and searching that primes in it by adding gaps (provided exceptions for Stage's borders). 

Note: The interval of values for N in the NextPrime(N) and LastPrime(N) subroutines is to be enlarged to 2^64 with the implementation of new subroutines.

3rd - Today, I'm exactly testing the subroutine SPF(N) that determines the Smallest Prime Factor of a given number N when this factor belongs to the SOP - SeriesOfPrimes. It is known that a unique computation of the GCD(N,POP) --- that is, of the Greatest Common Divisor between a given number N and POP = the product of different primes --- can determine if N has some factor(s) in common with POP. So, ideally, if POP is equal to the product of all prime contained in SOP - SeriesOfPrimes, a single GCD could say if a certain N has or not some prime factor(s) in common with POP. Unfortunately, the available hardware cannot permit this marvelous feat, as yet! Then, I've used the strategy of "divide to conquer": the full POP was divided in 3102 subproducts of 65536=2^16 factors each. Recalling that each factor is less than 2^32, it is easy to calculate the maximum size of each subproduct. Thus, the unique GCD theoretically necessary has been divided in 3102 equivalent GCDs, in order to
 make the task feasible with the actual technology available for me. 

Note: Let G=GCD(N,POP). If G=1, then N has no factor in common with POP. If G=N, then N contains only factors included in POP. If 1<G<N, then G is a divisor of N --- however, G is not necessarily a prime factor!!! Assuming that POP is the product of 65536=2^16 primes, my algorithm requires a maximum of 32 extras GCDs in order to find the SPF(G) - the Smallest Prime Factor of G. But, I think that's is not the case to discuss it here in details, although essential and interesting it may be.

Then, it arises the pertinent question: WHY I'M DOING THESE DIGRESSIONS JUST HERE?

Firstly, because Torbjorn and others have shown interest in the problem of how many "TRIAL DIVISIONS" are recommended when looking for small prime factors of a number.

Secondly, because I need to use the GMP subroutine mpz_init_array considering the maximum sizes of numbers involved (primes and subproducts of primes) are perfectly known (but I dont know how to use it in GNU Pascal).

Thirdly, because I need to use the GMP subroutines mpz_inp_raw and mpz_out_raw in order to write and read those Products of Primes from / to external files. Actually, I'm recalculating such POPs from SOP directly with relative efficiency --- but I need to compare the required times for both alternatives: POP's recalculations or precomputed POP's recoveries.

Fourth, because this "TRIAL DIVISION" method is to be found at nowhere else (to the best of my knowledge) and may be a very interesting GMP facility (certainly in a modified and simplified form - for example, using only one set of the first 65536 primes - from 2 to 821641). Note that it is deterministic, exact and (I believe) extremely efficient, in spite of the actual limitations of the available technology (by the way, I 've read that Torbjorn often consider the possibility of future theoretical development when he implements an algorithm!!!). So, think of finding, by means of a unique GCD computation, if a given number N has some factor(s) in common with the product of the first 203,280,220 primes! If they are relatively primes and N<=2^64~1.8*10^19, then the primeness of N can be definitely asserted with only one stroke! If they aren't coprimes, one can find the SPF(N) (Smallest Prime Factor of N) with a maximum of 64 extras computations of GCDs, using my algorithm.

To conclude: I'll appreciate to hear comments about this subject and, eventually, to discuss more details on my work! 

Whatever happens, thanks for your kind consideration! 

P.S. - To Décio (a Brazilian as myself?)!!! Your comments has been very impressive! I agree with you in almost all of your messages. I recognize you knows enough about you are talking and I respect much your opinion. Apropos, the specialized literature, in general, recommends a preliminary "trial division" for small factors (say <10^3 or <10^6) as the starting step of a method of factorization. Besides, I'm not suggesting the remotion of the random part of Rabin-Miller test. By the way, my research requires the determination of ALL DIVISORS (not simply the prime factors) of the successive natural numbers. In fact, I'm looking for composites principally. In the past, I used an UBASIC program for numbers <=2^34, but now I need to go beyond, with greater efficiency. So, I'm analyzing the truth behind of all these recommendations, not merely through arguments but preferably checking the things "at work", because the prime distribution use to surprise the more experient researchers. Its
 worthy to note that this methodological principle is embedded in my formulation of the problem: How many "TRIAL DIVISIONS" are recommended when looking for small prime factors of a number. 

Eng. Raul Nunes 
NEST Newsy Echoes on Science and Technology 

NEST presents the latest headlines on more than 300 different subjects,  harvested from near 2500 news sources and, above all, updated nearly  every 15 minutes.  Standard Headlines emphasize the sci-tech matters, while Personal Headlines permits customized specifications according to user's preferences.  Enjoy it!  CLICK HERE http://geocities.yahoo.com.br/raulnunes/

---------------------------------
Do you Yahoo!?
 Yahoo! Mail - You care about security. So do we.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://gmplib.org/list-archives/gmp-discuss/attachments/20050608/43c1de5a/attachment.html