Optimizing Modulus (was Re: Would someone mind elaborating the
explanation in the manual?)
cic_3_b at yahoo.com
Mon Oct 27 16:54:56 CET 2003
Hmm, Interesting discussion.
I kinda want to diverge for a minute. I have just profiled my
must admit the results are quite startling. I believe the best effect
be carried across if I quote you the first couple lines of the
produced by gprof.
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ms/call ms/call name
70.72 356.85 356.85 __gmpn_mod_1
19.14 453.45 96.60 main
4.46 475.98 22.53
1.63 484.19 8.21 __gmpz_sub_ui
1.45 491.53 7.34 __gmpz_fdiv_ui
In my searches through the GMP source I have narrowed the field of
calls to the following group of statements. The function that calls
__gmpn_mod_1 is mpz_fdiv_ui but essentially these statements work
to achieve their task. update_func is a pointer to either the
mpz_sub_ui depending on whether I am going up from teh bottom of the
or down from the top. From what I understand function pointers are
than having an if..then..else statement that would be evaluated every
an iteration occours.
remainder = mpz_fdiv_ui (up_counter, prime_product_3_29);
if(!((remainder % 3) &&
(remainder % 5) &&
(remainder % 7) &&
(remainder % 11) &&
(remainder % 13) &&
(remainder % 17) &&
(remainder % 19) &&
(remainder % 23) &&
(remainder % 29)) )
update_func (up_counter, up_counter, 2);
This code is in a loop that iterates 300,000+ times a second. There
code in the loop but essentially the earlier parts of the loop are
to stop access to the remainder of the loop when it can be guaranteed
the latter tests will fail, i.e. weed out useless values. At the
moment I am
at a loss for optimizations as I have already optimized this program
iterating at 1333+ times a second to the current lower bound of it's
potential at 300,000+. I should mention that I am working with a
numbers from 12 digits long to 310 digits, when working on the lower
of the range working upwards the speed jumps to almost 800,000 times
My question is looking at the results of the profiler can anyone
anything that might further speed up my program. As I have just
am more looking to speed it up as the numbers get bigger than I am
about the smaller numbers. What I am looking for is a faster way to
remainder of a big number and even then all I really care about is
is an even divide (remander == 0).
Sisyphus, I am reading the book you showed me however I have not come
anything that pertains to my current predicament. If you know of a
that would be helpful I would be ever so appreciative if you would
Thanks in Advance
----- Original Message -----
From: "Brian Hurt" <bhurt at spnz.org>
To: "Kevin Ryde" <user42 at zip.com.au>
Cc: "Sisyphus" <kalinabears at iinet.net.au>; <gmp-discuss at swox.com>
Sent: Monday, October 27, 2003 11:40 AM
Subject: Re: Would someone mind elaborating the explanation in the
> On Mon, 27 Oct 2003, Kevin Ryde wrote:
> > If you're only testing divisibility then for instance n%3==0
> > (n * 0xAAAAAAAB) <= 0x55555555
> > which is one multiply.
> In 32-bit. In 64-bit, if n = 3, then n * 0xAAAAAAAB becomes
> and the test fails. Although you can make you life somewhat
> only dealing with three cases: 1) 32-bit, where you test that (n *
> 0xAAAAAAAB) <= 0x5555555, 2) 64-bit, where you test that (n *
> 0xAAAAAAAAAAAAAAAB) <= 0x5555555555555555), and 3) everything else,
> you test that (n % 3) == 0.
> I'd keep case 3.
> You snipped too much here:
> > > Which isn't always faster than a divide instruction.
> > True, that should be checked, but almost everywhere one multiply
> > faster than a divide.
> In every architecture I know of, multiply is at least as fast as
> and generally faster. I'm not sure it's true that a divide is
> 2 multiplies, a shift, and a subtract (which is what I was
> gmp-discuss mailing list
> gmp-discuss at swox.com
Do you Yahoo!?
Exclusive Video Premiere - Britney Spears
More information about the gmp-discuss