[PPL-devel] mpz_addmul_ui with ui = 1
Marc Glisse
marc.glisse at normalesup.org
Fri Feb 6 09:56:55 CET 2009
On Wed, 4 Feb 2009, Abramo Bagnara wrote:
> with that two checks to avoid calls to mpz_addmul and mpz_submul permits
> to avoid 110958265 calls to mpz_addmul (94.3%) and 50079219 calls to
> mpz_submul (88,6%).
Indeed those percentages are the interesting data. For such a distribution
it is obviously more efficient to inline the frequent trivial case.
> This show me that the inlining of some trivial tests can give good
> results when the test is true very often.
Ideally, this would be the compilers job. It would need:
- link time optimization, to have access to the gmp code when compiling
your program. Several compilers can do this, and maybe gcc-4.5
- profile feedback, to know about this distribution
- partial inlining or arbitrary de-inlining
The last point seems like the hardest to find in a compiler, and writing
the function as:
mpz_mul(...){
if(test for 0){trivial case}
else{call mpz_mul_nonzero}
}
mpz_mul_nonzero(...){current mpz_mul code, minus the check for 0}
may help the compiler a lot.
> 1) the GMP library does not check for any special values unless this is
> needed for correctness and the user code is responsible to add the
> trivial tests that increase efficiency for specific data set
>
> 2) the GMP library move the trivial checks in gmp.h to permit their
> inlining. Note that this approach does not conflict with 1: it would be
> feasible to have mpz_addmul defined in gmp.h and mpz_addmul_raw defined
> in GMP library, then the user code could call the version more suitable
> for its needs, the important part is that library version does not check
> for special values.
Indeed, with this approach, the compiler does not even need LTO or
profiling. However, this is making the code a bit heavier by forcing those
inlines for every operation, even for codes where the numbers are never 0.
Don't know how much that matters.
> 3) (current situation) the GMP library does some special value checks
> and user code may decide to inline the very same checks. This makes the
> 'else' path heavier than needed.
I believe that the check for 0 in mpz_mul actually falls in your first
category, it is necessary for correctness (however, mpz_mul also checks
whether one of the arguments fits in a single limb, which fits your
description better). And the else path is supposedly already heavy enough
that the penalty should not be too bad. Did you see a significant
performance gain using 2) instead of 3)?
(I am not a gmp developer, just another user, so feel free to ignore my
questions if you think the answers won't help convince the real
developers)
--
Marc Glisse
More information about the gmp-discuss
mailing list