Running a program under a profiler is a good way to find where it’s spending most time and where improvements can be best sought. The profiling choices for a GMP build are as follows.
The default is to add nothing special for profiling.
It should be possible to just compile the mainline of a program with -p
and use prof
to get a profile consisting of timer-based sampling of
the program counter. Most of the GMP assembly code has the necessary symbol
information.
This approach has the advantage of minimizing interference with normal program operation, but on most systems the resolution of the sampling is quite low (10 milliseconds for instance), requiring long runs to get accurate information.
Build with support for the system prof
, which means ‘-p’ added
to the ‘CFLAGS’.
This provides call counting in addition to program counter sampling, which allows the most frequently called routines to be identified, and an average time spent in each routine to be determined.
The x86 assembly code has support for this option, but on other processors the assembly routines will be as if compiled without ‘-p’ and therefore won’t appear in the call counts.
On some systems, such as GNU/Linux, ‘-p’ in fact means ‘-pg’ and in this case ‘--enable-profiling=gprof’ described below should be used instead.
Build with support for gprof
, which means ‘-pg’ added to the
‘CFLAGS’.
This provides call graph construction in addition to call counting and program
counter sampling, which makes it possible to count calls coming from different
locations. For example the number of calls to mpn_mul
from
mpz_mul
versus the number from mpf_mul
. The program counter
sampling is still flat though, so only a total time in mpn_mul
would be
accumulated, not a separate amount for each call site.
The x86 assembly code has support for this option, but on other processors the assembly routines will be as if compiled without ‘-pg’ and therefore not be included in the call counts.
On x86 and m68k systems ‘-pg’ and ‘-fomit-frame-pointer’ are incompatible, so the latter is omitted from the default flags in that case, which might result in poorer code generation.
Incidentally, it should be possible to use the gprof
program with a
plain ‘--enable-profiling=prof’ build. But in that case only the
‘gprof -p’ flat profile and call counts can be expected to be valid, not
the ‘gprof -q’ call graph.
Build with the GCC option ‘-finstrument-functions’ added to the ‘CFLAGS’ (see Options for Code Generation in Using the GNU Compiler Collection (GCC)).
This inserts special instrumenting calls at the start and end of each function, allowing exact timing and full call graph construction.
This instrumenting is not normally a standard system feature and will require support from an external library, such as
This should be included in ‘LIBS’ during the GMP configure so that test programs will link. For example,
./configure --enable-profiling=instrument LIBS=-lfc
On a GNU system the C library provides dummy instrumenting functions, so programs compiled with this option will link. In this case it’s only necessary to ensure the correct library is added when linking an application.
The x86 assembly code supports this option, but on other processors the assembly routines will be as if compiled without ‘-finstrument-functions’ meaning time spent in them will effectively be attributed to their caller.