3.13 Profiling

Running a program under a profiler is a good way to find where it’s spending most time and where improvements can be best sought. The profiling choices for a GMP build are as follows.

--disable-profiling

The default is to add nothing special for profiling.

It should be possible to just compile the mainline of a program with -p and use prof to get a profile consisting of timer-based sampling of the program counter. Most of the GMP assembly code has the necessary symbol information.

This approach has the advantage of minimizing interference with normal program operation, but on most systems the resolution of the sampling is quite low (10 milliseconds for instance), requiring long runs to get accurate information.

--enable-profiling=prof

Build with support for the system prof, which means ‘-p’ added to the ‘CFLAGS’.

This provides call counting in addition to program counter sampling, which allows the most frequently called routines to be identified, and an average time spent in each routine to be determined.

The x86 assembly code has support for this option, but on other processors the assembly routines will be as if compiled without ‘-p’ and therefore won’t appear in the call counts.

On some systems, such as GNU/Linux, ‘-p’ in fact means ‘-pg’ and in this case ‘--enable-profiling=gprof’ described below should be used instead.

--enable-profiling=gprof

Build with support for gprof, which means ‘-pg’ added to the ‘CFLAGS’.

This provides call graph construction in addition to call counting and program counter sampling, which makes it possible to count calls coming from different locations. For example the number of calls to mpn_mul from mpz_mul versus the number from mpf_mul. The program counter sampling is still flat though, so only a total time in mpn_mul would be accumulated, not a separate amount for each call site.

The x86 assembly code has support for this option, but on other processors the assembly routines will be as if compiled without ‘-pg’ and therefore not be included in the call counts.

On x86 and m68k systems ‘-pg’ and ‘-fomit-frame-pointer’ are incompatible, so the latter is omitted from the default flags in that case, which might result in poorer code generation.

Incidentally, it should be possible to use the gprof program with a plain ‘--enable-profiling=prof’ build. But in that case only the ‘gprof -p’ flat profile and call counts can be expected to be valid, not the ‘gprof -q’ call graph.

--enable-profiling=instrument

Build with the GCC option ‘-finstrument-functions’ added to the ‘CFLAGS’ (see Options for Code Generation in Using the GNU Compiler Collection (GCC)).

This inserts special instrumenting calls at the start and end of each function, allowing exact timing and full call graph construction.

This instrumenting is not normally a standard system feature and will require support from an external library, such as

This should be included in ‘LIBS’ during the GMP configure so that test programs will link. For example,

./configure --enable-profiling=instrument LIBS=-lfc

On a GNU system the C library provides dummy instrumenting functions, so programs compiled with this option will link. In this case it’s only necessary to ensure the correct library is added when linking an application.

The x86 assembly code supports this option, but on other processors the assembly routines will be as if compiled without ‘-finstrument-functions’ meaning time spent in them will effectively be attributed to their caller.