error handling

Wed Dec 17 13:49:39 UTC 2014

[Moved thread from gmp-devel.]

I'd like to think of error handling also in a C perspective, and
consider a few more problems at the same time.

Which sources of exceptions do we have currently?

1 We divide by 0 to generate a SIGFPE.  I think we do that for division
  by zero as well as some other mathematically undefined operations.

2 We detect overflow of _mp_size.

3 Allocation problems.

4 Stack overflow leading to SIGSEGV.  (I've recently trimmed the stack
  usage to stay within a few hundred KiB, but some threaded environments
  give tiny default stacks.)

5 ASSERT_ALWAYS

6 More that I have forgotten.

I saw the suggestion to invoke a user-defined function when one of the
explicitly detected errors occurs (i.e., 1, 2, 3, 5, perhaps 6).  That is an
idea worth considering.  We have function pointers for allocation,
reallocation, freeing; this would be one more such global pointer.

The error handler function would then use the knowledge of the
allocation functions to clean up the memory state, and probably longjmp
to deallocate the stack.
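
A minimal sketch of such a hook, to make the idea concrete.  Nothing
here is existing GMP API: the sketch_* names, the error codes and the
toy sketch_div routine are all made up for illustration, and the memory
cleanup is only marked by a comment.  The library calls a global
handler pointer, and the application's handler uses longjmp to unwind
the stack.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error codes, loosely following the list of sources above. */
enum sketch_error {
    SKETCH_ERR_DIVZERO = 1,
    SKETCH_ERR_OVERFLOW,
    SKETCH_ERR_ALLOC
};

typedef void (*sketch_error_fn)(enum sketch_error);

/* Default behaviour: report and abort. */
static void sketch_default_handler(enum sketch_error e)
{
    fprintf(stderr, "sketch: fatal error %d\n", (int) e);
    abort();
}

/* The one extra global pointer, next to the allocation pointers. */
static sketch_error_fn sketch_error_handler = sketch_default_handler;

static void sketch_set_error_handler(sketch_error_fn fn)
{
    sketch_error_handler = fn ? fn : sketch_default_handler;
}

/* Stand-in for a library routine that detects an error explicitly. */
static long sketch_div(long n, long d)
{
    if (d == 0) {
        sketch_error_handler(SKETCH_ERR_DIVZERO);
        abort();                /* a handler is not supposed to return */
    }
    return n / d;
}

/* Application side: recover with longjmp instead of dying. */
static jmp_buf sketch_recover;
static int sketch_last_error;

static void sketch_app_handler(enum sketch_error e)
{
    sketch_last_error = (int) e;
    /* A real handler would release the library's memory here, using
       its knowledge of the allocation functions. */
    longjmp(sketch_recover, 1);
}

/* Returns 0 and stores the quotient on success, else the error code. */
static int sketch_try_div(long n, long d, long *q)
{
    assert(q != NULL);
    sketch_set_error_handler(sketch_app_handler);
    if (setjmp(sketch_recover) != 0)
        return sketch_last_error;
    *q = sketch_div(n, d);
    return 0;
}
```

With this, sketch_try_div(7, 0, &q) returns SKETCH_ERR_DIVZERO instead
of raising a signal.  Memory that was live at the time of the longjmp
is not reclaimed by the jump itself, which is why the handler would
need to do the cleanup.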

GMP might be able to clean up TMP_* memory, since that is a kind of
stack.

I am not too fond of these global pointers.  It would be better design
to refer to the memory handling and error handling functions from each
GMP user variable, akin to object oriented languages' vtables.  Except
that this would make these small structures 3 times larger.

There are real scenarios where one would want different sets of
allocation functions.  E.g., many libraries use GMP.  Some of them have
their own error reporting mechanisms, and memory handles.  But then the
user might use GMP directly, or she might from the same program use
several libraries which use GMP.  The current GMP memory allocation
mechanism is not suitable here.

We have another problem with the GMP structures.  On 64-bit machines,
the 32-bit _mp_size field is starting to hurt for some applications, and
with that also the _mp_alloc field.

We might therefore consider changing these structures in an incompatible
way.  We could then sneak in a field for choosing 'handler functions'.
Perhaps like this:

        signed long _mp_size     : 46;
        unsigned long _mp_alloc  : 14;  // 8-bit mantissa, 6-bit exponent
        unsigned long _mp_handler:  4;
        mp_limb_t *_mp_d;       // 64

The trickery with field sizes keeps the structure at 128 bits.  There
will be a cost for it, though.
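
The size claim can be checked with a small sketch.  It assumes an LP64
ABI (64-bit long and pointers) and a compiler that packs adjacent long
bit-fields into one 64-bit unit, as GCC and Clang do on x86-64; both
are implementation-defined in C.  It also assumes _mp_d is a pointer to
limbs, as in the current structure.  The sketch_* names are made up.

```c
#include <assert.h>
#include <limits.h>

typedef unsigned long sketch_limb_t;    /* stand-in for mp_limb_t */

/* Compact layout: 46 + 14 + 4 bits share one 64-bit word. */
struct sketch_mpz128 {
    signed long    size    : 46;
    unsigned long  alloc   : 14;
    unsigned long  handler :  4;
    sketch_limb_t *d;
};

/* Largest magnitude the 46-bit signed size field can hold. */
#define SKETCH_SIZE_MAX ((1L << 45) - 1)

/* Structure size in bits, computed at run time so the sketch still
   compiles on ABIs where the packing assumption does not hold. */
static unsigned sketch_mpz128_bits(void)
{
    return (unsigned) (sizeof(struct sketch_mpz128) * CHAR_BIT);
}

/* Store a size after checking that it fits the narrower field. */
static void sketch_set_size(struct sketch_mpz128 *x, long n)
{
    assert(n >= -SKETCH_SIZE_MAX - 1 && n <= SKETCH_SIZE_MAX);
    x->size = n;
}
```

Part of the cost mentioned above shows up here: every read of the size
needs a sign extension from 46 bits, and every write needs masking.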

The _mp_handler field chooses one of 16 handler groups.  Each such group
will have function pointers for allocation, reallocation, freeing, and
error reporting.  I'd say that 16 groups will be sufficient for any use.
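
One way the handler groups might look, as a sketch only.  The table,
the registration function and all sketch_* names are hypothetical.  The
allocation signatures follow the ones GMP already uses for its custom
allocation functions (the reallocate and free functions receive the old
size); the error function's signature is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NGROUPS 16       /* a 4-bit handler field selects one */

struct sketch_handlers {
    void *(*alloc_fn)(size_t);
    void *(*realloc_fn)(void *, size_t, size_t);  /* ptr, old, new */
    void  (*free_fn)(void *, size_t);
    void  (*error_fn)(int);                       /* assumed signature */
};

/* Default group: thin wrappers around the C library. */
static void *sketch_def_alloc(size_t n) { return malloc(n); }
static void *sketch_def_realloc(void *p, size_t old_n, size_t new_n)
{
    (void) old_n;
    return realloc(p, new_n);
}
static void sketch_def_free(void *p, size_t n) { (void) n; free(p); }
static void sketch_def_error(int code) { (void) code; abort(); }

/* Example of a library-private allocator: counts its allocations. */
static unsigned long sketch_alloc_count;
static void *sketch_counting_alloc(size_t n)
{
    sketch_alloc_count++;
    return malloc(n);
}

/* Group 0 holds the defaults; groups 1..15 start out empty. */
static struct sketch_handlers sketch_groups[SKETCH_NGROUPS] = {
    { sketch_def_alloc, sketch_def_realloc,
      sketch_def_free, sketch_def_error }
};

/* Install a handler group; returns 0 on success, -1 on bad input. */
static int sketch_set_group(unsigned idx, const struct sketch_handlers *h)
{
    if (idx >= SKETCH_NGROUPS || h == NULL)
        return -1;
    sketch_groups[idx] = *h;
    return 0;
}

/* Toy number type carrying only the group selector. */
struct sketch_num {
    unsigned handler : 4;
    /* size, alloc and limb pointer omitted */
};

/* Look up the function pointers that belong to one variable. */
static const struct sketch_handlers *
sketch_handlers_of(const struct sketch_num *x)
{
    const struct sketch_handlers *h = &sketch_groups[x->handler];
    assert(h->alloc_fn != NULL);   /* group must have been installed */
    return h;
}
```

A library would install its own group once, and mark each variable it
creates with that group's index.  Variables created directly by the
user keep group 0, so the two sets of functions never mix.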

The _mp_alloc field is a home-brew float.  Allocations <= 255 will
have field values 1..255 which require no extra processing.
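
A possible encoding, sketched under assumptions that the text above
does not fix: the exponent sits in the upper 6 bits, the mantissa in
the lower 8 bits, the decoded value is mantissa << exponent, and
encoding rounds up so that the recorded allocation never understates
the request.  The sketch_* names are made up.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MANT_BITS 8
#define SKETCH_EXP_BITS  6
#define SKETCH_MANT_MAX  ((1u << SKETCH_MANT_BITS) - 1)   /* 255 */

/* Encode a limb count into the 14-bit field, rounding up to the next
   representable value.  Counts <= 255 map to themselves (exponent 0). */
static unsigned sketch_alloc_encode(uint64_t limbs)
{
    unsigned exp = 0;

    /* Larger inputs could decode to a value that overflows 64 bits. */
    assert(limbs <= ((uint64_t) SKETCH_MANT_MAX << 56));
    while (limbs > SKETCH_MANT_MAX) {
        limbs = (limbs >> 1) + (limbs & 1);     /* halve, rounding up */
        exp++;
    }
    return (exp << SKETCH_MANT_BITS) | (unsigned) limbs;
}

/* Decode the 14-bit field back into a limb count. */
static uint64_t sketch_alloc_decode(unsigned field)
{
    unsigned exp  = field >> SKETCH_MANT_BITS;
    uint64_t mant = field & SKETCH_MANT_MAX;

    assert(field < (1u << (SKETCH_MANT_BITS + SKETCH_EXP_BITS)));
    assert(exp <= 56);          /* keep mant << exp within 64 bits */
    return mant << exp;
}
```

Under these assumptions a request for 1001 limbs is recorded as 1004,
so the rounding costs well under 1% of extra space, while requests up
to 255 limbs are stored exactly.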

An alternative _mp_alloc trickery would define the current allocation in
terms of the _mp_size field, saying how much unused space there is.
Such a field wouldn't need to be very large, but then there would be
situations where GMP would need to reallocate when an operand decreases
a lot in size.
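
To illustrate when that forced reallocation happens, here is a sketch
in which the allocation is stored as |size| plus a small slack count.
The 6-bit slack width is an arbitrary assumption, the sketch_* names
are made up, and the reallocation itself is only modelled by resetting
the slack to zero.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SLACK_BITS 6
#define SKETCH_SLACK_MAX  ((1u << SKETCH_SLACK_BITS) - 1)  /* 63 limbs */

struct sketch_sized {
    uint64_t abs_size;   /* limbs in use */
    unsigned slack;      /* unused limbs after them */
};

/* The allocation is implied: limbs in use plus unused limbs. */
static uint64_t sketch_allocated(const struct sketch_sized *x)
{
    return x->abs_size + x->slack;
}

/* Change the size in use.  Returns 1 if the block had to be
   reallocated, 0 if the existing block could be kept. */
static int sketch_resize(struct sketch_sized *x, uint64_t new_size)
{
    uint64_t alloc = sketch_allocated(x);

    assert(x->slack <= SKETCH_SLACK_MAX);
    if (new_size <= alloc && alloc - new_size <= SKETCH_SLACK_MAX) {
        /* Fits, and the unused space is still describable. */
        x->slack = (unsigned) (alloc - new_size);
        x->abs_size = new_size;
        return 0;
    }
    /* Either the operand grew past the block, or it shrank so much
       that the slack field cannot describe the unused space. */
    x->abs_size = new_size;
    x->slack = 0;
    return 1;
}
```

Shrinking from 100 limbs to 90 keeps the block, but shrinking from 90
to 10 leaves more unused space than the field can describe, so the
block must be trimmed.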

To avoid any of this trickery, we could change the structure to be 192
bits.  That would cost a lot in cache load for applications which
e.g. use arrays of GMP numbers.  Like this:

        signed long _mp_size     : 64;
        unsigned long _mp_alloc  : 60;
        unsigned long _mp_handler:  4;
        mp_limb_t *_mp_d;       // 64

(These sizes are hard to overflow, except for those with lots of time
and who can afford 8 Eibyte of memory.)

-- 
Torbjörn
Please encrypt, key id 0xC8601622