longjmp()

Fri Mar 7 22:44:46 UTC 2014

Your example safe_mpz_mul() is definitely NOT what I want to do:

1. Checking for allocation failure on every operation is an
   unacceptably heavy burden.

2. Establishing (and popping) the setjmp() point on every operation
   would produce a huge performance hit, whereas out-of-memory is
   usually quite a rare event.

3. Every operation forces at least one allocate/free operation for
   every output operand.  This would tend to kill CPU cache
   locality, or at least significantly expand the working set.

Here is a more detailed account of my scenario:

I have a setjmp() at some relatively high position in the call
stack (a major algorithm's top-level function).  The next several
stack frames are part of my algorithm implementation that uses
GMP extensively throughout.  The next several stack frames are
GMP (and GMP internal) functions.  The last set of stack frames
are held by my custom memory allocation function, and at this
point it has determined that we are out of memory and it is
time to bail by calling longjmp() to my high-level setjmp().

Cleaning up all of this memory is the responsibility of my
application, not GMP.  Most of this cleanup needs to be done
before longjmp() is called, otherwise all record of what was
allocated at the various levels is lost.  I can structure my own
application in a way that allows this.  The problem is that GMP
does not presently seem able to divulge to my application all of
its pending TMP_DECL, TMP_MARK, and TMP_FREE context blocks.  There
is no stack-ordered chain of these contexts.  I would need to chase
such a chain back up the stack, and free every block that was
too big for alloca() -- until I either run out of chain, or
get to a stack level that is higher up than my setjmp() level.
(This would only happen if a signal handler interrupts GMP and
my handler invokes this big algorithm that uses GMP again.)

Note that "access" to such a chain, plus the ability to annotate
those TMP blocks that GMP is using to contain pointers to "other"
allocated blocks -- are also necessary for doing a conservative
garbage collector's "mark" phase.  I can certainly arrange my
own application so that all of my GMP objects are traceable,
but unless GMP also provides such traceability of its internal
memory blocks, this isn't going to work.

So I guess my question is this: *if* GMP could be (configurably)
modified to maintain this chain of TMP contexts, and my
application uses it correctly, are there any *other* obstacles
that you can see to using longjmp()/setjmp() as a well-defined
method for aborting and cleaning up a large GMP application in
this way?

One easy thing for a garbage collector to do is to simply
traverse every mpz_t, mpq_t, and mpf_t object and "right size"
the current limb array.  This wouldn't be safe on objects GMP
is currently manipulating (unless GMP is re-written to assume
that the address of *every* limb array can change upon calls
to alloc/realloc), so either this re-write would be necessary
or GMP would have to "mark" the objects it is currently
accessing / manipulating so that the GC does not move them.
Note: moving limb arrays around (without marking those you
do not want moved) will produce an implementation that is
inherently "signal unsafe".

The system I am contemplating could actually be encapsulated
as a (potentially open source) library that wraps around GMP.
When using this library, all GMP objects (mpz_t, mpq_t, mpf_t)
would have to be dynamically allocated as homogeneous vectors
of GMP numbers, with this vector held by a (either stack or
heap allocated) wrapper object.  The wrapper object tags the
type (mpz_t, mpq_t, mpf_t) and size of the vector, and provides
GC tracing access.  Given a wrapper/vector of mpz_t objects, a
macro can return, e.g., an mpz_ptr to the i-th element of the
vector, for direct manipulation with GMP operations.  What the
library provides would be:

  1. the garbage collector(s),
  2. a TMP_DECL, TMP_MARK, TMP_FREE sort of mechanism for use
     by clients of the library,
  3. the wrappers within which all GMP objects live,
  4. facilities for establishing (push/pop) the setjmp() exit
     contexts,
  5. facilities for grouping/classifying each wrapper, setting
     "quiescent" versus "active" status and/or object lifetime
     hint info for groups/wrappers, and
  6. all other stuff needed for full thread-safe and/or
     signal-safe operation.

This would be fairly easy to use -- so long as (1) the vectors
of GMP numbers never get moved, only their referenced limb
buffers, (2) lots of assertions in the code to detect incorrect
manipulation of the wrappers (e.g., TMP_MARK without matching
TMP_FREE before return).

On Thu, 2014-03-06 at 21:22 +0100, Niels Möller wrote:
> David Warme <David at Warme.net> writes:
> 
> > I am posting this to gmp-devel (instead of gmp-discuss) because it
> > concerns very arcane GMP implementation details.
> > Another possibility is that blocks allocated using the TMP_DECL,
> > TMP_MARK, and TMP_FREE mechanisms (if too large for alloca())
> > would not (automatically) be freed when calling longjmp().
> 
> Right, the tricky thing is to do the cleanup which deallocates precisely
> the right blocks. I think I've written about this once before, not too
> many months ago, but I can't find it now (possibly it was some
> guile-related discussion). I think the following would be a safe way to
> trap memory allocation from any gmp function. As an example, I consider
> only a single gmp function, mpz_mul:
> 
> int 
> safe_mpz_mul (mpz_t r, const mpz_t a, const mpz_t b)
> {
>   mpz_t t;
> 
>   ... set a mark in the memory allocation machinery ...
> 
>   if (setjmp (buf)) /* buf here is some global belonging to allocation
>                        machinery */
>     {
>       /* Error case */
>       ... deallocate everything allocated later than the above mark ...
>       /* This work could be done by the memory allocation failure code,
> 	 before calling longjmp. I put it here to illustrate which case
> 	 it belongs to. */ 
>       return FAIL;
>     }
>   else
>     { 
>       /* The normal case, which might longjmp out in the middle of
>          some operation due to allocation failure. */
>       mpz_init (t);
>       mpz_mul (t, a, b);
>     }
>   mpz_swap (r, t); /* Safe, because no allocation is involved. */
>   mpz_clear (t);   /* Deallocation, also safe. */
> 
>   return SUCCESS;
> }
> 
> The idea here is that in the code block "protected" by the setjmp, the
> mpz_t variables are partitioned into two sets:
> 
> 1. Variables which are allocated outside of the block. They can be used
>    *only* as read-only inputs to mpz functions. This guarantees that
>    they will not be reallocated, and they will therefore be untouched by
>    the deallocation cleanup done in the failure case. Note that this set
>    includes the ultimate result parameter, r. Hence the little dance
>    with mpz_swap before returning.
> 
> 2. Variables which are allocated (mpz_init) within that block. Those can
>    be modified using any mpz functions, and reallocated as many times as
>    are needed. In the failure case, they will be completely deallocated.
> 
> And the dealloaction on failure must deallocate not only the storage for
> variables in the second set, but also all temporary storage allocated
> before the allocation failure. Which should work fine, with an
> appropriate implementation of the "... set mark..." thing.
> 
> > Yet another could be the need to "pop" some sort of (thread local?)
> > activation stack of the TMP_MARK entries using
> > __gmp_tmp_reentrant_free().  GMP 5.1.3 does not seem to implement
> > any method for creating/maintaining such an activation stack of
> > TMP_MARK entries, however.
> 
> There are a couple of different implementations, and I don't know them
> all. I'm not aware of any pointer to the top of such a stack. If there
> is some pointer like that, it clearly has to be saved at setjmp and
> reset at longjmp.
> 
> And this *is* arcane, so if you try this scheme out, please let us know
> how it works out.
> 
> Regards,
> /Niels
>