error / memory handling

Fri Dec 26 13:58:13 UTC 2014

On Wed, 17 Dec 2014, Torbjörn Granlund wrote:

> I am not too fond of these global pointers.  It would be better design
> to refer the memory handling and error handling functions from each GMP
> user variable, akin to object oriented languages' vtables.  Except that
> this would make these small structures 3 times larger.

C++ containers have types like std::vector<type,allocator> so the 
allocator is part of the type, and in many cases that avoids storing 
anything at runtime. Obviously it makes it harder to keep precompiled code 
in a library, and doesn't apply to people still coding in C.

> signed long _mp_size     : 46;
> unsigned long _mp_alloc  : 14;  // 8-bit mantissa, 6-bit exponent
> unsigned long _mp_handler:  4;
> mp_limb_t _mp_d;        // 64

Note that it is also possible to store extra data in _mp_d[-1]. IIRC
mpfr does that, and std::string in libstdc++ does it as well.

On Wed, 17 Dec 2014, Niels Möller wrote:

> tg at gmplib.org (Torbjörn Granlund) writes:
>
>> The error handler function would then use the knowledge of the
>> allocation functions to clean up the memory state, and probably longjump
>> to to deallocate stack.
>
> We've discussed such hacks for memory allocation before (but now I can't
> find the relevant posts...). To recap the idea:
>
> The "protected" code should take some input mpz variables and some
> output mpz_variables. The input variables are read only. All computation
> takes place in newly allocated temporaries, not touching the output
> variables. Only after the computation is complete and successful,
> results are assigned to the output variables using mpz_swap (which can
> never result in an allocation failure).

It is true that this is a common way to get exception safety. It is also 
what a naive C++11 implementation using rvalue references would give 
(c=a+b first creates a number for a+b then swaps it with c). Note that 
this would completely kill performance in some applications, see below. 
Most of the perf improvements I made to gmpxx consisted in doing more 
operations in place, to avoid a new allocation. With a bit more care, it 
should be possible to make operations safe without forcing an extra 
allocation. Take mpz_add, I believe it is already safe, it first calls 
realloc on the output (it is the user's responsibility to have a realloc 
function that preserves memory when there is not enough space, if he wants 
that). mpz_mul would require more reshuffling.

> Then needs a wrapper around the protected code, first adding a
> checkpoint in the allocation data structures, and a setjmp buffer. At
> any allocation failure during the computation, the allocation function
> should deallocate all live storage allocated after the checkpoint, and
> longjmp out, and the wrapper can then return some failure indication or
> raise a C++ exception or whatever. The same mechanism could clean up
> allocation after other types of failures.

A drawback of using setjmp/lonjmp is that they have a cost. C++ exceptions 
on the other hand are designed to be almost free as long as nothing 
throws. I would want to see benchmarks (on extremely small numbers, one or 
two limbs) showing that it isn't too bad.

> Hmm. Maybe the key is to distinguish temporary allocations from more
> long-lived objects. That's the real point of the interface rules for the
> "protected" code above does: It ensures that *all* allocations are for
> temporaries).

If you consider that all allocations are for temporaries, you are also 
giving a way to extend the life of a temporary, which makes temporary 
memory not a stack anymore, so I believe you would still want 2 separate 
allocation interfaces for real temporaries and for temporaries that will 
become long-lived.

On Fri, 19 Dec 2014, Niels Möller wrote:

> The real problem is to do clean up temporary storage.

For mpn, the itch/scratch interface is a nice way to push the issue to the 
user ;-) By the way, we could probably add __GMP_NOTHROW to more mpn 
functions.

> Assuming the memory alllocation functions throw exception on errors,
> typical code catching that exception, cleaning up, and returning an
> error indication, would be something like
>
>  bool
>  foo (mpz_t out, const mpz_t in)
>  {
>    mpz_t tmp;
>    memory_pool_push ();
>    try {
>      mpz_init (tmp);
>      compute_something (tmp, in);
>      mpz_set_parent (out, tmp);
>      memory_pool_pop ();
>      return true;
>    }
>    catch (...) {
>      memory_pool_pop ();
>      return false;
>    }
>  }

Side note: in practice, memory_pool_push and memory_pool_pop are in the 
constructor and destructor of some object, so you don't have to repeat 
memory_pool_pop on each path. And since in many cases you actually want to 
propagate the exception, it also avoids the need for try/catch there.

> And with some care, I think this could be package reasonably nicely also
> with plain C, with some macro which both pushes a new memory pool and
> pushes a new jmpbuf for the exception handler. Some of this has the
> flavour of a general exception handling system for C, but that's
> unlikely to ever fly. So I think we should aim for internal gmp use, and
> application functions doing gmp calls only, no other allocations or
> anything else needing cleanup.

This is all well thought out, but somehow reinventing C++ exceptions in C 
doesn't seem worth it to me...

On Wed, 24 Dec 2014, Niels Möller wrote:

> I agree a special stack allocator could have a place here (in
> particular, if for some reason running with a very small execution
> stack). And it could be done internally in gmp.

TMP_ALLOC is already a stack allocator, no?

> But also note that small allocations are typically done on the execution
> stack, with alloca, and that for larger allocations, the computational
> cost of computing the limbs stored there is vastly larger than the
> allocation overhead, making the latter negligible.
>
>> One of the difficulties with GMP is that the limb arrays associated
>> with mpz_t (and therefore mpq_t) objects *never* decrease in
>> size, unless the user explicitly calls mpz_realloc2().
>
> In case requiring explicit calls to mpz_realloc is difficult for real
> applications with "intermediate expression swell", we should consider
> doing that automatically in gmp, e.g., whenever size < alloc / 3 or so.

I have written applications where I was relying on this "expression 
swell". With small numbers (just a few limbs), allocation dominates the 
running time. In this case, I would keep reusing the same few mpq_t (there 
were never that many in use at the same time) and rely on the fact that 
after a few uses, reallocations became very rare (the total number of 
reallocations in the application is actually bounded). I didn't need to 
know how many mpq_t I would need nor how big they might be, just that the 
product of those 2 numbers was bounded by some amount of memory I could 
afford.

I agree this is a special case, but getting deallocations when the numbers 
are small, or worse reallocating all the time with the swap trick, would 
kill this application. I am sure I would eventually find other workarounds 
though...

Since we are talking about allocations:

One thing I wanted to do at some point but that was complicated with the 
current allocation interface:
struct mpz_struct_with_buffer : __mpz_struct { mp_limb_t buffer[10]; };
with an appropriate constructor that would by default have _mp_d point to 
buffer. mpz_struct_with_buffer* is implicitly convertible to mpz_ptr so I 
can use the gmp functions just fine, but realloc doesn't know where the 
mpz_t is, so it cannot portably check if the pointer is to the local 
buffer or if it can call free on it. So I had to waste one limb in the 
buffer and every malloced region to store a bool saying if it came from 
malloc or from an internal buffer.

Another thing was using alloca for small temporaries in gmpxx. When 
someone computes a+b+c, the result of a+b is a temporary (sometimes I can 
transform the operation to reuse the output variable, but not always), but 
it is quite hard to use stack allocation there. The only ways I could 
think of were:
a) precompute in gmpxx an upper bound on the size a number may need and 
preallocate it appropriately with alloca (if small enough) before calling 
the GMP functions;
b) using a separate stack (not the main execution stack) for those 
allocations so I can control when to pop, changing the gmp allocation 
functions around that operation. But that's a bit complicated, has some 
overhead, and may not be a great fit for some uses. This is a special case 
of Niels' pool idea.

-- 
Marc Glisse