Bitwise logic gmpxx.h problems
Marc Glisse
marc.glisse at normalesup.org
Mon Jul 28 19:02:11 CEST 2008
On Sun, 27 Jul 2008, Torbjorn Granlund wrote:
> Here are the new code I am considering for gmpxx.h in GMP 4.2.3.
> Comments?
What I write is really only comments, feel free to ignore any of it. I
started with a simple answer to your mail and ended up writing a rather
long rant...
> I will not improve the mpz_init/mpz_clear of other functions for
> 4.2.3, since it is not a release for speed improvements, just bug
> fixes.
I think one possible improvement (obviously not for 4.2.*) is to use more
in-place operations. For instance, unary compound operations are currently
implemented as:
__gmp_expr<T, T> temp(expr.val);
Op::eval(p, temp.__get_mp());
I believe it could be safely replaced by:
expr.val.eval(p);
Op::eval(p,p);
which avoids the copy. This gives the compiler a chance to turn "x=-(-x)"
into a nop instead of two copies.
The unary operator+ is also implemented as making a copy, whereas I guess
it could be implemented as the identity (although nobody ever uses it so
it does not matter).
Having more in-place operations would likely require an additional
argument to __gmp_expr::eval (and __gmp_set_expr) that for instance says
whether the return pointer is precious (may be used in expr) or simply a
new value that can be used as temp.
The way gmpxx works has two stages: building the __gmp_expr tree, and
evaluating it. The tree is nice because when we evaluate it we know there
is no aliasing taking place between the internal nodes (on the other hand,
it means that square(x+x) actually performs two additions). The aliasing
with the return value is the only one that is unknown, and that is one
reason I think an additional argument giving this information (maybe
adding at the same time a way to remember that we are doing a += and not
just any addition, though that can also be checked by comparing pointers
at runtime) would help. Some rewriting of the tree could be possible, at
creation time, for instance to replace -(-x) by x or (-x)*y by -(x*y),
although I am not sure this would give real improvements on not
stupidly-written code.
I read the doc about in-place operations, and it says that in-place
mpz_add isn't really useful except when an operand is small, but it often
avoids an allocation, so it looks useful. I read the code in mpz/aors.h,
and I find it surprising that the result is required to have one more limb
than the size of the largest operand even in the case of a substraction
(for the addition, obviously there can be a carry). Looks like we could
save a limb, which means a realloc in some cases.
It would also be interesting to make statistics about how often a typical
gmpxx program calls mpz_set(x,y) with x==y.
Begin most useless rant paragraphs:
I was surprised to see that the allocation policy choice in gmp is at the
level of the malloc/realloc/free functions. This means that if for
instance I want a policy like "never allocate less than two limbs" for
mpz_t, the best I can do is apparently to have the malloc replacement
allocate more than asked and the realloc replacement do nothing if the new
size asked is still smaller than already allocated. It would have been
nicer if I could have made the mpz_t aware of the size actually allocated
to it. Looks like one of the "bright ideas" in the tasks.
Also, the requirement that realloc and free can't be called on NULL in gmp
means we can't avoid the allocation of 1 limb even when we know it is
useless (otherwise it seems like a memset(p,0,sizeof(__mpz_struct)) would
do in many cases): looks like the lazy allocation thing mentionned in the
tasks.
End of most useless rant paragraphs :-)
> // Max allocations for plain types when converted to mpz_t
> // FIXME: how do we get the proper max "double" exponent?
I would say DBL_MAX_EXP (from <float.h>) in C and
std::numeric_limits<double>::max_exponent (from <limits>) in C++ if I
understand the question.
> #define __GMP_DBL_LIMBS (2 + (1025 - 1) / GMP_NUMB_BITS)
I had never realized double could only represent such small integers that
they can be allocated on the stack (I thought this trick would be used
only for signed/unsigned long, and for initialization of a mpf by a
double).
> #define __GMP_ULI_LIMBS (1 + (8 * sizeof (long) - 1) / GMP_NUMB_BITS)
>
> #define __GMPXX_LOGOP_UI(OP) \
The name of the macro does not reflect the fact that it is for mpz only
(not mpq, which could get a similar macro later).
> mpz_t temp; \
> mp_limb_t limbs[__GMP_ULI_LIMBS]; \
> temp->_mp_d = limbs; \
> temp->_mp_alloc = __GMP_ULI_LIMBS; \
> mpz_set_ui (temp, l); \
> mpz_##OP (z, w, temp)
I thought you could leave the OP part out of the macro, make the macro
only provide a temp for use by the calling code. If I want to reuse this
one for sub(unsigned,mpz) I need to call sub(mpz,unsigned) and then neg.
For division it does not work. Though if the permutation of the arguments
is the only issue it would be pretty easy to provide a second macro for
this.
The original macros I thought of were:
(leaving out the trailing \, they are inconvenient in the mail)
#define MPZ_INIT2_STACK(temp,N)
mp_limb_t limbs[N];
temp->_mp_d=limbs;
temp->_mp_alloc=N
#define MPZ_CLEAR_STACK(temp) /* do nothing */
with versions:
#define MPZ_INIT_SET_UI_STACK(temp,l)
MPZ_INIT2_STACK(temp,__GMP_ULI_LIMBS);
mpz_set_ui(temp,l)
modelled after the regular mpz_init2 and mpz_init_set_ui. I even left the
mpz_t declaration out of it so a #define MPZ_INIT2_STACK mpz_init2 (and
similarly for mpz_clear) would work.
I don't know what is best. Anyway I believe these macros are not
interfaces so they can be changed in the next release.
> #define __GMPXX_LOGOP_SI(OP) \
> mpz_t temp; \
> mp_limb_t limbs[__GMP_ULI_LIMBS]; \
> temp->_mp_d = limbs; \
> temp->_mp_alloc = __GMP_ULI_LIMBS; \
> mpz_set_si (temp, l); \
> mpz_##OP (z, w, temp)
> #define __GMPXX_LOGOP_D(OP) \
> mpz_t temp; \
> mp_limb_t limbs[__GMP_DBL_LIMBS]; \
> temp->_mp_d = limbs; \
> temp->_mp_alloc = __GMP_DBL_LIMBS; \
> mpz_set_d (temp, d); \
> mpz_##OP (z, w, temp)
>
> struct __gmp_binary_and
> {
> static void eval(mpz_ptr z, mpz_srcptr w, mpz_srcptr v)
> { mpz_and(z, w, v); }
>
> static void eval(mpz_ptr z, mpz_srcptr w, unsigned long int l)
> { __GMPXX_LOGOP_UI(and); }
> static void eval(mpz_ptr z, unsigned long int l, mpz_srcptr w)
> { __GMPXX_LOGOP_UI(and); }
> static void eval(mpz_ptr z, mpz_srcptr w, signed long int l)
> { __GMPXX_LOGOP_SI(and); }
> static void eval(mpz_ptr z, signed long int l, mpz_srcptr w)
> { __GMPXX_LOGOP_SI(and); }
> static void eval(mpz_ptr z, mpz_srcptr w, double d)
> { __GMPXX_LOGOP_D(and); }
> static void eval(mpz_ptr z, double d, mpz_srcptr w)
> { __GMPXX_LOGOP_D(and); }
> };
>
> struct __gmp_binary_ior
> {
> static void eval(mpz_ptr z, mpz_srcptr w, mpz_srcptr v)
> { mpz_ior(z, w, v); }
> static void eval(mpz_ptr z, mpz_srcptr w, unsigned long int l)
> { __GMPXX_LOGOP_UI(ior); }
> static void eval(mpz_ptr z, unsigned long int l, mpz_srcptr w)
> { __GMPXX_LOGOP_UI(ior); }
> static void eval(mpz_ptr z, mpz_srcptr w, signed long int l)
> { __GMPXX_LOGOP_SI(ior); }
> static void eval(mpz_ptr z, signed long int l, mpz_srcptr w)
> { __GMPXX_LOGOP_SI(ior); }
> static void eval(mpz_ptr z, mpz_srcptr w, double d)
> { __GMPXX_LOGOP_D(ior); }
> static void eval(mpz_ptr z, double d, mpz_srcptr w)
> { __GMPXX_LOGOP_D(ior); }
> };
>
> struct __gmp_binary_xor
> {
> static void eval(mpz_ptr z, mpz_srcptr w, mpz_srcptr v)
> { mpz_xor(z, w, v); }
> static void eval(mpz_ptr z, mpz_srcptr w, unsigned long int l)
> { __GMPXX_LOGOP_UI(xor); }
> static void eval(mpz_ptr z, unsigned long int l, mpz_srcptr w)
> { __GMPXX_LOGOP_UI(xor); }
> static void eval(mpz_ptr z, mpz_srcptr w, signed long int l)
> { __GMPXX_LOGOP_SI(xor); }
> static void eval(mpz_ptr z, signed long int l, mpz_srcptr w)
> { __GMPXX_LOGOP_SI(xor); }
> static void eval(mpz_ptr z, mpz_srcptr w, double d)
> { __GMPXX_LOGOP_D(xor); }
> static void eval(mpz_ptr z, double d, mpz_srcptr w)
> { __GMPXX_LOGOP_D(xor); }
> };
Aren't you missing the macro change to make operators like &= work? I
think it was just a PP->P and ZZ->Z to make it mimic operator+=.
While I am writing an email, I might as well add a clean-up. In gmp.h,
there is:
#define __need_size_t /* tell gcc stddef.h we only want size_t */
#if defined (__cplusplus)
#include <cstddef> /* for size_t */
#else
#include <stddef.h> /* for size_t */
#endif
#undef __need_size_t
It should be changed to:
#if defined (__cplusplus)
#include <cstddef> /* for size_t */
#else
#define __need_size_t /* tell gcc stddef.h we only want size_t */
#include <stddef.h> /* for size_t */
#undef __need_size_t
#endif
The C++ headers are not ready for use with __need_*, as can be seen by
compiling a file that contains only these lines with g++, and the only
reason it did not cause problems is that iosfwd (included before in gmp.h)
already includes cstddef (without any __need_*) in the current g++
implementation.
--
Marc Glisse
More information about the gmp-bugs
mailing list